Unix Shell as a language

More than a command line tool

Intro

There are a lot of introductions online to using Bash or other similar Linux/Unix shells. In my experience, these guides mostly show the minimum needed to navigate the file system and run some commands. Because of this, my initial impression of the Shell was that it was nothing more than that: a place to navigate the file system and run some commands. Similar to what you can do from the “normal” (i.e. graphical) desktop, but using commands instead of clicking around with a mouse. I later realised that the Shell is much, much more than that. It doesn’t just run some commands, it’s really a language, like any programming language, but optimised for use as a user interface to your operating system.

What I know is only what I learned through experience and reading, but with this article I hope to provide a better framework for understanding what a shell like Bash can do.

What’s a shell?

When we talk about a shell in this context, we mean an interface on the screen that allows you to interact with the computer. This can be a graphical interface, like your typical desktop environment (think Gnome, KDE, Mate, Cinnamon…). Or it can be a text based interface. In this article we’ll talk about the typical text interface, which we’ll refer to as Unix Shell, or simply Shell. There are different variations of this Shell, and different implementations of these variations. The best known one currently is probably Bash. In this article, I’ll try to keep the examples general so they can work in different variations of the Unix Shell.

Getting started

First we open a shell. If you’re using Linux and don’t use a graphical desktop environment, you’re probably already in a Shell. On KDE, you can open “Konsole”, otherwise you can look for something called “Terminal”, “Console”, or similar. On macOS (formerly OS X) it’s called “Terminal”. On Windows you can use WSL, install Git Bash, or install a proper operating system instead. Note that Git Bash is just a very simple compatibility layer and will only allow rather shallow interactions with the OS itself. Strictly speaking, these programs are terminal emulators: GUI tools that give you a window in which a textual Shell runs.

First steps

When you have your Shell open, things can look slightly different based on how the Shell is configured. Typically you will see a dollar-sign “$” where you can start typing. Often, but not always, you will see some extra information like user, host, and directory. The background is often either black or white.

$ _

Let’s start how most guides start and type a command. Type pwd and press Enter.

$ pwd
/home/me

pwd is a typical command and it prints out the location where you currently are in the file system. This is a first important observation. The operating system you’re working in has a file system, and you are always located somewhere in that file system. In this example I use /home/me, but it will probably be the home folder of the account you’re logged in with.

Let’s try a different command, echo.

$ echo look, I am Shelling!
look, I am Shelling!

Here we didn’t just call a command, we also added some extra things after it. The Shell interprets these as parameters for the command you typed. Here we called echo and provided it with the parameters “look,”, “I”, “am”, and “Shelling!”. What echo does is just print out the parameters it got.

Variables and substitution

Alright, let’s do something fun now. While most people see the Shell as a place to run some commands, it’s actually a programming language and environment. Just one that is very tightly coupled to your operating system. That means we can do typical programming language things. One example is variables. Let’s store a value in a variable and then view its content.

$ me="Look it is me!"
$ echo $me
Look it is me!

First we store the string “Look it is me!” into the variable me, then we use echo to print it out. But there’s a lot of new things going on here! Let’s check them out.

First of all, we are using quotes, yet we do not see the quotes in the value of the me variable. The thing is that not all characters are taken literally as is. Some characters have special meaning. This is true for double quotes, single quotes, backticks, the dollar sign, several types of brackets, and a whole bunch of other characters. Over time, when using the Shell a lot, you’ll start getting a feeling for good practices, but for now it suffices that you know this is a thing.

Secondly, we see that we use echo $me. But why use echo, and what’s with the dollar sign? When you type something as-is, the Shell will try to make sense of it. Maybe it’s a keyword, maybe it’s a special character, maybe it’s a command. If we just typed Look it is me!, the Shell would try to figure out what Look means in this context, assume it’s supposed to be a command, fail to find it, and throw an error. So instead, we use echo and let that command print it out. So far so good, but we didn’t actually give the words “Look it is me!” to echo, did we? Well… yes we did! That’s where the dollar sign comes in. When we put a dollar sign in front of some text, the Shell will interpret that text as a variable name and substitute the variable’s content before doing anything else with the line. So when we do echo $me, the Shell will first replace $me with the content of the variable “me”, then interpret the line as it is now, which means it executes echo Look it is me!.

But there’s another thing going on. Why use the double quotes when assigning the string to the variable? And why don’t we use the double quotes with echo? The thing is that, by default, a space is considered a separator. Quotes, however, tell the Shell that the text shouldn’t be seen as separate words. When you try me=Look it is me!, the Shell will consider “me=Look”, “it”, “is”, and “me!” to be separate things and throw an error, because it assumes “it” is supposed to be a command, which it can’t find. We’re gonna go a bit deeper into how the Shell interprets this exact line later, but for now, we just want to understand what the quotes do. We want to store the whole sentence as a single value, so we need a way to tell the Shell that this is one value, and we do that by quoting the sentence.

When you try echo Look it is me! and echo "Look it is me!", you won’t see a difference in behaviour. There is a difference, though. In the first case, echo will see the words as different parameters, and just print them out one by one, space separated. In the second case, we only give one parameter to echo. Here, both ways just happen to produce the exact same output. One way to see the difference between using quotes or not is to use multiple spaces. Try it! If you add multiple spaces, then those will be seen as a single separator, and the output for the version without quotes will only have one space between the words. In the case with the quotes, the spaces are an integral part of the parameter that we provide, and as such will be printed in the output.

$ echo Look it is me!
Look it is me!
$ echo "Look it is me!"
Look it is me!
$ echo Look   it   is   me!
Look it is me!
$ echo "Look   it   is   me!"
Look   it   is   me!
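
If you want to see the separate parameters even without extra spaces, here is a small sketch. It uses printf, a command we haven’t covered in this article; in this form it prints every parameter it receives on its own line, between square brackets.

```shell
# printf repeats its format once for every parameter it receives,
# so each parameter ends up on its own line.
printf '[%s]\n' Look   it   is   me!
printf '[%s]\n' "Look   it   is   me!"
```

The first call prints four lines, [Look], [it], [is] and [me!], because the Shell handed four parameters to printf. The second call prints a single line, [Look   it   is   me!], because the quotes turned the whole sentence into one parameter.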

You can also try this with single quotes and you’ll see the same result as with double quotes. The difference between the two is that the Shell will still do the substitution within double quotes, but not for single quotes. Being able to substitute and still have quotes can be important. Observe the example below, do you understand what is happening?

$ echo Look it is me!
Look it is me!
$ echo "Look it is me!"
Look it is me!
$ echo 'Look it is me!'
Look it is me!

$ me='Look   it   is   me!'
$ echo $me
Look it is me!
$ echo "$me"
Look   it   is   me!
$ echo '$me'
$me

$ you="and you"
$ me="Look it is me $you"
$ echo "$me"
Look it is me and you

$ you="and you"
$ me='Look it is me $you'
$ echo "$me"
Look it is me $you

If you find it a bit confusing to predict how the quotes will be interpreted, remember that the Shell will always first replace the variable, then execute the line.
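
One way to convince yourself of this order is to put the name of a command in a variable. The sketch below uses two made-up variable names, command_name and text, purely for illustration.

```shell
command_name=echo
text='substituted first,   executed second'

# The Shell replaces $command_name with echo before it runs the line.
$command_name "$text"

# Without quotes, the substituted value is split on spaces into
# separate parameters, so the extra spaces disappear.
$command_name $text
```

The first call prints the sentence with its three spaces intact, the second prints it with single spaces. In both cases the Shell only found out that it had to run echo after it had substituted the variable.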

Procedures and executables

As said before, the Shell we’re talking about is more than just typing commands, it’s a programming language. In modern programming languages, we generally have three big paradigms: procedural programming, object oriented programming, and functional programming. Our Shell can be considered a procedural programming language. You tell it line by line what you want to see happening, but you can also write procedures that you can call with or without parameters. Let’s give it a go!

$ do_something() {
  echo "we are doing something"
}
$ do_something
we are doing something

Here we wrote a procedure called do_something, and then called it. The procedure just prints the sentence “we are doing something”. We can also pass parameters to our procedure, so let’s play with that a bit.

$ do_something() {
  echo "$1"
}
$ do_something 'Am I doing this o_o'
Am I doing this o_o

As you can see, we do not define the parameters when defining the procedure, we just get them passed in the form “<dollar-sign><number>”. We also see that we start with $1 and not $0. Why is that? Let’s see!

$ do_something() {
  echo "$0"
}
$ do_something 'What happens now?'
/bin/sh

The output you see can be slightly different, but what we see here is the command that was called to start the current process. In our case, that’s the command used to start the Shell. This isn’t really useful here, but can be useful in executables. Let’s play with that a bit.
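
One more note on procedures before we get to executables: the numbering simply continues for every extra parameter. Below is a minimal sketch with a made-up procedure called greet. It also uses $#, a notation we haven’t seen yet, which holds the number of parameters that were passed.

```shell
# greet is a hypothetical procedure, only here to show $1, $2 and $#.
greet() {
  echo "Hello $1, I hear you like $2"
  echo "I got $# parameters"
}

greet Ada 'long walks'
```

This prints “Hello Ada, I hear you like long walks” followed by “I got 2 parameters”. Thanks to the single quotes, “long walks” counts as one parameter.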

In the beginning of this article, we used the command pwd to see where in the file system we are. First we’ll move somewhere else so we can play there a bit without bothering the rest of our system. Let’s do the following:

$ mkdir /tmp/learning_shell
$ cd /tmp/learning_shell
$ pwd
/tmp/learning_shell

The /tmp folder is a temporary folder which, on most systems, gets wiped when starting the computer. By moving here, we can play around, create folders and files, and after a reboot, we won’t have files polluting our system. Great! Now let’s do what we wanted to do, play with some executable scripts. Let’s do the following:

$ echo '#!/bin/sh' > a_script.sh
$ echo 'echo "It works!"' >> a_script.sh
$ chmod +x a_script.sh
$ ./a_script.sh
It works!

Phew, that’s a whole bunch of new stuff to explain ^^’ Firstly, we see characters we haven’t used yet, > and >>. To really understand what these are, we need to understand file descriptors, but for now you can interpret them as “take the output and put it in the following file, and create the file if it doesn’t exist”. A single > will first clear the file if there’s already content, while >> will append the output to the end of the file.

The third line shows a command we haven’t used before, chmod. Files have content, but there’s more to files than just the content. There’s also metadata. One piece of metadata is what you can do with a file. There are three things you can do with a file: you can read from it, you can write to it, and you can execute it. chmod allows you to change what may be done with a file. In this case we say that the file “a_script.sh” may be executed.

In the fourth line, we actually execute the file. Note that we don’t just give the name of the file, we prepend it with “./”. But before we go deeper into that, let’s analyse the content of the file.

The first two lines are quite clear, I guess. Each one is just the echo command with one parameter. Normally we would expect that parameter to be printed out to the terminal, but here the output is written to the file “a_script.sh” instead. So our file will have two lines.
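
If you want to try the difference between > and >> in isolation, here is a small sketch. The variable demo_file and the file name notes.txt are made up for this example, and cat is a command we haven’t covered; it prints the content of a file.

```shell
# demo_file and notes.txt are made-up names for this experiment.
mkdir -p /tmp/learning_shell
demo_file=/tmp/learning_shell/notes.txt

echo 'first'  > "$demo_file"    # creates the file (or clears it) and writes
echo 'second' >> "$demo_file"   # appends to the end of the file
cat "$demo_file"                # prints: first, then second

echo 'third'  > "$demo_file"    # clears the file first, then writes
cat "$demo_file"                # prints only: third
```

After the last command, the lines “first” and “second” are gone, because the single > cleared the file before writing.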

We already understand the second line we write to the file, it’s an echo command. But what about the first line? There’s actually two things happening here. Firstly, as said before, the Shell is actually a programming language. As such, it’s not surprising it also has a way to add comments. The # at the start of a line tells the Shell it should consider this a comment. So when the script is executed by the Shell, it just ignores that line.

But why is it there then? The answer is quite ingenious. We ask for the file to be executed, but there are many ways to execute a file. A file containing machine code may need its instructions to be provided to the processor directly. Meanwhile a script (and Shell is a scripting language) needs a specific program, called an interpreter, to run. We can tell the system what program to use by starting our file with #!, followed by the absolute path of the interpreter to use. So first the operating system will read #!/bin/sh and conclude that it needs to use the program /bin/sh to run the file. Then /bin/sh will read the file and simply ignore this first line because it considers it a comment. Quite clever, isn’t it?

Note that this also means that we don’t need the “.sh” extension in the filename. It’s often used to give a hint to the person using the script, but it will work just the same if you just name the file “a_script”.

Another thing that we can show here: do you remember how parameters are passed using the “$<number>” notation? Well, $0 shows how an executable was called. When we tried it in a procedure, it showed the command that was used to start the process the procedure was running in, and that was the Shell. But here we call an executable. This will run in its own process, and $0 will have the value of how the script was called. Observe:

$ echo '#!/bin/sh' > a_script
$ echo 'echo "param 0: $0"' >> a_script
$ echo 'echo "param 1: $1"' >> a_script
$ chmod +x a_script
$ ./a_script "Does it work?"
param 0: ./a_script
param 1: Does it work?
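
To see that the program named after #! really gets handed our file, we can point it at something that isn’t a Shell at all. The sketch below assumes your system has the cat program at /bin/cat (most do, but check if it doesn’t work for you). cat simply prints the content of the files it gets as parameters. The name print_myself is made up.

```shell
# print_myself is a made-up name; /bin/cat is assumed to exist.
mkdir -p /tmp/learning_shell
script=/tmp/learning_shell/print_myself

echo '#!/bin/cat' > "$script"
echo 'I am not a command, just some text' >> "$script"
chmod +x "$script"

# Running the file starts /bin/cat with the path of the file as its
# parameter, so the file prints its own content, first line included.
"$script"
```

The output is the two lines of the file itself. cat doesn’t know about comments, so it happily prints the #! line as well.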

The PATH and environment variables

But why do we precede the name of the script with “./”? There are two ways to call an executable. You can either specify it by typing its path, or you can give the name of an executable located in what is called the PATH. Note that when we use the term path (lowercase letters) and PATH (uppercase letters), we mean two different things. A file has a location in the filesystem hierarchy and we represent that location by using what we call a path. In case of the script we made, the so called “full path”, also known as “absolute path”, is /tmp/learning_shell/a_script.sh.

Another way to call the script by its path is by using the relative path. The Unix Shell has several shorthand notations, for example the dot ., double dots .., and tilde ~. The dot represents the current path. In our case that’s /tmp/learning_shell. The double dots represent the parent folder of the directory we are in. In our case, .. represents /tmp. Note that the dots are representations, they are not substitutions like we saw with the dollar sign $. The tilde ~ represents the home folder of the user as whom we are running the process. Unlike the dots, the tilde is substituted by the Shell itself, much like a variable.
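
Here is a small sketch to try these notations out. The folder name subfolder is made up, and the paths in the comments are what I’d expect on a typical Linux system.

```shell
# subfolder is a made-up name for this experiment.
mkdir -p /tmp/learning_shell/subfolder
cd /tmp/learning_shell/subfolder
pwd              # /tmp/learning_shell/subfolder

cd ..            # go to the parent folder
pwd              # /tmp/learning_shell

cd ./subfolder   # a relative path, starting from the current folder
pwd              # /tmp/learning_shell/subfolder

cd ../..         # notations can be combined: go two levels up
pwd              # /tmp

echo ~           # the Shell replaces ~ with your home folder

cd /tmp/learning_shell   # go back to where we were playing
```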

When we use the term PATH, with uppercase letters, we mean something different. The PATH is an ordered list of directories containing executables that may be called directly by their name. It is stored in the variable PATH as a colon-separated list. Try it!

$ echo "$PATH"
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

Your PATH may look slightly different, but most of these folders are typically present; “/bin” and “/sbin” especially should be there. When we just type the name of a command, the Shell will look in several places to see if it finds what you ask to execute. This can be a procedure you defined (which we saw earlier), it could be a built-in command, or it can be an executable that the Shell finds in your PATH. For the PATH, it will start looking in the first folder, and keep looking until it finds a match. This means we can override what our Shell will use by placing things in different directories, or even by expanding the PATH. Let’s experiment a bit.

$ echo '#!/bin/sh' > a_script
$ echo 'echo "This is running"' >> a_script
$ chmod +x a_script
$ a_script
a_script: command not found
$ OLD_PATH="$PATH"
$ PATH="/tmp/learning_shell"
$ a_script
This is running
$ PATH="$OLD_PATH"
$ a_script
a_script: command not found

Here we completely changed the PATH, but normally you would just add a directory to the PATH. For example, if you want to add a directory and have it checked first, you can do PATH="/tmp/learning_shell:$PATH".
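
As a sketch of that last approach, the example below puts a new folder in front of the existing PATH. The names my_tools and say_hi are made up. It also uses command -v, which we haven’t covered; for an executable it prints the path of the file the Shell would run.

```shell
# my_tools and say_hi are made-up names for this example.
mkdir -p /tmp/learning_shell/my_tools
echo '#!/bin/sh' > /tmp/learning_shell/my_tools/say_hi
echo 'echo "hi from my_tools"' >> /tmp/learning_shell/my_tools/say_hi
chmod +x /tmp/learning_shell/my_tools/say_hi

# Put our folder in front; the folders that were already in the PATH
# stay available after it.
PATH="/tmp/learning_shell/my_tools:$PATH"

command -v say_hi   # shows which file the Shell found
say_hi              # runs it by name, no ./ needed
```

Because the rest of the PATH is kept, all your usual commands keep working. The change only lasts for the current Shell; open a new one and say_hi is unknown again.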

Variables like this are often used to pass certain parameters to programs. By default a variable is only accessible in the process you run, but you can use export to make it accessible to child processes as well. Such variables are generally referred to as “environment variables”. Whether parameters can be passed like this depends on how the executable you want to use them for is implemented.

$ echo '#!/bin/sh' > a_script
$ echo 'echo "$MY_VAR"' >> a_script
$ chmod +x a_script
$ MY_VAR=something
$ ./a_script

$ echo "$MY_VAR"
something
$ export MY_VAR
$ ./a_script
something

Another way to pass an environment variable to a process is to put the assignment in front of the command you run. The Shell then passes the variable to that process, but doesn’t set it for itself. We can also unset a variable, which clears it locally and thereby also undoes the export.

$ echo "$MY_VAR"
something
$ ./a_script
something
$ unset MY_VAR
$ echo "$MY_VAR"

$ ./a_script

$ MY_VAR=something
$ echo "$MY_VAR"
something
$ ./a_script

$ MY_VAR=looky_here ./a_script
looky_here
$ echo "$MY_VAR"
something

Type, Keywords and Structures

We already saw procedures and executables, which both execute things. There are also keywords. One example where they are used is for the structures that you typically have in programming languages, like if, while, and case.

if true
then
  echo blop
fi

case blop in
  blop)
    echo 'this is the blop case'
    ;;
  blub)
    echo 'I am a fish'
    ;;
  *)
    echo 'this is something else'
    ;;
esac
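
These structures become more interesting when the value comes from somewhere else, for example a parameter. Here is a minimal sketch with a made-up procedure called describe. The blub|fish notation is something we haven’t seen yet: it lets one case match several values.

```shell
# describe is a made-up procedure name.
describe() {
  case "$1" in
    blop)
      echo 'this is the blop case'
      ;;
    blub|fish)
      # the | lets one case match several values
      echo 'I am a fish'
      ;;
    *)
      echo "no idea what $1 is"
      ;;
  esac
}

describe blop
describe fish
describe 42
```

This prints “this is the blop case”, “I am a fish”, and “no idea what 42 is”.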

To know what type a certain command is, you can use the type command. The most typical types are shell builtin, keyword, or executable file.

$ type echo
echo is a shell builtin
$ type if
if is a shell keyword
$ type chmod
chmod is /usr/bin/chmod
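
type also knows about the procedures you defined yourself. A small sketch, reusing the do_something procedure from earlier; the exact wording of the output differs between Shells, so I’m not showing it here.

```shell
do_something() {
  echo "we are doing something"
}

type do_something   # reported as a function (what we called a procedure)
type pwd            # usually a shell builtin
type ls             # usually an executable file somewhere in the PATH
```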

test and arithmetic operations

We can also do tests that resolve to true or false, and we can do arithmetic. Let’s use both in a while structure, using the while keyword.

number=0

while [ "$number" -lt 10 ]
do
  echo "$number is less than 10"
  number=$((number + 1))
done
done
echo "We are done, $number is not less than 10"
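
The same tests work in an if structure. The sketch below uses a made-up procedure called compare, together with the elif and else keywords and the -eq test, none of which we covered above. The last call shows that the result of an arithmetic operation can be passed as a parameter like any other value.

```shell
# compare is a made-up procedure name.
compare() {
  if [ "$1" -lt "$2" ]
  then
    echo "$1 is less than $2"
  elif [ "$1" -eq "$2" ]
  then
    echo "$1 equals $2"
  else
    echo "$1 is greater than $2"
  fi
}

compare 3 10
compare 10 10
compare "$((6 * 7))" 10
```

This prints “3 is less than 10”, “10 equals 10”, and “42 is greater than 10”.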

Let’s write a program!

We’ve seen a lot of concepts that go way beyond simply running commands. To prove that Shell is an actual programming language, let’s write a small program.

Paste the following in a file, make it executable, and run it! The program doesn’t have any error handling at the moment, feel free to add some yourself!

Note that this uses some commands and notations we haven’t mentioned before. Can you understand what they do?

#!/bin/sh

echo "Let's play a guessing game!"
echo "I'll think of a number from 0 to 9, you guess what the number is."
echo "Let's go!"
echo ""

number_to_guess=$(cat /dev/urandom | base64 | tr -cd '0-9' | head -c 1)
running=true

while $running
do
  echo "Ayt, what number am I thinking of?"
  read guessed_number

  if [ $guessed_number -lt $number_to_guess ]
  then
    echo 'Nooo, think bigger! The number I have in mind is bigger than that!'
  fi

  if [ $guessed_number -gt $number_to_guess ]
  then
    echo 'Whooo, slow down! The number I have in mind is not so big!'
  fi

  if [ $guessed_number -eq $number_to_guess ]
  then
    echo "Yes, you found it! The number I was thinking of is indeed $number_to_guess""! How did you know ^^'"
    running=false
  fi
done

echo ''
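
If you want a starting point for the error handling, here is one possible sketch, certainly not the only way to do it. It defines a made-up procedure is_single_digit that checks whether its parameter is exactly one digit. It uses return and for, two keywords we haven’t covered: return ends a procedure with a result (0 means success), and for repeats a block for every value in a list.

```shell
# is_single_digit is a made-up helper for the guessing game.
# It succeeds only when its parameter is exactly one digit.
is_single_digit() {
  case "$1" in
    [0-9]) return 0 ;;
    *)     return 1 ;;
  esac
}

# Try the helper on a few example inputs, including an empty one.
for candidate in 7 42 seven ''
do
  if is_single_digit "$candidate"
  then
    echo "'$candidate' is a valid guess"
  else
    echo "'$candidate' is not a valid guess, please type one digit"
  fi
done
```

Only ‘7’ is reported as a valid guess. In the game, you could call is_single_digit "$guessed_number" right after the read line and only do the three comparisons when it succeeds.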