Let's Build a Linux Shell [Part I]

by Mohammed Isam, June 8th, 2020

Since the early days of Unix, the shell has been part of the user's interface with the operating system. The first Unix shell (the Thompson shell) had very limited features, mainly I/O redirection and command pipelines. Later shells expanded on that early shell and added more and more capabilities, which gave us powerful features that include word expansion, history substitution, loops and conditional expressions, among many others.

Why This Tutorial

Over the past 20 years, I've been using GNU/Linux as my main operating system. I've used many GNU/Linux shells, including but not limited to bash, ksh, and zsh. However, I've always been bugged by this question: what makes the shell tick? Like, for example:

How does the shell parse my commands, convert them to executable instructions, and then perform these commands?
How does the shell perform the different word expansion procedures, such as parameter expansion, command substitution, and arithmetic expansion?
How does the shell implement I/O redirection?
... and so on.

As most GNU/Linux shells are open-sourced, if you want to learn the inner workings of the shell, you can search online for the source code and start digging in (that's what I actually did). But this advice is actually easier said than done. For example, where exactly should you start reading the code from? Which source files contain the code that implements I/O redirection? Where can I find the code that parses user commands? I guess you got the point.

This is why I’ve decided to write this tutorial, to help Linux users and programmers gain a better understanding of their shells. Together, we are going to implement a fully functional Linux shell, from scratch. Along the way, we'll see how a Linux shell manages to parse and execute commands, loops, and conditional expressions by actually writing the C code that does the above tasks. We’ll talk about word expansions and I/O redirection, and we’ll see the code that performs these features.

By the end of this tutorial, we’ll have a basic Linux shell, that will not do much for now, but which we’ll expand and improve in the next parts. At the end of this series, we’ll have a fully functional Linux shell that can parse and execute a fairly complex set of commands, loops, and expressions.

What You Will Need

In order to follow this tutorial, you will need the following:

A working GNU/Linux system (I personally use Ubuntu and Fedora, but feel free to use your favorite Linux distribution).
GCC (the GNU Compiler Collection) to compile the code.
A text editor to write the code (I personally use GEdit, but you can use Vim, Emacs, or any other editor as well).
A working knowledge of C programming.

I'm not going to dive into the details of installing the required software here. If you are not sure how to get your system running any of the above software packages, please refer to your Linux distribution's documentation and make sure you have everything set up before going further.

Now let's get down to business. We’ll start by having a bird’s eye view of what constitutes a Linux shell.

Components of a Linux Shell

The shell is a complex piece of software that contains many different parts.

The core part of any Linux shell is the Command Line Interpreter, or CLI. This part serves two purposes: it reads and parses user commands, then it executes the parsed commands. You can think of the CLI itself as having two parts: a parser (or front-end), and an executor (or back-end).

The parser scans input and breaks it down to tokens. A token consists of one or more characters (letters, digits, symbols), and represents a single unit of input. For example, a token can be a variable name, a keyword, a number, or an arithmetic operator.
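To make the idea of tokens a bit more concrete, here is a minimal sketch that splits a command line into whitespace-delimited tokens. The name split_tokens() and the MAX_TOKENS limit are assumptions made up for this illustration. A real shell scanner also has to deal with quotes, operators, and escapes, none of which this sketch attempts.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 16   /* arbitrary limit chosen for this sketch */

/*
 * Hypothetical helper: split 'line' in place into whitespace-delimited
 * tokens. Pointers to the tokens are stored in 'tokens', and the number
 * of tokens found (at most MAX_TOKENS) is returned.
 */
int split_tokens(char *line, char *tokens[MAX_TOKENS])
{
    int count = 0;
    char *p = line;

    assert(line != NULL && tokens != NULL);

    while(*p && count < MAX_TOKENS)
    {
        /* skip the whitespace before the next token */
        while(*p && isspace((unsigned char)*p))
        {
            p++;
        }

        if(!*p)
        {
            break;
        }

        /* the token starts here */
        tokens[count++] = p;

        /* advance to the end of the token */
        while(*p && !isspace((unsigned char)*p))
        {
            p++;
        }

        /* terminate the token and step past the delimiter */
        if(*p)
        {
            *p++ = '\0';
        }
    }

    return count;
}
```

Given the input line echo Hello World, this sketch would produce three tokens: echo, Hello, and World.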

The parser takes these tokens, groups them together, and creates a special structure we call the Abstract Syntax Tree, or AST. You can think of the AST as a high level representation of the command line you gave to the shell. The parser takes the AST and passes it to the executor, which reads the AST and executes the parsed command.
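One possible way to picture an AST in C is a tree of nodes, where a command node owns one child node per word. The sketch below is only an illustration under that assumption. The struct layout and the names new_node(), add_child(), and free_tree() are hypothetical, not the definitions used later in this series.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node types for this sketch. */
enum node_type { NODE_COMMAND, NODE_WORD };

struct node
{
    enum node_type type;
    char *text;                  /* the word's text, or NULL */
    int children;                /* number of child nodes */
    struct node *first_child;
    struct node *next_sibling;
};

/* Allocate a node, copying 'text' if one is given. */
struct node *new_node(enum node_type type, const char *text)
{
    struct node *n = calloc(1, sizeof(*n));

    if(!n)
    {
        return NULL;
    }

    n->type = type;

    if(text)
    {
        n->text = malloc(strlen(text)+1);

        if(!n->text)
        {
            free(n);
            return NULL;
        }

        strcpy(n->text, text);
    }

    return n;
}

/* Append 'child' to the end of the parent's list of children. */
void add_child(struct node *parent, struct node *child)
{
    assert(parent != NULL && child != NULL);

    if(!parent->first_child)
    {
        parent->first_child = child;
    }
    else
    {
        struct node *last = parent->first_child;

        while(last->next_sibling)
        {
            last = last->next_sibling;
        }

        last->next_sibling = child;
    }

    parent->children++;
}

/* Free a node together with all of its descendants. */
void free_tree(struct node *n)
{
    struct node *child;

    if(!n)
    {
        return;
    }

    child = n->first_child;

    while(child)
    {
        struct node *next = child->next_sibling;
        free_tree(child);
        child = next;
    }

    free(n->text);
    free(n);
}
```

With this layout, the command echo Hello would become one NODE_COMMAND node with two NODE_WORD children, which an executor could walk to build its argument list.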

Another part of the shell is the user interface, which usually operates when the shell is in the interactive mode, for example, when you are entering commands at the shell prompt. Here the shell runs in a loop, which we know as the Read-Eval-Print-Loop, or REPL.

As the loop's name indicates, the shell reads input, parses and executes it, then loops to read the next command, and so on until you enter a command such as exit, shutdown, or reboot.

Most shells implement a structure known as the symbol table, which the shell uses to store information about variables, along with their values and attributes. We'll implement the symbol table in part II of this tutorial.
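As a rough preview of the idea, a symbol table can be as simple as an array of name/value pairs with a lookup function. The sketch below assumes fixed-size storage and leaves out attributes entirely. The names symtab_lookup() and symtab_set(), and all the size limits, are hypothetical and are not the design we'll build in part II.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 32    /* arbitrary limits for this sketch */
#define MAX_NAME    32
#define MAX_VALUE   128

struct symbol
{
    char name[MAX_NAME];
    char value[MAX_VALUE];
};

struct symtab
{
    struct symbol entries[MAX_SYMBOLS];
    int count;
};

/* Return the entry called 'name', or NULL if there is none. */
struct symbol *symtab_lookup(struct symtab *tab, const char *name)
{
    int i;

    assert(tab != NULL && name != NULL);

    for(i = 0; i < tab->count; i++)
    {
        if(strcmp(tab->entries[i].name, name) == 0)
        {
            return &tab->entries[i];
        }
    }

    return NULL;
}

/*
 * Set name=value, adding a new entry if needed. Returns 0 on success,
 * or -1 if the strings are too long or the table is full.
 */
int symtab_set(struct symtab *tab, const char *name, const char *value)
{
    struct symbol *sym;

    assert(tab != NULL && name != NULL && value != NULL);

    if(strlen(name) >= MAX_NAME || strlen(value) >= MAX_VALUE)
    {
        return -1;
    }

    sym = symtab_lookup(tab, name);

    if(!sym)
    {
        if(tab->count == MAX_SYMBOLS)
        {
            return -1;
        }

        sym = &tab->entries[tab->count++];
        strcpy(sym->name, name);
    }

    strcpy(sym->value, value);
    return 0;
}
```

A linear search is fine for a handful of variables. A hash table would be the more usual choice once the number of variables grows.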

Linux shells also have a history facility, which allows the user to access the most recently entered commands, then edit and re-execute commands without much typing. A shell can also contain builtin utilities, which are a special set of commands that are implemented as part of the shell program itself.

Builtin utilities include commonly used commands, such as cd, fg, and bg. We'll implement many of the builtin utilities as we move along with this tutorial.
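One common way to organize builtin utilities is a table that maps each command name to the function implementing it. The sketch below illustrates only that lookup. The builtin_fn type, the find_builtin() function, and the do-nothing placeholder bodies are all assumptions made for this example, not the implementation we'll write later.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Every builtin in this sketch takes main()-style arguments. */
typedef int (*builtin_fn)(int argc, char **argv);

struct builtin
{
    const char *name;
    builtin_fn  func;
};

/* Placeholder bodies: they do no real work and only report success. */
static int builtin_cd(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}

static int builtin_fg(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}

static int builtin_bg(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}

static const struct builtin builtins[] =
{
    { "cd", builtin_cd },
    { "fg", builtin_fg },
    { "bg", builtin_bg },
};

/* Return the function implementing 'name', or NULL if it isn't a builtin. */
builtin_fn find_builtin(const char *name)
{
    size_t i;

    assert(name != NULL);

    for(i = 0; i < sizeof(builtins)/sizeof(builtins[0]); i++)
    {
        if(strcmp(builtins[i].name, name) == 0)
        {
            return builtins[i].func;
        }
    }

    return NULL;
}
```

An executor could call find_builtin() with the first word of a command and, if it returns NULL, go on to search for an external program instead.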

Now that we know the basic components of a typical Linux shell, let's start building our own shell.

Our First Shell

Our first version of the shell won't do anything fancy; it will just print a prompt string, read a line of input, then echo the input back to the screen. In subsequent parts of this tutorial, we’ll add the capability to parse and execute commands, loops, conditional expressions, and much more.

Let's start by creating a directory for this project. I usually use ~/projects/ for my new projects, but feel free to use whatever path you're comfortable with.

The first thing we'll do is to write our basic REPL loop. Create a file named main.c (using touch main.c), then open it using your favorite text editor. Enter the following code in your main.c file:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include "shell.h"

int main(int argc, char **argv)
{
    char *cmd;

    do
    {
        print_prompt1();

        cmd = read_cmd();

        if(!cmd)
        {
            exit(EXIT_SUCCESS);
        }

        if(cmd[0] == '\0' || strcmp(cmd, "\n") == 0)
        {
            free(cmd);
            continue;
        }

        if(strcmp(cmd, "exit\n") == 0)
        {
            free(cmd);
            break;
        }

        printf("%s\n", cmd);

        free(cmd);

    } while(1);

    exit(EXIT_SUCCESS);
}

Our main() function is quite simple, as it only needs to implement the REPL loop. We first print the shell's prompt, then we read a command (for now, let's define a command as an input line ending with \n). If there's an error reading the command, or if we reach the end of input, we exit the shell. If the command is empty (i.e. the user pressed ENTER without writing anything), we skip this input and continue with the loop.

If the command is exit, we exit the shell. Otherwise, we echo back the command, free the memory we used to store the command, and continue with the loop. Pretty simple, isn't it?

Our main() function calls two custom functions, print_prompt1() and read_cmd(). The first function prints the prompt string, and the second one reads the next line of input. Let’s have a closer look at those two functions.

Printing Prompt Strings

We said that the shell prints a prompt string before reading each command. In fact, there are five different types of prompt string: PS0, PS1, PS2, PS3, and PS4. The zeroth string, PS0, is only used by bash, so we won’t consider it here. The other four strings are printed at certain times, when the shell wants to convey certain messages to the user.

In this section, we’ll talk about PS1 and PS2. The rest will come later on when we discuss more advanced shell topics.

Now create the source file

prompt.c

and enter the following code:

#include <stdio.h>
#include "shell.h"

void print_prompt1(void)
{
    fprintf(stderr, "$ ");
}

void print_prompt2(void)
{
    fprintf(stderr, "> ");
}

The first function prints the first prompt string, or PS1, which you usually see when the shell is waiting for you to enter a command. The second function prints the second prompt string, or PS2, which is printed by the shell when you enter a multi-line command (more on this below).
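The two functions above hard-code their prompt strings. As a small illustration of how a prompt could be made configurable, the sketch below looks the string up in the environment and falls back to a default. The names get_prompt() and print_prompt_from_env() are hypothetical. Keep in mind that PS1 is often not exported, so getenv() may well find nothing, and that real shells also expand special sequences inside the prompt, which this sketch ignores.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the value of the environment variable
 * 'varname', or 'fallback' if the variable is unset or empty.
 */
const char *get_prompt(const char *varname, const char *fallback)
{
    const char *value;

    assert(varname != NULL && fallback != NULL);

    value = getenv(varname);

    return (value && *value) ? value : fallback;
}

/* Print PS1 from the environment, defaulting to "$ ". */
void print_prompt_from_env(void)
{
    fprintf(stderr, "%s", get_prompt("PS1", "$ "));
}
```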

Next, let’s read some user input.

Reading User Input

Open the file

main.c

and enter the following code at the end, right after the

main()

function:

char *read_cmd(void)
{
    char buf[1024];
    char *ptr = NULL;
    int ptrlen = 0;   /* an int, since a char counter would overflow on long input */

    while(fgets(buf, 1024, stdin))
    {
        int buflen = strlen(buf);

        if(!ptr)
        {
            ptr = malloc(buflen+1);
        }
        else
        {
            char *ptr2 = realloc(ptr, ptrlen+buflen+1);

            if(ptr2)
            {
                ptr = ptr2;
            }
            else
            {
                free(ptr);
                ptr = NULL;
            }
        }

        if(!ptr)
        {
            fprintf(stderr, "error: failed to alloc buffer: %s\n", strerror(errno));
            return NULL;
        }

        strcpy(ptr+ptrlen, buf);

        if(buf[buflen-1] == '\n')
        {
            if(buflen == 1 || buf[buflen-2] != '\\')
            {
                return ptr;
            }

            ptr[ptrlen+buflen-2] = '\0';
            buflen -= 2;
            print_prompt2();
        }

        ptrlen += buflen;
    }

    return ptr;
}

Here we read input from stdin in 1024-byte chunks and store the input in a buffer. The first time we read input (the first chunk for the current command), we create our buffer using malloc(). For subsequent chunks, we extend the buffer using realloc(). We shouldn’t encounter any memory issues here, but if something goes wrong, we print an error message and return NULL. If everything goes well, we copy the chunk of input we’ve just read from the user to our buffer, and we adjust our pointers accordingly.
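The grow-and-copy step described here can also be pulled out into a small helper of its own. The following is a minimal sketch of that idea, not the code our shell uses. The name append_chunk() is hypothetical, and it relies on the fact that realloc(NULL, n) behaves like malloc(n), so the first chunk needs no special case.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: append 'chunk' to the heap string *buf, whose
 * current length (not counting the terminating NUL) is *len.
 * Returns 0 on success. On allocation failure, returns -1 and leaves
 * *buf and *len untouched, so the caller can still free the old buffer.
 */
int append_chunk(char **buf, size_t *len, const char *chunk)
{
    size_t chunklen;
    char *tmp;

    assert(buf != NULL && len != NULL && chunk != NULL);

    chunklen = strlen(chunk);

    /* when *buf is NULL, this call acts like malloc() */
    tmp = realloc(*buf, *len + chunklen + 1);

    if(!tmp)
    {
        return -1;
    }

    /* copy the chunk, including its terminating NUL */
    memcpy(tmp + *len, chunk, chunklen + 1);

    *buf = tmp;
    *len += chunklen;

    return 0;
}
```

Starting with a NULL buffer and a length of zero, each call extends the string by one chunk, and the caller frees the buffer when done. Using size_t for the length avoids any overflow concern on long input.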

The final block of code is interesting. To understand why we need this block of code, let’s consider the following example. Let’s say you want to enter a really, really long line of input:

echo "This is a very long line of input, one that needs to span two, three, or perhaps even more lines of input, so that we can feed it to the shell"

This is a silly example, but it perfectly demonstrates what we’re talking about. To enter such a long command, we can write the whole thing in one line (as we did here), which is a cumbersome and ugly process. Or we can chop the line into smaller pieces and feed those pieces to the shell, one piece at a time:

echo "This is a very long line of input, \
      one that needs to span two, three, \
      or perhaps even more lines of input, \
      so that we can feed it to the shell"

To let the shell know we haven't finished our input, we terminate every line except the last with a backslash character (\, written as '\\' in C source), followed by a newline (I also indented the lines to make them more readable). We call this escaping the newline character. When the shell sees the escaped newline, it knows it needs to discard the two characters and continue reading input.

Now let’s go back to our read_cmd() function. We were discussing the last block of code, the one that reads:

        if(buf[buflen-1] == '\n')
        {
            if(buflen == 1 || buf[buflen-2] != '\\')
            {
                return ptr;
            }

            ptr[ptrlen+buflen-2] = '\0';
            buflen -= 2;
            print_prompt2();
        }

Here, we check to see if the input we’ve got in the buffer ends with \n and, if so, whether the \n is escaped by a backslash character (\). If the last \n is not escaped, the input line is complete and we return it to the main() function. Otherwise, we remove the two characters (\ and \n), print out PS2, and continue reading input.
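The same check can be written as a tiny standalone function, which makes it easy to try on a few strings. This is just a sketch for illustration, and the name strip_escaped_newline() is hypothetical. Like the code above, it treats any backslash right before the newline as an escape, so it does not handle the case where that backslash is itself escaped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if 'line' ends with a backslash followed by a
 * newline, remove both characters and return 1 (more input is needed).
 * Otherwise, leave 'line' untouched and return 0.
 */
int strip_escaped_newline(char *line)
{
    size_t len;

    assert(line != NULL);

    len = strlen(line);

    if(len >= 2 && line[len-1] == '\n' && line[len-2] == '\\')
    {
        line[len-2] = '\0';
        return 1;
    }

    return 0;
}
```

For a line such as echo one \ followed by a newline, the function cuts off the last two characters and returns 1. A plain line ending in a newline is left as it is.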

Compiling the Shell

With the above code, our little shell is almost ready to be compiled. We’ll just add a header file with our function prototypes before we proceed to compile the shell. Both main.c and prompt.c include this header, so the compiler needs it. Keeping the prototypes in one place also improves our code's readability and prevents a few compiler warnings.

Create the header file shell.h, and enter the following code:

#ifndef SHELL_H
#define SHELL_H

void print_prompt1(void);
void print_prompt2(void);

char *read_cmd(void);

#endif

Now let’s compile the shell. Open your favorite terminal emulator (I test my command line projects using GNOME Terminal and Konsole, but you can as well use XTerm, other terminal emulators, or one of Linux’s virtual consoles). Navigate to your source directory and make sure you have three files in there: main.c, prompt.c, and shell.h.

Now compile the shell using the following command:

gcc -o shell main.c prompt.c

If everything goes well, gcc should not output anything, and there should be an executable file named shell in the current directory.

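Now invoke the shell by running ./shell, and try entering a few commands: a short one, a long one typed on a single line, and the same long command split across four lines using backslashes.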

In the first case, the shell prints PS1, which defaults to $ and a space. We enter our command, echo Hello World, which the shell echoes back to us (we’ll extend our shell in part II to enable it to parse and execute this and other simple commands).

In the second case, the shell again echoes our (slightly long) command. In the third case, we split the long command into 4 lines. Notice how every time we type a backslash followed by ENTER, the shell prints PS2 and continues to read input. After the last line is entered, the shell amalgamates all the lines, removing all escaped newline characters, and echoes the command back to us.

To exit from the shell, type exit, followed by ENTER.

And that’s it! We’ve just finished writing our very first Linux shell. Yay!

What's Next

Although our shell currently works, it doesn’t do anything useful. In the next part, we’ll extend our shell so that it can parse and execute simple commands.

Stay tuned!