Self-contained Linux applications with lone lisp

I started the lone lisp project to create a lisp language and environment exclusively for Linux. I built into it support for arbitrary Linux system calls so that it would be possible to implement any program without need for any external dependencies and so that every Linux feature would be available to programmers.

Although far from ready for production use, I have achieved a significant milestone in the development of the language: it is now possible to create self-contained, standalone, freestanding, redistributable Linux applications written entirely in lisp.

Lisp code can now be embedded directly into a copy of the lone interpreter which may then be copied to and run on any Linux system of the same architecture, unmodified and without any external dependencies. The application is limited only by the system calls it uses: newer system calls will naturally require newer kernels.

In this article I will demonstrate this capability, explain how it works and the journey to implementing it. Every script bundling tool I've ever seen unpacks the code to some file system location and then reads it back in. I came up with a different solution.

A simple implementation of `env` in lone lisp

The env utility is among the simplest programs available in Unix-like operating systems. It's most fundamental function is to print the user's environment. That function can be trivially implemented in lone lisp:

(import (lone print) (linux environment))

(print environment)

Running this simple program produces a table of environment variables and their values:

$ ./lone < env.ln
{ "HOME" "/home/matheus" "EDITOR" "vim" … }

Embedding `env.ln` into the interpreter

In order to transform env.ln into a self-contained env program, the lone lisp code must be embedded into a copy of the interpreter. This can be achieved by the purpose-built lone-embed tool:

$ cp lone env
$ lone-embed env env.ln

The interpreter will then seamlessly load and execute the code:

$ ./env
{ "HOME" "/home/matheus" "EDITOR" "vim" … }

Elven segments

Tracing the system calls of the application with strace reveals an interesting property:

$ strace env
execve("env", ["env"], 0x7fe9752d40 /* 31 vars */) = 0
write(1, "{ ", 2)                                  = 2
write(1, "\"", 1)                                  = 1
write(1, "HOME", 4)                                = 4
write(1, "\"", 1)                                  = 1
write(1, " ", 1)                                   = 1
…

It begins writing its output immediately. There was no need for it to read from the file system. The code must be somewhere else.

$ xxd env | tail -n7
00032fe0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00032ff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00033000: 7b20 7275 6e20 2830 202e 2036 3329 207d  { run (0 . 63) }
00033010: 0a28 696d 706f 7274 2028 6c6f 6e65 2070  .(import (lone p
00033020: 7269 6e74 2920 286c 696e 7578 2065 6e76  rint) (linux env
00033030: 6972 6f6e 6d65 6e74 2929 0a0a 2870 7269  ironment))..(pri
00033040: 6e74 2065 6e76 6972 6f6e 6d65 6e74 290a  nt environment).

It's at the very end of the executable itself, at an aligned offset. Through some elven magic the interpreter is reflecting upon its own contents at runtime. It's loading the code from inside itself and evaluating it. Without using any system calls to do it.

The source of this magic is the ELF segment.

$ readelf --segments env | grep 33000
LOAD           0x0000000000033000 0x0000000000312000 0x0000000000312000
LOOS+0xc6f6e65 0x0000000000033000 0x0000000000312000 0x0000000000312000

ELF segments, also called program headers, describe the memory image of the program. There are many types of segments but especially interesting are the LOAD segments which tell Linux which parts of the file must be mapped to which addresses in order for the program to work correctly.

The lone-embed tool copies the lisp code into the ELF and then creates a LOAD segment for it. Linux then maps in the embedded code automatically at load time before the interpreter has even started.

Finding the segment

The code will be in memory but its location and size are unknown. It could be anywhere inside the vastness of virtual address space. To that problem Linux provides an ingenious solution: it tells us where it is.

In addition to argument and environment vectors, processes receive the auxiliary vector. It's essentially a null terminated array of key-value pairs of various types and it's placed right there on the stack just after the environment vector.

struct lone_auxiliary_value {
    union {
        char *c_string;
        void *pointer;
        long integer;
        unsigned long unsigned_integer;
    } as;
};

struct lone_auxiliary_vector {
    long type;
    struct lone_auxiliary_value value;
} *auxv;

void **pointer = (void **) envp;
while (*pointer++ != 0);
auxv = (struct auxiliary_vector *) pointer;

Through this mechanism, Linux passes various bits of useful information to programs. These include things like processor architecture and capabilities, current user and group IDs, some random bytes, the location of the vDSO, the system's page size...

And the location of the program header table.

struct lone_auxiliary_value lone_auxiliary_vector_value(struct lone_auxiliary_vector *auxiliary, long type)
{
    for (/* auxiliary */; auxiliary->type != AT_NULL; ++auxiliary)
        if (auxiliary->type == type)
            return auxiliary->value;

    return (struct lone_auxiliary_value) { .as.integer = 0 };
}

struct lone_elf_segments lone_auxiliary_vector_elf_segments(struct lone_auxiliary_vector *auxv)
{
    return (struct lone_elf_segments) {
        .entry_size  = lone_auxiliary_vector_value(auxv, AT_PHENT).as.unsigned_integer,
        .entry_count = lone_auxiliary_vector_value(auxv, AT_PHNUM).as.unsigned_integer,
        .segments    = lone_auxiliary_vector_value(auxv, AT_PHDR).as.pointer
    };
}

The table is just a contiguous array of ELF program header structures. Given this pointer, all the program has to do is scan the table and find the correct entry. One does not simply search for LOAD entries, however. Attempting to do it uncovers a couple of problems.

The first problem is the fact there are lots of these loadable segments and they're all indistinguishable from one another. ELF sections have unique names for identification, program headers have nothing.

The second problem is their alignment requirements. The addresses and sizes are usually aligned to page boundaries. This obfuscates the true size of the data they contain.

The lone segment

In order to solve this, I created my own custom segment type: the LONE segment.

ELF provides an incredibly generous numeric range for operating system-specific segments. All the values between LOOS and HIOS are free for operating systems to allocate. Those definitions represent a range of integers between 0x60000000 and 0x6FFFFFFF inclusive. That's 268,435,455 magic numbers.

So I just picked one. That's what the LOOS+0xc6f6e65 means. Spelled out lone in ASCII and it just worked itself out.

//      PT_LONE   l o n e
#define PT_LONE 0x6c6f6e65

I figured if GNU can do it then I can do it too.

The LONE segment is not loadable and thus does not have any alignment requirements, allowing it to describe the embedded segment exactly. It also serves as a magic number which makes it trivial to search for it in the program header table. Once found, it contains all the information lone needs to load and execute the code.

typedef Elf64_Phdr lone_elf_segment; // if 64 bit
typedef Elf32_Phdr lone_elf_segment; // if 32 bit

lone_elf_segment *lone_auxiliary_vector_embedded_segment(struct lone_auxiliary_vector *auxv)
{
    struct lone_elf_segments table = lone_auxiliary_vector_elf_segments(auxv);

    for (size_t i = 0; i < table.entry_count; ++i) {
        lone_elf_segment *entry = &table.segments[i];

        if (entry->p_type == PT_LONE)
            return entry;
    }

    return 0;
}

lone_elf_segment *segment = lone_auxiliary_vector_embedded_segment(auxv);
struct lone_bytes data;
data.count = segment->p_memsz;
data.pointer = (unsigned char *) segment->p_vaddr;

The interpreter now has the address and size of the data embedded into its own executable. At this point it's smooth sailing.

All that's left to do is to make sense of the bytes. I chose to simply put a descriptor object in there and have the interpreter read it in. Seemed like the simplest possible solution.

{ run (0 . 63) }

Just a good old hash table in lone lisp syntax. The run key contains the offset and size of the code the interpreter should run, relative to the end of the descriptor object. It just reads in that slice and evaluates it.

Since it's just a normal hash table, it's also infinitely extensible with arbitrary schemas. I plan to implement a modules key to contain an arbitrary number of lone modules so that programs can statically link their libraries into the lone interpreter. All I gotta do is place the embedded segment into the module search path, before all the others. I suppose I could also allow configuring the interpreter via the descriptor object, eliminating the need for command line switches.

Special linker features

Remember the lone-embed tool which I briefly mentioned at the beginning of the articled? It's an ELF patching tool which I built specifically for this purpose and it's doing a lot of heavy lifting here.

When programs are compiled and linked, the program headers are set in stone. Yet to do all of this I needed to append new segments to the program header table. This turned out to be much harder than anticipated.

The program header table usually follows the ELF header and precedes the contents of the file. Appending new items to it would require resizing the table, which would require shifting up the addresses of everything that comes after it, invalidating pointers to the old addresses. As far as I can tell, it just can't be done without reinventing the linker itself.

I tried to move the table to the end of the file instead but couldn't get that to work either. My program was somehow segfaulting before it even reached the entry point, gdb was useless, I couldn't understand what was going on and was reduced to desperately dumping readelf output on stackoverflow in hopes someone would spot the problem. Well someone did and quickly at that but clearly this was not a sustainable software development strategy.

The `mold` linker saves the day

The linker is in a privileged position. It knows everything there is to know about the program's memory layout and can easily add new ELF segments to it seemingly at will. If a solution exists, I was convinced it would be found in the linker.

So I started learning linker script. Turns out it has a PHDRS command which is exactly what I needed but I couldn't figure out how to use it. I just kept getting "unable to allocate headers" errors no matter what I tried. I concluded it would be easier to simply ask for this feature instead.

So I emailed the GNU binutils mailing list... Then I created an LLVM issue...

Then I opened a mold issue. The maintainer immediately understood what I wanted to do and made it happen with what was essentially a single line of code change. Just beyond awesome.

I waited eagerly for the new release but got so excited I built this huge C++ project from source on my smartphone just to integrate lone with it. All I had to do was put -Wl,--spare-program-headers,2 in the LDFLAGS. It gave me two NULL program headers for my patching utility to edit any way it wanted. It worked perfectly.

So far mold is the only linker with this feature and it's absolutely required for lone-embed to work. It will outright refuse to patch the ELF if it doesn't find at least two NULL program headers in it.

Would be nice if the others gained this feature too. Unless I can figure out a way to move the program header table to the end of the file without breaking everything, I'm pretty much locked into using mold. Well, I don't really mind. It's an awesome linker and it's free software. I'm OK with this!