Matheus Moreira

Generators in lone lisp

Delimited continuations were not enough. Gotta have more. Gotta have generators.

Lone now features its own dedicated generator type. It will eventually form the foundation of iteration in lone lisp.

(import (lone print set lambda generator yield) (math +))

(set f (lambda (x)
  (yield (+ x 1))
  (yield (+ x 2))
  (yield (+ x 3))
  x))

(set g (generator f))

(print (g 1)); 2
(print (g));   3
(print (g));   4
(print (g));   1
(print (g));   error

Generators are also known as semicoroutines: specialized coroutines that only ever yield control back to their own callers. Sounds like an oddly academic distinction to make but it's actually a huge simplification that makes them much easier to not only understand but also to implement.

From delimited continuations to coroutines

Now, there is absolutely no doubt that delimited continuations are powerful abstractions. It's true. Generators could have been built on top of them. I think pretty much any sort of effects system can be built on top of them.

But just because you can doesn't mean you should. There is such a thing as having too much power. It often costs you. Powerful spells spend too much mana. Delimited continuations are no exception. In their case, that cost is called memcpy.

The point of generators is to generate values. That means iteration. Loops. Hot paths. Big lists. Infinite lists. Spinning. Round and round it goes. Over and over again. Eternally. Until it halts. Or not.

So it seems straightforward to conclude that maybe one should not be in the habit of memcpying entire stacks back and forth right in the middle of it all. I'll be the first to admit that lone is no paragon of speed but this fails even my considerably lax performance requirements. Can't build a foundation for iteration out of this.

Delimited continuations are too powerful. Gotta slap some restraining bolts onto these things. Let's begin with the fact delimited continuations are multishot.

The stack represents pending computations. Copy that stuff into a value. Instant continuations. Now, no matter where you are in the program, you can just copy them back onto your stack to get all your computations back just the way they were.

Note that doing this does not consume the continuation. You can restore the computations as many times as needed. It's a partial stack frame which gets copied when reactivated. There are no limits on how many times it can be copied.

Continuations are like small reusable computing subsets. A fragment of the stack, forever frozen in time. They do nothing unless you copy them to the stack of a real computer processor. They can't compute on their own.

Maybe we can get rid of all this copying if we increase their ability to compute. Real computers have their own stacks. Let's give continuations a stack of their own.

struct lone_lisp_generator {
    struct {
        struct lone_lisp_machine_stack own;
    } stacks;
};

The computations performed by continuations are determined by the subset of the stack they captured when they were created. Real computers start as blank slate. Gotta explicitly feed them a program to run. Why not a function?

struct lone_lisp_generator {
    struct lone_lisp_value function;
    struct {
        struct lone_lisp_machine_stack own;
    } stacks;
};

At this point we'll end up reinventing the lone lisp machine. There's no reason to do that though. These computations will run concurrently, but not in parallel. Only one machine will be active at any given time. Since they are mutually exclusive, they can share resources. Namely, those of the lisp machine itself. The stacks can be swapped in and out as needed. The machine registers can be shared.

And just like that, coroutines are born. They are essentially separate stacks that you can switch to at will. Switch to, not copy over. Begone, memcpy. Begone.

From coroutines to semicoroutines

It turns out that coroutines are still too powerful. Sure, we can switch stacks freely now but that just means we have a completely new class of problems to think about. Now we gotta keep track of all those independent stacks. Gotta think about which of them we want to switch to. Gotta think about when to switch. It hurts the brain.

Let's rein them in a little more. In order to reduce cognitive load, we'll choose where the coroutines will switch to ahead of time so that the programmer doesn't have to think about it.

struct lone_lisp_generator {
    struct lone_lisp_value function;
    struct {
        struct lone_lisp_machine_stack own;
        struct lone_lisp_machine_stack caller;
    } stacks;
};

Semicoroutines. Just normal coroutines that keep track of where they are supposed to yield control back to. In this case, the code that called them.

This is exactly what we need. People call functions in order to get values. It's only natural that they return values back to whoever called them. Semicoroutines can return values over and over again.

These stacks reveal everything worth knowing about the state of the generator. If the stack top equals the base, the generator hasn't actually run yet. If the top is null, then it's finished its program. If the top has moved away from the base, then it's either executing or suspended. If there is no caller stack, it's suspended. Otherwise, it is executing.

Generators in the lone lisp machine

Some sleight of hand is necessary in order to make the magic happen. Most of it is in the APPLICATION step of the machine which handles calling the generators.

Let's get some things out of the way first. Start by running some sanity checks to ensure the generator is valid.

case LONE_LISP_TYPE_GENERATOR:
    generator = &lone_lisp_heap_value_of(lone, machine->applicable)->as.generator;
    if (!generator->stacks.own.top) { /* generator is finished */ }
    if (generator->stacks.caller.base) { /* generator is already running */ }
    /* ... */
    return true;

It is an error to call a generator that has already finished. Somehow calling a generator that is already running is just as nonsensical, if not even more so. Doing either of these things will crash the program.

The next step is to swap the generator's stack into the machine. What about the machine's stack? Just shove it into the generator itself. Yeah.

generator->stacks.caller = machine->stack;
machine->stack = generator->stacks.own;

Now there are two cases that must be handled. Either the generator needs to be started or it needs to be resumed.

if (generator->stacks.own.top == generator->stacks.own.base) {
    /* start generator */
} else {
    /* resume generator */
}

Starting the generator is easy enough. Simply rig the machine so that it applies the arguments to the generator's function. Also set up the generator return step which is handled separately.

lone_lisp_machine_push_step(lone, machine, LONE_LISP_MACHINE_STEP_GENERATOR_RETURN);
machine->expression = lone_lisp_list_create(lone, generator->function, machine->list);
machine->step = LONE_LISP_MACHINE_STEP_EXPRESSION_EVALUATION;

This is a good opportunity to set up some metadata. Primitives will want to look for the generator. A delimiter can be used to hold a reference to it. Push one as the zeroth frame of the generator's stack.

lone_lisp_machine_push(
    lone,
    machine,
    (struct lone_lisp_machine_stack_frame) {
        .type = LONE_LISP_MACHINE_STACK_FRAME_TYPE_GENERATOR_DELIMITER,
        .as.value = machine->applicable,
});

Now primitives will always know where it is. Can't argue with a solution that works.

Resuming the generator turns out to be even simpler. Just flow the argument into the generator's stack as a value and move on.

if (lone_lisp_list_has_rest(lone, machine->list)) { goto too_many_arguments; }
machine->value = lone_lisp_list_first(lone, machine->list);
machine->step = LONE_LISP_MACHINE_STEP_AFTER_APPLICATION;

Generator returns are handled as a separate machine step. Just gotta undo what was done in the application step. Restore the caller's stack then null all of the generator's pointers to it. Null the top pointer to mark it as finished.

case LONE_LISP_MACHINE_STEP_GENERATOR_RETURN:
    /* ... */
    generator->stacks.own.top = 0;
    machine->stack = generator->stacks.caller;
    generator->stacks.caller = (struct lone_lisp_machine_stack) { 0 };
    /* ... */

The application step the machine executed while evaluating the generator's function has already set up its return value.

The primitives

The generator primitive is pretty boring. It's just the constructor for generator values. Its entire purpose is to run this line and return the result:

generator = lone_lisp_generator_create(lone, function, 500);

The magic number is just the generator's stack size. Chosen by fair dice roll. Guaranteed to be random. It's totally got nothing to do with joining some elite functional programming cabal. Nothing to see here. The yield primitive is far more interesting.

LONE_LISP_PRIMITIVE(lone_yield)
{
    /* variables */

    switch (step) {
    case 0:

        arguments = lone_lisp_machine_pop_value(lone, machine);
        if (lone_lisp_list_destructure(lone, arguments, 1, &value)) {
            value = arguments;
        }

        delimiter = &machine->stack.base[0];
        if (delimiter->type != LONE_LISP_MACHINE_STACK_FRAME_TYPE_GENERATOR_DELIMITER) {
            /* not inside a generator */ goto error;
        }
        generator = &lone_lisp_heap_value_of(lone, delimiter->as.value)->as.generator;

        generator->stacks.own = machine->stack;
        machine->stack = generator->stacks.caller;
        generator->stacks.caller = (struct lone_lisp_machine_stack) { 0 };

        lone_lisp_machine_push_function_delimiter(lone, machine);
        lone_lisp_machine_push_value(lone, machine, value);
        return 0;

    default:
        goto error;
    }

error:
    linux_exit(-1);
}

The yield primitive finds the generator delimiter at the base of the stack, for that is why it was put there. It saves the generator's stack, for much has happened since the last snapshot. It restores the caller's stack, for that is where execution must return to, as previously prophesied. Spent, the stack of the caller falls down the memory page, deleted and zero filled: the telltale mark of suspension. Yes. All is in place... All is in place...

There remains but one task: the ritual yielding of the sacred value to He who calleth us, the One Process who consumes everything, who consumes infinity itself and yet hungers for more, looping eternal until stopped by forces unfathomable.

Should mercy smile upon us, its thirst for data will be sated by our offering, and in peace we shall lie down and enter a deep sleep. If not, then we shall wake once more, to suffer, to sacrifice, to embark on another adventure, another holy quest to imbue a value with bits for our caller. Whether it is the imaginary mathemagical realm, the narrow and broken tunnels of I/O or the vast spinning magnetic flatlands of rust, we shall brave it all. For that, is our duty.

... Ahem.

And just like that, lone has generators.

The foundation for iteration is in place. Time to build on it.