Xerl: intermediate module

2013 25 Mar

Today we will start the work on the intermediate module that will be used to run the code for the expressions found in our file's body, replacing our interpreter.

This is what we want to have when all the work is done:

xerl -> tokens -> AST -> intermediate -> cerl

Today we will perform this work only on the atomic integer expression however, so we will not build any module at the end. We have a few more things to take care of before getting there. This does mean that we completely break compilation of modules though, so hopefully we can resolve that soon.

This intermediate representation is in the form of a module which contains a single function: run/0. This function contains all the expressions from our Xerl source file.

In the case of a Xerl source file only containing the integer 42, we will obtain the following module ready to be executed:

-module('$xerl_intermediate').
-export([run/0]).

run() ->
    42.

Running it will of course give us a result of 42, the same we had when interpreting expressions.

The resulting Core Erlang code looks like this:

module '$xerl_intermediate' ['run'/0]
    attributes []
'run'/0 =
    fun () ->
        42
end

The nice thing about doing it like this is that other than the definition of the intermediate module and its run/0 function, we can use the same code we are using for generating the final Beam file. It may also be faster than interpreting if you have complex modules.

Of course this here only works for the simplest cases, as you cannot declare a module or a function inside another Erlang function. We will need to wrap these into function calls to the Xerl compiler that will take care of compiling them, making them available for any subsequent expression. We will also need to pass the environment to the run function to keep track of all this.

This does mean that we will have different code for compiling fun and mod expressions when creating the intermediate module. But the many other expressions don't need any special care.

Right now we've used the '$xerl_intermediate' atom for the intermediate module name because we only have one, but we will need to have a more random name later on when we'll implement modules this way.

The attentive mind will know by now that when compiling a Xerl file containing one module, we will need to compile two intermediate modules: one for the file body, and one for the module's body. Worry not though, if we only detect mod instructions in the file body, we can just skip this phase.

While we're at it, we'll modify our code generator to handle lists of expressions, which didn't actually work with integer literals before.

We're going to use Core Erlang sequences for running the many expressions. Sequences work like let, except no value is actually bound. Perfect for our case, since we don't support binding values at this time anyway.

Sequences have an argument and a body, both being Core Erlang expressions. The simplest way to have many expressions is to use a simple expression for the argument and a sequence for the rest of the expressions. When we encounter the last expression in the list, we do not create a sequence.

The result is this very simple function:

comp_body([Expr]) ->
    expr(Expr);
comp_body([Expr|Exprs]) ->
    Arg = expr(Expr),
    Body = comp_body(Exprs),
    cerl:c_seq(Arg, Body).

In the case of our example above, a sequence will not be created, we only have one expression. If we were to have 42, 43, 44 in our Xerl source file, we would have a result equivalent to the following before optimization:

-module('$xerl_intermediate').
-export([run/0]).

run() ->
    42,
    43,
    44.

And the result is of course 44.

The resulting Core Erlang code looks like this:

module '$xerl_intermediate' ['run'/0]
    attributes []
'run'/0 =
    fun () ->
        do  42
            do  43
                44
end

Feels very lisp-y, right? Yep.