How do I unroll (compile) an interpreter loop? How do I unroll (compile) an interpreter loop? c c

How do I unroll (compile) an interpreter loop?


"Unrolling a loop" normally means replacing a repetition with a sequence of actions. The loop:

for (int i = 0; i < 4; ++i) {    a[i] = b[i] + c[i];}

would unroll into the equivalent:

a[0] = b[0] + c[0];a[1] = b[1] + c[1];a[2] = b[2] + c[2];a[3] = b[3] + c[3];

It appears to me that whoever was being quoted by Wikipedia was using the phrase in a somewhat metaphorical sense. So, in that sense...

Your sample would normally be invoked inside a interpreter that is walking a tree of AST nodes, which might look something like this:

 ASSIGN    | +--+---+ |      |REF   MINUS |      | x   +--+---+     |      |    VAR    PLUS     |      |     a   +--+--+         |     |        VAR  CONST         |     |         b     3

and the interpret function would have additional options:

int interpret(node) {    switch(node) {        case PLUS:             return interpret(child(0))+interpret(child(1));        case MINUS:             return interpret(child(0))-interpret(child(1));               case ASSIGN:             return set(child(0), interpret(child(1));        case VAR:             return fetch(child(0));        case CONST:             return value(child(0));        ...    }}

If you walk the AST with that interpet function (actually performing the operations), you're interpreting. But if the function records the actions to be performed, rather than executing them, you're compiling. In pseudocode (actually, pseudocode twice, as I'm assuming a hypothetical stack machine as the compilation target):

string compile(node) {    switch(node) {        case PLUS:             return(compile(child(0))) + compile(child(1)) + ADD);        case MINUS:             return(compile(child(0))) + compile(child(1)) + SUB);        case ASSIGN:             return(PUSHA(child(0))) + compile(child(1)) + STORE);        case REF:             return(PUSHA(child(0)));        case VAR:             return(PUSHA(child(0)) + FETCH);        case CONST:             return(PUSHLIT + value(child(0)));        ...    }}

Invoking compile on that AST (ignoring any pseudocode typos ;-) would spit out something like:

PUSHA xPUSHA aFETCHPUSHA bFETCHPUSHLIT 3ADD SUBSTORE

FWIW, I'd tend to think of that as unrolling the AST, rather than unrolling the interpreter, but won't criticize somebody else's metaphor without reading it in context.


I'm slightly confused. I don't think 'unrolling the loop' is the right term here. Even if you refactor the code to not have any recursive calls, you will still be using an interpreter.

You can compile this program with GCC. Then you will have a compiled program, albeit the compiled program will be interpreting the AST.

One way to turn this into a compiler would be, instead of doing return interpret(child(0))+interpret(child(1));, you would generate assembly instructions which would do the addition instead, and then output those to a file.


You don't really have a loop here since not all the calls to interpret are tail calls.

The compiler closest to yours, assuming a stack model...

int compile(node){    switch(node) {        case PLUS:             return compile(child(0))&&compile(child(1))&&compile_op(op_plus);        case MINUS:             return compile(child(0))&&interpret(child(1))&&compile_op(op_minus);           }}

but I think unrolling in this context is more applicable to a byte code interpreter rather than an AST interpreter. The bytecode instructions are typically interpreted in a loop. Then the "unrolling" technique is to emit the code corresponding to each bytecode instruction.

Factor is similar to FORTH. Typically FORTH has an outer interpreter that generates threaded code. The threaded code can be envisioned an array of function pointers (there are several variants, direct threaded, indirect threaded, subroutine threaded, etc.). The threaded code is executed by an inner interpreter. Unrolling the interpreter in this case refers to the inner interpreter, and is a matter of concatenating the threaded code.