# Understanding Evlis Tail Recursion

## Fred Akalin

While reading
about proper
tail recursion in Scheme, I encountered a similar but obscure
optimization called *evlis tail recursion*.
In the
paper where it was first described, the author claims it
"dramatically improve the space performance of many programs," which
sounded promising.

However, the few places where its mentioned don't do much more than state its definition and claim its usefulness. Hopefully I can provide a more detailed analysis here.

Consider the straightforward factorial implementation in
Scheme:^{[1]}

`(define (fact n) (if (<= n 1) 1 (* n (fact (- n 1)))))`

It is not tail-recursive, since the recursive call is nested in
another procedure call. However, it's *almost* tail-recursive;
the call to `*`

is a tail call, and the recursive call is
its last subexpression, so it will be the last subexpression to be
evaluated.

Recall what happens when a procedure call (represented as a list of
subexpressions) is evaluated: each subexpression is evaluated, and the
first result (the procedure) is passed the other results as
arguments.^{[2]}

Evlis tail recursion can be described as follows: when performing a
procedure call and during the evaluation of the last subexpression,
the calling environment is discarded as soon as it is not
required.^{[3]} The distinction
between evlis tail recursion and proper tail recursion is subtle.
Proper tail recursion requires only that the calling environment be
discarded before the actual procedure call; evlis tail recursion
discards the calling environment even sooner, if possible.

An example will help to clarify things. Given `fact`

as
defined above, say you evaluate `(fact 10)`

and you're in
the procedure call with `n = 5`

. The call stack of a
properly tail-recursive interpreter would look like this:

evalExpr -------- env = { n: 10 } -> <top-level environment> expr = '(* n (fact (- n 1)))' proc = <native function: *> args = [10, <pending evalExpr('(fact (- n 1))', env)>] evalExpr -------- env = { n: 9 } -> <top-level environment> expr = '(* n (fact (- n 1)))' proc = <native function: *> args = [9, <pending evalExpr('(fact (- n 1))', env)>] ... evalExpr -------- env = { n: 6 } -> <top-level environment> expr = '(* n (fact (- n 1)))' proc = <native function: *> args = [6, <pending evalExpr('(fact (- n 1))', env)>] evalExpr -------- env = { n: 5 } -> <top-level environment> expr = '(if ...)'

whereas the call stack of an evlis tail-recursive interpreter would look like this:

evalExpr -------- env = { n: 5 } -> <top-level environment> pendingProcedureCalls = [ [<native function: *>, 10], [<native function: *>, 9], ... [<native function: *>, 6] ] expr = (if ...)

In this implementation, the last subexpression of a procedure call is evaluated exactly like a tail expression, but the procedure call and non-last subexpressions are pushed onto a stack. Whenever an expression is reduced to a simple one and the stack is non-empty, a pending procedure call with its other args are popped off, and it is called with the reduced expression as the final argument.

Note that this didn't change the asymptotic behavior of the procedure; it still takes \(\Theta(n)\) memory to evaluate. However, only the bare minimum of information is saved: the list of pending functions and their arguments. Other auxiliary variables, and crucially the nested calling environments, aren't preserved, leading to a significant constant-factor reduction in memory.

This raises the question: Are there cases where evlis tail
recursion leads to better asymptotic behavior? In fact, yes; consider
the following (contrived) implementation of
factorial^{[4]}:

```
(define (fact2 n)
(define v (make-vector n))
(* (n (fact2 (- n 1)))))
```

Before the main body of the function, a vector of size \(n\) is
defined. This means that the environments in the call stack of a
properly tail-recursive interpreter would look like this:^{[5]}

env = { n: 10, v: <vector of size 10> } -> <top-level environment> env = { n: 9, v: <vector of size 9> } -> <top-level environment> env = { n: 8, v: <vector of size 8> } -> <top-level environment> env = { n: 7, v: <vector of size 7> } -> <top-level environment> ...

whereas the an evlis tail-recursive interpreter would keep around
only the current environment. Therefore, the properly tail-recursive
interpreter would require \(\Theta(n^2)\) memory to
evaluate `(fact2 n)`

while the evlis tail-recursive
interpreter would require only \(\Theta(n)\)

Studying examples like the one above enabled me to finally understand how evlin tail recursion worked and what sort of savings it gives. However, I have yet to find a practical example where evlis tail recursion gives the same sort of asymptotic gains as described above, and I'd be interested to receive some. But perhaps the "large gains" mentioned in the various papers describing it are only constant-factor reductions in memory.

In any case, another important difference in Scheme between proper
tail recursion and evlis tail recursion is that the former is
a *language feature* and the latter is
an *optimization*. That means that it is acceptable and even
encouraged to write Scheme programs that take advantage of proper tail
recursion, but it would be unwise to rely on evlis tail recursion for
the asymptotic performance of your function. Instead, one should
treat it just as a nice constant-factor speed gain.

Note that it is easy to make evlis tail recursion "smarter." Since
Scheme doesn't specify the order of argument evaluation, an
interpreter could evaluate arguments to maximize the gains from evlis
tail recursion. As an easy example, if we had switched the arguments
to `+`

in `fact`

above, making it
non-evlis-tail-recursive, a smart compiler could still treat it as
such. A possible rule of thumb would be to pick a non-trivial
function call to evaluate last.

To complete the picture, I will outline below the evaluation function for a simple evlis tail-recursive Scheme interpreter in Javascript. All of the sources I've found describe it in terms of compilers, so I think it'll be useful to have a reference implementation for an interpreter.

Let's say we already have a properly tail-recursive
interpreter:^{[6]}

```
function evalExpr(expr, env) {
// Fake tail calls with a while loop and continue.
while (true) {
// Symbols, constants, quoted expressions, and lambdas.
if (isSimpleExpr(expr)) {
// The only exit point.
return evalSimpleExpr(expr, env);
}
// (if test conseq alt)
if (isSpecialForm(expr, 'if')) {
expr = evalExpr(expr[1], env) ? expr[2] : expr[3];
continue;
}
// (set! var expr)
if (isSpecialForm(expr, 'set!')) {
env.set(expr[1], evalExpr(expr[2], env));
expr = null;
continue;
}
// (define var expr?)
if (isSpecialForm(expr, 'define')) {
env.define(expr[1], evalExpr(expr[2], env));
expr = null;
continue;
}
// (begin expr*)
if (isSpecialForm(expr, 'begin')) {
if (expr.length == 1) {
expr = null;
continue;
}
// Evaluate all but the last subexpression.
for (var i = 1; i < expr.length - 1; ++i) {
evalExpr(expr[i], env);
}
expr = expr[expr.length - 1];
continue;
}
// (proc expr*)
var proc = evalExpr(expr.shift(), env);
var args = expr.map(function(subExpr) { return evalExpr(subExpr, env); });
// proc.run() returns its body in result.expr and the environment
// in which to evaluate it (with all its arguments bound) in
// result.env.
var result = proc.run(args);
expr = result.expr;
// The only time when env is changed.
env = result.env;
continue;
}
}
```

Then implementing evlis tail recursion requires only a few changes:

```
function evalExpr(expr, env) {
// This is a stack of procedures and their non-final arguments that
// are waiting for their final argument to be evaluated.
var pendingProcedureCalls = [];
while (true) {
if (isSimpleExpr(expr)) {
expr = evalSimpleExpr(expr, env);
// Discard calling environment.
env = null;
if (pendingProcedureCalls.length == 0) {
// No pending procedure calls, so we're done (the only exit
// point).
return expr;
}
var args = pendingProcedureCalls.pop();
var proc = args.shift();
args.push(expr);
var result = proc.run(args);
expr = result.expr;
// Change to new environment (the only time when env is
// changed).
env = result.env;
continue;
}
...
// Everything else remains the same.
...
// (proc expr*)
var nonFinalSubExprs =
exprs.slice(0, -1).map(
function(subExpr) { return evalExpr(subExpr, env); });
pendingProcecureCalls.push(nonFinalSubExprs);
// Evaluate the last subexpression as a tail call.
expr = expr[expr.length - 1];
continue;
}
}
```

[1] Assume a left-to-right evaluation order for now. ↩

[2] The function that takes a list of expressions, evaluates them,
and returns the results as a list is traditionally
called `evlis`

, hence the name of the optimization.
↩

[3] This assumes that the calling environment isn't stored somewhere else. ↩

[4] This was adapted from an example in Proper Tail Recursion and Space Efficiency. ↩

[5] Assume that the interpreter isn't smart enough to deduce that \(v\) can be optimized out since it's never used. ↩