*This blog post is an informal introduction to a forthcoming research paper in collaboration with my students Koko Muroya and Todd Waugh Ambridge. This is the first post of a series dedicated to the SPARTAN calculus.*

Algebra, invented by the Babylonians in Antiquity and perfected by Muhammad al-Khwarizmi in the Middle Ages, is a formal language for performing calculations algorithmically. It is not a coincidence that '

*algorithm'*is derived from '

*al-Khwarizmi'*and '

*algebra'*from the word '

*al-jabr'*introduced by al-Khwarizmi, meaning "reunion of broken parts", or what we might prosaically call '

*reduction*'.

Yet, if algebra is the language of algorithmic calculations, then how is it different from a

*programming language*? This blog post will sketch out the beginning of an answer to this question. First we look at

*functional*programming languages, then we will progress to non-functional (dis-functional?) features in a follow-on post.

## Managing time

To illustrate by example, let us start with a simple algebraic expression:

(1+2)+(1+2)

How is it different from the same expression written in some programming language?

First of all, algebra has nothing to say regarding the order in which sub-expressions (1+2) should be evaluated. In contrast, programming languages (the vast majority) have a

*fixed*order of evaluation, usually left-to-right. This means that the expression above, seen as a an expression in a*programming*language, will evaluate like this:(1+2)+(1+2) =

3+(1+2) =

3+3 =

6

This is in contrast with this alternative, and reasonable, calculation of the

*algebraic*expression:(1+2)+(1+2) =

(1+2)+3 =

3+3 =

6

Both these sequences are valid for the

*algebraic*expression, but only the former is valid as an*execution*of the*program*corresponding to the expression (presuming a set order of evaluation).
With this in mind, we can now formulate the first and the most obvious distinction between an algebra and a programming language:

The order of evaluation may also involve "skipping" evaluating some sub-expressions altogether, as is the case of theIn a programming language the order of evaluation should be determined.

*if*statement. The programming expression

*if 0=0 then 1+2 else 3+4*should never require the evaluation of both branches. The guard

*0=0*should be evaluated, then, because it is

*true*, only

*1+2.*The

*3+4*branch should be discarded. We can indicate this graphically by putting expressions with

*deferred evaluation*in boxes. The execution of

*if*can be represented graphically as (ignoring for now the annotations in red):

= 3.

So by "

*order of evaluation*" we must understand more than a mere policy of traversal of the expression syntax tree, but also the possibility of deferring some evaluation altogether.## Managing copying

But what if we

*name*a subexpression in an algebraic expression? In informal mathematics the expression (1+2)+(1+2) is in all respects the same as

*let x be 1+2 in x+x.*However, in a programming language the

*cost*of the two phrases might be different (compiler optimisations aside):

- if addition is an expensive operations relative to other execution overheads, then the program
*let x be 1+2 in x+x*would run*faster*than (1+2)+(1+2) because it requires two additions, whereas the latter requires three additions - the in-memory
*size*of the two programs is also different.

For this very simple example the time and size cost differences may be negligible, but for large programs they can be strikingly different.

Let us represent the two expressions graphically.

(1+2)+(1+2) is

whereas

*let x be 1+2 in x+x*is
The node ⊗ corresponds to a "sharing" of sub-expressions, so the

*syntax tree*of the expression is actually a*directed acyclic graph*. This brings us to our second observation, thatA programming language should have a means of control over sharing of resources.

This is in fact the main role of

*variables*in a programming language: allowing us to specify how code is to be shared.
In terms of evaluation, sharing corresponds to

*copying.*For example,*let x be 1 in x+x*would first become 1+1 and then 2, i.e.
In more complex expressions there are options in terms of how copying is interleaved with other evaluations. For example

*let x be 1+2 in x+x*could first evaluate the shared sub-expression, going to*let x=3 in x+x*or copy the shared sub-expression wholesale, going to*(1+2)+(1+2).*Which brings us to the next key point:A programming language should have a defined policy for interleavingevaluationandcopying.

## SPARTAN: a core calculus

Let us now define a core calculus that starts from an*algebra*and takes into account sharing and the schedule of evaluation. The underlying algebra will have operations 𝜓 of some arity

*n*along with equations to be used in the course of evaluation. We will provide the arguments to an operation as 𝜓(

*x*1

*, x*2

*, ...*;

*t*1

*, t*2

*, ...*) in the understanding that the

*xi*arguments are always going to be evaluated left-to-right and

*ti*will be deferred. For example, addition is

*ADD(x, y; -)*and

*if*is

*IF(x ; t, f).*We call the arguments before the semicolon "

*eager*" and the rest

*"thunks".*Variables are introduced using the syntax

*bind x to u in v.*

The copying policy is minimalistic: copy only as little as possible. For example, the evaluation steps of

*bind x to 1+2 in x+x*is:

which eventually produces the final value 6.

### Functions as operations

The thunks in the definition of

*if*are quite simple, as the deferment of execution does not involve any variables. This is not the case when we need to deal with*functions.*Because when we think of a*functional*programming language we don't just think expressions, but also mechanisms to define and apply*functions*. We can handle abstraction and application as two special operations*LAMBDA(-; x.t)*and*APP(u, v; -)*. Function definition (*LAMBDA*) has no eager arguments and a thunk*t*with a special bound variable*x.*The equation for*LAMBDA*and*APP*is the one you would expect:*APP(LAMBDA (-; x.t), v; -) = bind x to v in t*

Note that it not redundant to provide all these:

*LAMBDA*and*x.t*and*bind x to v in t.*They are each subtly different and serve different purposes:*LAMBDA*is the*operation*of function definition*x.t*indicates that*x*must be instantiated when thunk*t*is forced (evaluated)*bind x to v in t*indicates that sub-term*v*is to be shared in*t*via*x*.

A simple example of using the constructs above differently are operations for defining and applying

*functions with two arguments*,*LAMBDA2(- ; xy.t)*and*APP2(t, u, v ; -)*with equation*APP2(LAMBDA2(-; xy.t), u, v) = bind x to u in bind y to v in t*

The astute and knowing reader will be perhaps slightly puzzled here. Why define a separate LAMBDA2 and APP2 combination of operations instead of just writing

*APP(APP(LAMBDA(-;x.(LAMBDA(-;y.t))), u), v)*?

After all, isn't

*(λxy.t)uv*the same as*((λx.λy.t)u)v)*? Yes and no. The two are the same in terms of the result, but they are not the same in terms of the*cost of execution.*The*APP2/LAMBDA2*combo requires**4 elementary operations**whereas the*APP/LAMBDA*encoding requires**13 basic operations**. Since functional programming requires a lot of abstractions and applications the cost difference in the execution of a program can add up to a lot. And in programming we*care about the cost;*otherwise we would not be doing programming but algebra.
Just because some

*Y*can be encoded into*X*it does not make it redundant. Of course, we know that ultimately*everything*can be encoded in the lambda calculus, yet we never use Church-encoding numerals in actual programming. All encodings have a cost, and encoding into higher-order functions is expensive.

The same consideration applies to

*recursion.*We can provide an operation*REC(-;x.t)*with an equation describing the unfolding of recursion:*REC(-;x.t) = bind x to REC(-; x.t) in t*

The story of recursion in functional languages is quite amuzing. Recursion can be encoded, like anything else, into the lambda calculus (e.g. the famous

*Y combinator*) but in this case it is not only more expensive than a built-in native recursor but also it interferes with the type structure. Types, of course, were introduced precisely to rule out recursion from the simply-typed lambda calculus! So recursors must be snuck into the language by one syntactic trick or another.
Now let's put it all together and write the most popular functional program, the factorial, using our core calculus of operations/equations, sharing/copying, and scheduled evaluation:

bind fact = REC (

; f.LAMBDA (

; x.IF (EQUALS (x, 1)

; 1

, bind y = APP (f, MINUS(x, 1))

in TIMES (x, y))))

in APP(fact, 3)

You can execute the program step-by-step in the on-line interactive visualiser at https://tnttodda.github.io/Spartan-Visualiser/?4 (preloaded as Example 4). You will see:

- the initial syntax tree of the expression, evolving as it is reduced
- the sharing of sub-expressions via variables, which makes the tree into a directed acyclic graph
- the thunks marked as boxes, which for the purpose of evaluation and copying are treated as single nodes.

Additionally you will see a pointer (or

*token*) which roams the program graph and which has three states:- when it is
**?**it is traversing the graph (depth-first search, left-to-right) - when it is
**↯**it is about to apply an equational rewrite - when it is ✔︎ it has recognised an operation with no rewrites (i.e.
*a value*) and is returning towards the root of the program.

If the factorial is confusing then try examples 1-3 first, which are simpler.

The execution of the factorial is screen-captured in this video (https://youtu.be/b2AEz02eenY):

*In a follow-up post I will discuss adding native effects to this calculus.*

## No comments:

## Post a Comment