I saw a video recently about function currying, and it triggered the realization that currying might solve a problem I’ve been pondering in the context of language parsing. The problem involves knowing how many arguments an operator expects, what’s called the arity of an operation or function. It can vary from zero to many.

But it occurred to me that, with currying, there could be a language where operations always take just one argument. And that would solve a challenge for a mathematical expression language I have in mind.

Let me start with exactly what currying is.

Generally speaking, functions and routines take multiple arguments, from zero to many. The number of arguments they take is called the *arity* of the function. Currying is a technique that allows all functions to have an arity of one.

We might not think of it, but common math operations, such as add, subtract, multiply, etc, are functions. Most of them, such as the ones just listed, have an arity of two.

For example, **a+b** can be expressed functionally as **add(a,b)**.

The add operation is a function with an arity of two. Likewise can have functions **sub(a,b)**, **mul(a,b)**, **div(a,b)**, and others.

We can curry these functions by doing something like this:

FUNCTIONaddPARAM a { FUNCTIONfPARAM b {RETURN (a+b)} RETURN f }

Which is to say we define a function, named ** add**, that takes a parameter, named

**. This function defines a new function, named**

*a***, with a parameter, named**

*f***. The function,**

*b***, returns the sum of**

*f***+**

*a***. The**

*b***function returns function**

*add***.**

*f*The source text uses the ** add** function as just shown above, but the compiler turns it into something like add(a)(b). That is, the

**function take a single parameter (**

*add***, in this case) and returns a function that takes a single parameter (**

*a***).**

*b*That returned function, which has ** a** in context, and which took

**as a parameter, does the actual adding and returns the sum.**

*b*This same protocol applies to **sub(a,b)**, **mul(a,b)**, etc. It applies to any function that takes two parameters. The first function takes the first parameter and returns a function that takes the second.

Which can return a function that takes a third parameter and so on, if necessary.

So that’s currying. Here’s an example in Python:

def Add (a): def f (b): return (a+b) return f r = Add(42)(21) f = Add(42) r1 = f(21) r2 = f(13) r3 = f(42) def Between (x): def f1 (a): def f2 (b): if x <= a: return False if b <= x: return False return True return f2 return f1 r = Between(35)(21)(42) f = Between(35) r1 = f(21)(42) r2 = f(42)(64) r3 = f(50)(10) #

One cute trick is that, having obtained the first function, you can apply it to different arguments to get different results. The first example adds **42** to **21**, **13**, and **42**.

**§**

So that’s currying functions, and, as you can see, it can allow a compiler to assume any function it finds takes one, *and only one*, parameter. All functions, by fiat, have an arity of one.

This does disallow a function like `today`

, which returns today’s date. There are other handy functions that don’t take parameters, they just return a useful value. Many languages have a `random`

function, for example.

A possible way around these is to redefine how they work. For example, we might have a `today(`

function that returns today’s date if count is zero, but returns a past or future date if count is non-zero (minus or plus, respectively).*count*)

It’s usually possible to redefine the functions to take some kind of parameter. (In many cases this makes them more useful. Who hasn’t wanted a `today`

function that works like that?)

If all else fails such a function might take a NULL or other placeholder value. (In my BOOL language, I used the semi-colon as a placeholder for a missing argument.)

**§ §**

Here’s my problem: When compiling source code, the compiler needs to know the arity of called functions in order to associate the right number of parameters with the call.

In many languages, this isn’t a problem:

print(a, b, c);

Obviously the parameters are `a`

, `b`

, and `c`

.

But there are languages where it is not so obvious:

$print $add a b c

If `$print`

and `$add`

are functions, how many parameters to assign to one and how many to the other? Are trying to print the sum of ** a**+

**+**

*b***, or do we want print both the sum of**

*c***+**

*a*

*b**and*the value of

**? (Or maybe we want to print the sum of**

*c***, plus the value of both**

*a***and**

*b***?)**

*c*Without knowing the arity of the the functions, the compiler doesn’t know how many arguments to associate with them.

The usual solution is that the compiler has to know the arity of any functions used in the code. Typically, this is because the compiler compiled those functions and has their signatures in a reference table. (Or because it imported those functions or read a definitions file. The point is, the compiler recognizes the function.)

But I’d like to consider a situation where the compiler doesn’t know and there is no syntax to bracket arguments.

Enter currying.

**§**

The basic idea is to free the compiler from caring about arity at all. Doing this requires deferring some behavior to run-time, so this technique depends on both the compiler and the run-time engine.

The compiler just needs to identify function names and arguments and compile them into tokens. It doesn’t pay any attention to binding arguments to functions. It does have to do some variable handling, assuming variables are implemented.

The target, for now, is a mathematical expression language, rather than a full-blown programming language, so a source text will mainly consist of a single expression to process.

That said, both variables and sub-routines are a possibility.

The run-time engine processes the list of token using the following algorithm:

Process_Token (Token): if Token.type == FUNCTION: Stack.push(Token.function) return Token if Token.type == PARAMETER: func = Stack.pop() new_tok = func(Token.argument) return Process_Token(new_tok) Error('Unknown Token Type') Process_Token_List (Tokens): for Token in Tokens: a = Process_Token(Token) return a

Note the recursive nature of `Process_Token()`

.

The idea is to take each token in turn. If the token is a function, it’s pushed onto the stack until a parameter is available. If the token is an argument, the most recently encountered function is popped off the stack and given the argument.

The value returned by that function is recursed to `Process_Token()`

, which repeats the process. Ultimately, the function returns value returned by the recursion.

When the final token is processed, the returned value is the result of the evaluation of the expression.

**§**

Here’s the actual Python code for the run-time engine:

def process_statement (expression): stack = [] # Recursive node processor... def process_node (tok): # If token is a function, push it on the stack... if isinstance(tok, function): stack.append(tok) return tok # If token is a parameter, apply it to a function... if isinstance(tok, argument): # (Unless there aren't any; then we're done)... if len(stack) == 0: return tok f = stack.pop() # Execute the function! new_tok = f(tok()) # And recurse on the result... return process_node(new_tok) # ERROR: Unknown node type... raise RuntimeError('Unknown nodetype!') # Execute the nodes of the statement... ret_val = argument('###ERR', None) for token in expression: ret_val = process_node(token) return ret_val #

This requires a few classes to implement the function and parameter objects:

class node (object): def __init__ (self, name): self.n = name def __str__ (self): return self.n class argument (node): def __init__ (self, name, value): super().__init__(name) self.v = value def __call__ (self): return self.v def __repr__ (self): return '{argument:{n:"%s", v:%s}}' % (self.n,self.v) class function (node): def __init__ (self, name, function): super().__init__(name) self.f = function def __call__ (self, arg): return self.f(arg) def __repr__ (self): return '{function:{n:"%s", f:%s}}' % (self.n,self.f) #

Note that we can call a function object to invoke its function. When we do, we pass the single parameter the function expects. We can also call the argument object to obtain its value.

We’ll need a run-time library of the actual functions the language handles. These are mostly mathematical functions, such as add, multiply, etc. Here I’ll just show you a handful (they all follow the same pattern):

def fSin (a): return argument('$sin', sin(radians(a))) def fCos (a): return argument('$cos', cos(radians(a))) def fAdd (a): def f (b): return argument('$add', a+b) return function('_Add', f) def fSub (a): def f (b): return argument('$sub', a-b) return function('_Sub', f) def fMul (a): def f (b): return argument('$mul', a*b) return function('_Mul', f) #

Note the difference between monad functions that take just one parameter (such as `cos`

and `sin`

) versus binary functions that take two. The former don’t require currying, but the latter do.

The leading “f” in the name is just to put them in their own little namespace.

**§**

The next thing is the compiler that turns source text into the list of tokens. That’s complicated enough for a separate post.

But we have the machinery to execute an expression on the run-time engine if we hand-compile the expression. Let’s do that with this expression:

(cos x * sin y) + ((a-b) / c)

Which we would express functionally as:

add(mul(cos(x), sin(y)), div(sub(a,b), c)))

If we remove the function parentheses and further assume we have no idea what the arity of the functions involved is, then we have an ambiguous expression (or so it would seem):

/add /mul /cos x /sin y /div /sub a b c

Here I’m using /*name* to indicate a function. Anything else is a parameter.

Without knowing that `/cos`

and `/sin`

take one parameter, while the rest all take two, it’s impossible to evaluate the expression (or so it would seem).

**§**

But currying saves the day.

The sequence of events, token by token, goes like this:

- Token
**/add**is a function, so push it on the stack. - Token
**/mul**is a function, so push it on the stack. - Token
**/cos**is a function, so push it on the stack. - Token
**x**is a parameter, so pop function /cos and invoke it with the parameter. - Function /cos returns a
**parameter**, so pop and invoke /mul with it. - Function /mul returns a
**function**, so push it on the stack. - Token
**/sin**is a function, so push it on the stack. - Token
**y**is a parameter, so pop and invoke /sin with parameter. - Function /sin returns a
**parameter**, so pop and invoke /mul2. - Function /mul2 returns a
**parameter**, so pop and invoke /add. - Function /add returns a
**function**, so push it on the stack. - Token
**/div**is a function, so push it on the stack. - Token
**/sub**is a function, so push it on the stack. - Token
**a**is a parameter, so pop and invoke /sub. - Function /sub returns a
**function**, so push it on the stack. - Token
**b**is a parameter, so pop and invoke /sub2. - Function /sub2 returns a
**parameter**, so pop and invoke /div. - Function /div returns a
**function**, so push it on the stack. - Token
**c**is a parameter, so pop and invoke /div2. - Function /div2 returns a
**parameter**, so pop and invoke /add2. - Function /add2 returns a
**parameter**; stack is empty so return it.

So it all works out rather nicely.

∅

Wyrd Smythe

said:Note currying doesn’t solve the problem of functions that take variable numbers of parameters, such as how $print works in many contexts.

As noted in the text, it’s impossible to tell

`⟨$print A⟩`

from`⟨$print A B C D⟩`

unless the language uses some form of syntax to determine what ends the list. The most common technique is some form of bracketing:`print(A)`

versus`print(A B C D)`

`(print: A)`

versus`(print: A B C D)`

Less common is an ‘end of list’ marker, although sometimes punctuation is used:

`$print A;`

versus`$print A B C D;`

In a few cases, the end of the text line signifies the end of the list.

Another possibility is that the function always takes a single parameter, but that parameter can be a single item or list. (If necessarily always a list, a list with a single item.)

`print(A, B, C, D)`

`$print A,B,C,D`

`(print: (list: A B C D))`

As the first example shows, many languages that use that syntax have the notion of a list of input arguments built in. The second example uses commas to join objects in an implicit list.

The third example assumes the print: function takes only one parameter, but that parameter can be a list.

With regard to currying a function with a variable argument list, about the only technique possible uses an ‘end-of-list’ marker of some kind. For example, the $print function returns a function that takes a parameter and returns a function. That function does the same so long as the parameter isn’t an EOL marker. A function that does take an EOL marker as its parameter does not return a function, it returns the final result.