I’ll pick up with the language I began describing last time in a future post. Right now I want to pick up the thread of Little Programming Languages (LPLs) and use several examples to illustrate what underlies a programming language. (And as it turns out, these are “little” only in a certain sense.)
This first example, LPL-3, is Lisp-like and, because of that, is fairly orthogonal. Even better, it’s probably actually usable, although — like Lisp — it doesn’t have the cleanest syntax (languages like Python have really spoiled me).
A key goal when I make up (I hate to dignify it with “design”) an LPL is the single-syntax model. I like a language with minimal — one, if possible — syntactical construct. For example, the language in the previous post used “function signature” style format:
Our current language, LPL-3, uses a Lisp style format:
A programming language requires three things. How it implements those things is a big part of what defines a programming language, what makes it what it is.
Little Programming Language — 3
The first requirement is the ability to save state — in a word: variables. To be a programming language, there must be some kind of variable. In LPL-3, the
var keyword indicates a variable:
We need to know where the data is to access it. The
label keyword provides a reference location for any enclosed object:
(label: name object)
So to label a variable (so it can be referenced later):
(label: name (var: data))
(label: digits (var: 0 1 2 3 4 5 6 7 8 9))
We presume the system identifies the data type (array of int) for us, but we can add a notation for data type as an “option” of the
(label: Ultimate-Answer (var/int: 42)) (label: digits (var/array/int: 0 1 2 3 4 5 6 7 8 9)) (label: test-string (var/string: "Hello, world!")) (label: Halloween! (var/date: 10/31/2015))
As convenient syntactical sugar, we can allow a parser to recognize the type option keywords as implying the
var keyword and therefore allowing it be optional:
(label: Ultimate-Answer (int: 42)) (label: digits (array/int: 0 1 2 3 4 5 6 7 8 9)) (label: test-string (string: "Hello, world!")) (label: Halloween! (date: 10/31/2015))
Note that the
array type option is optional in this case because the data itself is an array. More importantly, note that array-ness is a type option for any variable! Being an array is just a property a variable can have.
In fact, both the
array and the
int type options are just that: options. But without using at least one of them, the
var keyword is required.
As long as we’re on the topic of syntactical sugar, the need to wrap a
var object in a
label object is cumbersome. But without a label, the data (generally) can’t be accessed! Let’s invent a syntax form that allows us to insert an implied
label wrapper into any
(int[Ultimate-Answer]: 42) (array[digits]: 0 1 2 3 4 5 6 7 8 9) (string[test-string]: "Hello, world!") (date[Halloween!]: 10/31/2015)
Note that the nature of this language allows names (labels) to contain characters not usually allowed by most languages. Spaces (and control characters) are prohibited, of course, and names may not start with punctuation.
And that we now have a fairly decent way to express typed variables! If we allow the language to define new types, presumably we’ll allow those new types to be used as type options.
The second requirement for a programming language (state was a big one) is selection. At the least, we’re talking some kind of
if construct here (and
if is all that’s absolutely required (XSLT gets along fine with just that)).
But most languages implement an
if-else, if not an
if-elseif-else (although the exact wording varies). More sophisticated selection constructs include the
case statements along with more exotic conditionals.
In LPL-3 we have the
(if: expression (then-items) (else-items) )
If the expression evaluates true, The list of then-items is evaluated otherwise the else-items list is.
We can also define a switch construct:
(select: expression (case: expression-1 case-items) (case: expression-2 case-items) (else: else-items) )
This works in the obvious way by evaluating expression and then seeking a match in the
Speaking of expressions, we’ll use expression term objects:
Math: (+ A B) (- A B) (^ A B) (% A B) … Relational: (< A B) (= A B) (> A B) … Logical: (! A) (& A B) (| A B) … …etc…
LPL-3 supports all the usual math operators and any others that strike our fancy (I rather like an “is-it-Thursday” operator, because I can never get the hang of Thursdays).
Note the lack of colon on an expression object. In LPL-3, the colon denotes a language keyword. Without it, the parenthesis just enclose a list of items. (Expressions are treated by the expression evaluator as prefix expression lists.)
The third requirement for a programming language is either recursion or iteration. Many, if not most, programming languages provide both.
Iteration (loops) is pretty obvious considering what we’ve done already. We just add
(while: expression (loop-items)) (until: expression (loop-items))
Pretty simple! The loop-items are repeated depending on the result of evaluating expression.
The for loop is one of the more idiomatic language constructs — one of those places where different languages vary a lot. LPL-3 implements a list iteration version:
(for: name (list-items) (loop-items) )
Recursion requires callable code units — functions. Implementing them is non-trivial, but without them a language isn’t small, it’s tiny. But note that functions are not absolutely required for a programming language.
In LPL-3, the syntactic sugar form for defining a function is:
(The sugar-free form wraps a
label object around the
function object, same as with the
var objects.) The
out objects are optional (which is why they require keywords). Their order among the other
function items is not significant.
To invoke a function, prefix the name with the at-sign (@) and use the result as a keyword in an object. Any items in the object are inputs to the function:
Since all programming languages require most of what LPL-3 has, the question is what makes LPL-3 (or any LPL) small? It’s everything else the language supports from this point on. Not just what is supported, but how well.
For instance, so far LPL-3 is a toy because it has no input or output ability. Let’s define a simple output object:
That’ll just print the items in the list to the default output (screen or printer).
Input is another area where languages vary a lot. I/O is tricky in general, because it’s so dependent on the underlying system. Not just the O/S, but the user’s particular configuration. To top it off, the number of different input and output devices is truly astonishing (and new ones come along constantly).
So for now LPL-3 shall remain input-less.
Another area rich in diversity among languages is how they handle strings and how rich a set of operations they support on strings. We’ll just say that LPL-3 supports the same set of string operations as Python does. (Likewise with arrays.)
One of the more interesting areas of programming languages is whether they allow user-defined structures of existing data types (variant aggregates), and whether they allow user-defined data types. Those are advanced topics space requires leaving for another time.
For now, let’s close with an example bit of LPL-3 code. Here’s the classic Fibonacci routine:
(function[Fib]: (in:nbr) (out:(var:fib)) (if: (< nbr 2) ((set: fib n)) ()) (set: fib (+ (@Fib (- nbr 1)) (@Fib (- nbr 2)))))
If that looks mysterious, here’s a Python version of the same algorithm for comparison:
def Fib (n): if n < 2: return n return Fib(n-1) + Fib(n-2)
Which, I know, is the simple, naive multiple recursion version, but it suffices as a quick example that uses most of the constructs mentioned.
LPL-3 is fairly normal (unsurprising given its Lisp-y inspiration), even usable. Next time I’ll show you LPL-2 and LPL-1, which were deliberate attempts to find a weird language form.