Mindshift: Part 13

Using Functions to Create a Simple List

We have now covered two keys areas in fuctional programming:

  1. The ability to use a PAIR function to form a pair of values
  2. The ability to call a function recursively

We can now use these tools to construct and process simple linked lists. The type of list we will create here is a known as a "singly" or "one-way" linked list. This type of list is one that can only support forward traversal through its items. I.E. once you have moved from item one to item two in the list, you lose the ability to traverse back to any previous item.

However, for all its limitations, we can use this construct to implement more familiar data structures such as character strings and arrays.

In order to perform this implementation however, we must adopt some simple conventions.

  1. Each item in the list will be composed of a pair of values
  2. The first element of any given pair will hold that list item's value
  3. The second element of any given pair will hold another pair that represents the rest of the list
  4. We will arbitrarily create a list terminator identified by the fact that both items in the pair are set to JavaScript null

This results in our list actually being implemented as a tree in which the left branch only ever holds a leaf node containing the data value, and the right branch holds either the remainder of the list, or the list terminator.

So knowing that all items in a list are just pairs of values, we can access each item using only two functions. The names of these accessor functions vary from language to language, but here we will use the names HEAD and TAIL. Some alternative names for the first item of a pair are FIRST, LEFT or CAR, and for the second item of the pair, SECOND, REST, RIGHT or CDR.

ASIDE

One thing that I find irritating about the way lists in functional programming languages are "explained" is that the LISP function names CAR and CDR are often used with the assumption that people already know exactly what these functions do; therefore, no further explanation is needed. Then to add further insult to this injury, you are usually made to feel stupid if you need to ask what these terms mean.

The intellectual smugness caused by this high-mindedness is usually based on the false assumption that being uninformed somehow makes you unintelligent (and therefore worthy of derision?) But since this reasoning is completely false, I see no point in using unexplained, non-intuitive function names such as CAR and CDR when easy-to-understand names such as HEAD and TAIL will do perfectly well.

However, in order to remove any possibility for intellectual smugness, let me talk to you as an intelligent (but possibly uninformed) person and explain what the terms CAR and CDR actually mean.

In 1954, IBM launched the "Type 704 Electronic Data-Processing Machine" which at the time, was the only computer capable of handling complex mathematical calculations. Therefore, it was for this platform that two of the first high level programming languages were developed - FORTRAN in 1957 and LISP in 1958.

The IBM 704 used two types of instruction, A and B. Type A instructions were 36 bits long and were composed of a 3-bit prefix (instruction code), a 15-bit decrement field, a 3-bit tag field, and a 15-bit address field.

The term CAR was the IBM 704 assembly instruction for accessing the "Contents of the Address part of the Register". Similarly, CDR is the assembly macro for accessing the "Contents of the Decrement part of the Register".

LISP used the address and decrement parts of the register to point respectively to the current and next elements of something called a CONS cell (I.E. one element in a singly linked list that here we identify as a PAIR).

The names of the assembly instructions CAR and CDR were then adopted directly into LISP.

In this blog, I will use the function names HEAD and TAIL instead of CAR and CDR. Once you get used to these names, you will immediately understand their use in other functional programming languages such as Erlang.

The name TAIL is being used because the second item in our PAIR function will contain all items in the list except the current one. You could also use the name REST as an intuitive synonym for TAIL.

Making a list, but not checking it twice

Unlike Old Saint Nick, when we make a list, we will make it in such a way that we know it will be right; therefore, we won't need to check it. So, to make a list, we will need a PAIR function, and to access the data in the list, at least a HEAD and a TAIL function.

We must also create functions that can create and then identify the terminator item.

Now when we create a list, we will start with an EMPTY, or terminator item and then add items by nesting the previous list item into each subsequent pair as the second item.

Well, OK. But there are two problems that make our current list implementation particularly user unfriendly:

  1. Its pretty tedious to have to add characters one at a time
  2. The character order of the string held in the list has been reversed

So, we'd better fix that.

There and Back Again: A Hobbit's (Functional Programming) Tale

So let's write a pair of functions that can transform a basic JavaScript datatype such as a character string or an array into a list and back again.

Here, I will jump straight to the finished functions rather than show their derivation.

Since the functions that convert abstract lists into native JavaScript data types use recursion (via the Y-Combinator), we need a function that will prevent runaway recursion.

Therefore, in addition to the is_empty function, we need another function called is_last. is_last takes the current list item and passes the second element to is_empty. Function is_last returns whatever value is returned from is_empty.

In keeping with our established naming convention, the function names is_empty and is_last use all lowercase letters because they return JavaScript native datatypes. In the next blog, there will be a second version of these functions called IS_LAST and IS_EMPTY in all uppercase letters. The use of all caps function names denotes that they operate on, and return abstract functions instead of JavaScript native data types.

Strings to Lists and Back Again

Arrays to Lists and Back Again

Now, let's create a pair of functions that transform a list into a JavaScript array and back again.

The actual implementation of this particular function requires that we overcome a quirk in the JavaScript language concerning the value returned by certain array functions such as push or unshift.

We will need to use the JavaScript function Array.prototype.push to insert the current list item as a new element into the array. However, the function Array.prototype.push does not return the newly modified array; instead, it returns the index of the newly inserted item - which is useful to neither man nor beast...

So here, we must create our own version of the Array.prototype.push function that returns the modified array instead of the index value.

This can be done simply enough by realising that JavaScript arrow functions do not create their own execution contexts; they execute entirely in the context of the calling function. Therefore, if we pass the result of a normal Array.prototype.push function as a parameter to an inner function, that function can ignore its parameter value (the index value of the newly inserted item) and return the modified array.

We can now use our own version of the push function to construct an array.

So there we have a basic implementation of a list. We can now take native JavaScript data types such as string or array and convert them into lists, and we can then convert these lists back into the native JavaScript data types.

In the next blog, we'll look at how we can implement the classic map function on a list.

Chris W


All source code from the Mindshift blog series can be found on GitHub