Introduction to Functional Programming with Clojure

Most student programmers do not get introduced to functional programming beyond some extremely academic course using Standard ML that does not really showcase it as a practical programming style for real projects. It is rare that someone learns Haskell or OCaml or Lisp as their introduction to programming as opposed to something like C or Java, despite the fact that it is no more complicated and in my opinion a more intuitive way to think about computation in general. I hope to demonstrate that in reality, functional programming can be a great practical choice for almost any project and is absolutely worth the time it takes to learn and transition from more popular languages.

Why Clojure?

I am using Clojure to showcase these concepts because it was designed from the ground up to be practical above all else, and accomplishes that practicality by embracing functional programming. Furthermore, the language is very “opinionated,” in that all of its design decisions are core to the language, and writing in a non-idiomatic style is heavily discouraged throughout the design. You can look at Clojure’s official rationale for all of the details, but we will be exploring various aspects of it throughout this article.

Another benefit of Clojure its Lisp syntax. Although totally alien to some, it is absolutely worth the effort to learn and get used to. Lisp is a family of languages originating in the 1960s that all have similar prefix notation with heavy use of parenthesis, as you will see. The beauty of Lisp is that its code is just a data structure that the language can interpret natively, also known as being homoiconic. This means that the fundamental syntax in Lisp is extremely bare bones, and can be expanded by libraries and code. It’s a jarring transition, but with a good editor and some time, it becomes hard to go back. As usual, Xkcd puts it best:

Basic Syntax and Functions

The goal of this section isn’t to teach enough Clojure to actually use it, there are plenty of tutorials for that already, but to introduce enough to understand the core syntax and any examples that follow. This is by no means complete or absolutely thorough, but it is surprisingly close to being complete which is the beauty of Lisp. We will not touch at all on Java or the JVM though, which is a very useful aspect of Clojure but not relevant to this discussion. It may seem like a lot of information, but if you keep in mind a comparison to other more popular languages, you will see that the syntax is practically tiny, and the standard library is a fraction of the size while being extremely expressive, if a bit difficult to grasp at first.

The first thing to understand before going any further is data in Clojure, much like in many other functional languages, is immutable. This means that you cannot write something like a for loop which must inherently mutate an index variable. Immutability means you no longer think of code as a series of instructions that change various values, and instead, think of code as a series of functions applied on functions, i.e. a series of transformations on data. These transformations can of course have side-effects, like printing to a terminal, because otherwise your program wouldn’t do anything, but functions with side-effects tend to be kept isolated where possible since it makes code much easier to understand and work with.

The fundamental action of any functional programming is, unsurprisingly, function application. In most programming languages, this is done with parenthesis following the function name, e.g. add(1,3) might return 4. Alternatively, it is common to use infix notation for binary functions, e.g. 1+3 will return 4 or 2 == 3 will return false. Infix usually has implicit operator precedence, so 1+2*3 is 7 whereas (1+2)*3 is 9. In Lisp, all of that is thrown away, and instead function application is done in the form of a list, also called an s-expression when used for function applications or nested data, where the first element is the function and the following elements are its arguments. So the previous examples are, in order: (+ 1 3), (= 2 3), (+ 1 (* 2 3)), and (* (+ 1 2) 3). In exchange for absolutely no ambiguity in function application (among other benefits we will see later), you get lots of parenthesis, with function definitions sometimes ending in 10 or more closing brackets, but that is easy enough to ignore.

Now that 70% of the syntax is out of the way, we can look at the rest. First of all comments use the semicolon symbol. The parenthesis structure represents a linked list. In order to prevent the list from being treated as function application, you can either add a quote, or use the list function: '(1 2 3) or (list 1 2 3). Alternatively if a quote is too much syntax the function quote can be used instead: (quote (1 2 3)). The difference between these two may not be clear at first, but the key is that the quote form prevents an s-expression from being treated as code, whereas list is just a function that takes values as an argument. These are generally not used unless trying to write code to modify other code, i.e. create your own syntax using macros. We will look at some examples of standard library macros shortly.

Instead of using lists to store sequential data, it is more common to use vectors (which are akin to ArrayLists in Java) as follows: [1 2 3]. Some examples of basic operations you would expect from a sequential data structure:

(first [1 2 3]) ;==> 1 
(first ["a" 2 "c"]) ;==> "a"
(second [1 2 3]);==> 2
(pop [1 2 3]) ;==> [2 3]
(nth [1 2 3] 2) ;==> 3
(conj [1 2 3] 4) ;==> [1 2 3 4] 
(rest [1 2 3]) ;==> [2 3]

Note that in all cases commas to separate items are optional and normally excluded except in the case of long multi-line structures. The two other core data structures left in Clojure are the map (dictionary), and set. Maps are used to store key value pairs that can efficiently get a value given its key, and sets are a way to store a collection of values that are all unique. Both sets and maps can be stored in different ways internally, which can be controlled, but for our purposes we will not care about the difference between a hash set and sorted set (for example).

#{"a" "b" "c" 3} ; A set with 4 elements
{"a" 1 "b" 2} ; A map with 2 key value pairs
(get {"a" 1 "b" 2} "b") ;==> 2
(contains? #{1 2 3} 3) ;==> true
;; Maps and sets can both be used as functions as a shorthand
({"a" 1 "b" 2} "b") ;==> 2 
(#{1 2 3} 3) ;==> true

Clojure also includes all of the base data types you would expect, and a few more. There are strings, characters (denoted with a backslash [\a \b \c] is a vector of 3 characters), integers, floats, and booleans. Conversions between types happen implicitly as Clojure heavily embraces polymorphism, so using a function like first on a string returns a character. When converting to a boolean, every single value is considered truthy except for nil and false, including empty lists.

Clojure (and most Lisps) also support something called keywords, which are kind of like special strings used as keys in maps, denoted with a colon :mykeyword. The reason they are used as keys in maps is because they are implicitly converted to a function for accessing that element in a map. For example, (:a {:a "hi" :b "hello":c "hey"}) would return "hi", but that would not work if the keys were something other than keywords.

The penultimate data type I am going to mention is one that has appeared a number of times already, and it is the symbol. Every function name we have seen so far like first is just a symbol referencing that function. These are effectively like pointers in C, or better yet they can be thought of as variable names. Like keywords, they are strings with special meaning attached to them, and can be converted to strings with the str function, and the other way around with the symbol function. By default, anytime you use a symbol, it will get replaced by the thing it references, which is why these functions actually do something. In order to prevent that and actually work with the symbols (which is something you rarely need to do), the quote macro is used just like with lists, e.g. 'first or (quote first).

Finally and perhaps most importantly, there are functions. These are created using the fn macro, which takes a vector of symbols, which are the functions inputs, and then an s-expression that uses those symbols to do something. Note that fn itself is not a function per se, but a macro, meaning it takes all of its arguments as a data structure and operates on that code directly, which is why you can use variables inside the expression without them being directly evaluated. This happens at compile time, so all of these macros that are everywhere in Lisp will rearrange your code before the compiler sees it. This is a somewhat strange concept outside of Lisp, but it is how most more complex libraries are built up into programmer friendly interfaces, and how complex concepts can be expressed in such a bare bones syntax. It is like a much more powerful version of C preprocessor macros, made possible thanks to Lisp’s homoiconicity. Conceptually this may be a bit confusing, but in practice its extremely simple and looks the same as using any old function, so let’s see some examples.

(fn [a] (+ a 1)) ; Increments input by 1, equivalent to 'inc'
((fn [x] (* x x)) 3) ; Defines the function and applies it, ==> 9
(fn [myvec] 
  (conj myvec (first myvec))) ; Copies first element to end of vec
;; Note above fn does not modify myvec, but returns a new vector

Notice the consistency of the syntax here: it does not matter whether a function is named or not, putting it as the first element of an s-expression applies it regardless. That is what makes Lisp so great, it is fundamentally so simple (to paraphrase Clojure’s creator Rich Hickey, “simple but not necessarily easy”) that everything works exactly as you expect.

The last remaining thing to point out is that you can globally bind data (which includes functions) to a symbol using def. There are also ways to locally bind symbols (namely let binding) but we won’t concern ourselves with that for now. The defn macro combines def and fn into one form since defining functions is done so often.

(def one 1) ;;from now on in this file one will refer to the value 1
(def squared (fn [x] (* x x))) 
(squared one) ;; evaluates to 1

(defn cubed [x] (* x x x))
;; The compiler "sees" the above as (def cubed (fn [x] ...))

Functional Programming

The key difference in all of this syntax compared to more common languages is that there is no idiomatic way to do one action followed by another, since data is immutable (there are constructs in the language to do that, specifically for handling functions with side effects like println ). Instead, programming should be thought of as a series of transformations of data. Lets say you want to double every element in an array. In a language like C, you would write a for loop that goes through each element of the array and multiplies it by 2, and then modifying the array with the result. If you then later use that array in some function, you must keep in your head as the programmer that at some time during the programs execution, that array’s meaning and value changed. Furthermore, in the process, a new variable needed to be declared to keep track of the current iteration in the loop. Sure, it has no performance impact and can be easily optimized away, but it’s extra code that does not have anything to do with the programmers actual goal.

int myarray[5] = {0,1,2,3,4};
printarray(myarray); // pretend someone made an array print
for(int i = 0; i < 5; i++) {
    myarray[i] = 2 * myarray[i];
}
printarray(myarray); // Now this function has a different effect

Clojure (and basically every functional language) has a function called map that takes as input a function that takes 1 parameter, and a list. It then returns a list with that function applied to every element of the input list. Let’s see the Clojure version of the above code. Note that it is assumed that all of this code is put into the Clojure REPL (Read-Eval-Print Loop, much like the one python has, as well as terminals like Bash), since normally Clojure code is not just run sequentially like an imperative language would be.

(def myarray [0 1 2 3 4])
(println myarray) ; Prints [0 1 2 3 4]
(def doubled 
  (map (fn [x] (* 2 x)) myarray))
(println myarray) ; Still [0 1 2 3 4]
(println doubled) ; Prints [0 2 4 6 8]

map is a much weaker operation than the for loop, which is exactly why its so great. It inherently encodes the idea of taking an operation and using it identically on every element of a sequence, which is one of the most common uses of a loop anyway.

If you look back, you may notice that the operation of multiplying every element of an array by 2 could be made parallel easily, since no part of the computation depends on another. In fact, the map function in general satisfies that same property. Therefore, Clojure has a function called pmap, which is exactly like map but runs in parallel (this includes some overhead for collating all of the results together, so it runs slower on small lists). The above code can be parallelized with exactly one character due to the nature of the transformation we wanted to perform! For loops on the other hand could potentially depend on former results because they are stateful, so any attempt at parallelization will likely be somewhat more complex.

Constraining the amount you can do with a single function or construct is core to what makes functional programming practical. Complicated algorithms get reduced down to something where each step is mostly self explanatory. It’s all about how the data gets manipulated, rather than manipulating the data. Of course every once in a while state may actually be the easiest way to solve a problem, and Clojure provides ways to handle that (e.g. atoms, which can be very helpful for complex concurrent programs) if you need it, it’s just that you usually don’t need it. In fact, just to keep you honest, every single standard library function that involves state in any way ends with a ‘!’ to help avoid bugs that so often come with using stateful functions.


Of course, if programming should be thought of as a series of transformations, then there should be a good way to do many transformations in a row that isn’t as ugly or painful to read as just repeated function application. To that end, I want to introduce a fun little set of macros included in the standard library called threading macros, of which there are a handful that are all fundamentally the same. We will just look at the simplest one, which is called thread last, and written as ->> . It doesn’t actually do anything special, or provide you with more power, but it does provide you with “new syntax” as it effectively rearranges your code before the compiler sees it, allowing you as the programmer to read and write code that best represents the actual intention behind the code without being trapped by the very specific structure a simple Lisp program might have. ->> takes as its first input some data, and then every other input is an s-expression. The macro will then rearrange the code (BEFORE the s-expressions get evaluated) such that the initial piece of data gets put as the last argument of the next s-expression, and the result of that gets put into the last argument of the next one, etc. It is best understood with an example:

(defn tax-transcations [purchases] 
  (apply + 
         (map (fn [x] (* x 0.2)) 
              (map second purchases)))

(defn thread-tax-transactions [purchases]
  (->> purchases
       (map second)
       (map (fn [x] (* x 0.20)))
       (apply +)))

(tax-transactions [["laptop" 1000] ["mouse" 30] 
                   ["keyboard" 100] ["headphones" 100]]) ;==> 246

Both of these functions are exactly identical, they just take a list of lists, where each sub list represents an item and its cost, and return the total tax (assuming 20% tax rate) on all of the items. It just takes the second element of each list, thereby transforming a list of lists into a single list, then multiplies every number by 0.2, and then sums them (apply just turns a function that takes multiple arguments like +, and turns it into a function that takes a single argument that is a list). The big difference is that the bottom one is much more readable, and effectively represents the exact process going through my head as I was writing the algorithm, whereas the first version requires reading bottom to top, right to left, which is unnatural both to read and to write. Being able to make decisions about syntax like that is the power of Lisp macros.


Let’s look at a more complex example, using some functions that have yet to be introduced. Merge sort seems like a nice place to start, since it is an extremely common algorithm used everywhere, that is especially efficient with linked lists which are commonly used in Clojure. For those unfamiliar with merge sort, it is a sorting algorithm with O(nlogn) asymptotic time complexity that works by using a subroutine that takes two already sorted lists and merges them together into a single sorted list. The input list is split in half, and then each half is sorted by merge sort (recursively), and then those two sorted lists are merged again. When reading the example, make sure to pay attention to indentation more so than then parenthesis, since it is much easier to read that way. The many levels of nesting in Lisp style braces can mostly be ignored if the code is formatted sensibly which most sane people will do.

(defn mrg [[x & xrest :as X] [y & yrest :as Y] R]
  (if (and (not-empty X) (not-empty Y))
    (if (<= x y)
      (mrg xrest Y (conj R x))
      (mrg X yrest (conj R y)))
    (concat R X Y)))

(defn mrgsrt [X]
  (if (> (count X) 1)
    (let [sides (split-at (/ (count X) 2) X)]
      (mrg (mrgsrt (get sides 0)) (mrgsrt (get sides 1)) []))
  X))

This example includes plenty of functions from the standard library that have yet to be introduced, but should be simple to understand by their name alone. This example also introduces a feature common in functional programming called parameter deconstructing which allows you to express a data structure parameter in terms of its components. In this case, the function mrg takes three lists as a parameter, X Y and R but the first two lists are separated into the first element ( x and y respectively) and a list of remaining elements ( xrest and yrest ). This is just a shorthand that makes code much easier to read and understand, as well as putting an emphasis on the structure and processing of data as opposed to the algorithm itself!

The mrg function assumes that X and Y are both sorted, so it works by separating out the first element of each and checking which one is smallest. It then recursively calls itself with the same inputs except the smallest element has been removed from the corresponding list and added onto the result list. It effectively is going through and picking off the smallest elements of each list and adding them into a result, and when one of the lists is empty it just puts the remaining list onto the result and returns that. Fundamentally a very simple algorithm, but implemented quite differently than you might using C++ where you would need to constantly keep track of where in each list you are, and modify a results variable constantly. mrg avoids having state by having more parameters, thereby maintaining its purity as a function. This means you can really easily test the function just by popping open a REPL and running mrg with whatever parameters and ‘state’ you want. mrgsort then just splits its input list in half, then runs mrg on each half after sorting those recursively, and will just return a length 0 or 1 list as is since they are already sorted.

Notice how Clojure uses functions like split-at and count that can affect practically any collection in Clojure. It can be used on lists, strings, and maps polymorphically in a way that just makes sense and is simple, whereas in an object oriented language you would see functions like split be part of the string class and will only work on strings or classes derived from strings. The simplicity of data structures in Clojure grants polymorphism for free. In fact, this merge sort function will work on any list, vector, or even set of numbers, noting that the set and vector will be turned into a list (although they can easily be returned back into a set or vector with the into function). If the <= function used in mrg (which is only valid for numbers) is replaced with the compare function, then our sort will work with any core Clojure data structure (including hash maps which implicitly convert to a list of length 2 vectors) containing any comparable data type such as strings and vectors. All of that polymorphism comes basically for free because every data structure in Clojure is just built off of the basic few, so any function that applies to the core data structures will also apply to any of your own without writing any boiler plate or even thinking about it very much.

Show Comments