In the last six months I’ve been programming mostly in Haskell, a language I thoroughly enjoy working with. The other day, though, a visitor to my blog asked a question about one of my Java solutions to the Coding Bat exercises. Here it is:
seriesUp:
public int[] seriesUp(int n) {
    int[] result = new int[n * (n + 1) / 2];
    int pos = 0;
    int i = 1;
    while (i <= n + 1) {
        for (int j = 1; j < i; j++) result[pos++] = j;
        i++;
    }
    return result;
}
When looking at this code, my first thought was that this would be so much nicer to write in Haskell, and that it would probably require just one line of code. We’ll be getting there in this article.
It’s probably not immediately clear what this Java function does. Indeed, even after reading the problem description on the Coding Bat website, and knowing that the result is supposed to be an array with a pattern like {1, 1,2, 1,2,3, …, 1,2,3,..,n} for n >= 0, you might have to read the Java code line by line to check whether this really is what the function is doing. To further complicate matters, there is some “magic” thrown in with regard to the length of the resulting array. It could be figured out with some knowledge of series, as commonly taught in Discrete Mathematics or Calculus II, but the formula was given in the original problem description. Without it, surely some Java programmers would have felt tempted to initialize either a very large array or an array list.
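In case you want to convince yourself of that formula: the result holds 1 + 2 + … + n elements, and that sum is the n-th triangular number, n(n + 1)/2. For n = 3, this gives 3 · 4 / 2 = 6, which matches the six entries 1, 1, 2, 1, 2, 3 in the expected output.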
In a functional programming language the code would be far more readable. Even as a relative Haskell novice you could write more maintainable code, and as you gradually gain experience, you will write even cleaner code. I will illustrate this idea by repeatedly transforming a piece of Haskell code until the result is an aesthetically pleasing and easily understandable one-liner.
Let’s say you’re just starting out, but you’ve grasped the idea of recursion already. Then, you might come up with something like the following.
seriesUp :: Int -> [Int]
seriesUp n = seriesUp' 0 n

seriesUp' :: Int -> Int -> [Int]
seriesUp' i n | i > n     = []
              | otherwise = [1..i] ++ seriesUp' (i+1) n
(Note: the syntax highlighting plugin of WordPress does not support Haskell. In a proper editor, this would look quite a bit nicer.)
Some remarks about the syntax are in order. The “pipe” (|) introduces so-called guards, which are conditions that are tested sequentially. The second guard, “otherwise”, is nothing special: it is simply defined in the Prelude as the Boolean value “True”, so it always matches. The “++” operator stands for list concatenation. Lastly, an expression like [1..n] is shorthand for a list; for instance, [1..5] is equivalent to [1,2,3,4,5]. If you are wondering about the first line of the function definition: this is an optional type signature. Those are quite handy, and they allow the compiler to catch many errors for you.
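If you want to get a feel for these building blocks before moving on, a few throwaway expressions in GHCi illustrate them (the exact prompt depends on what you have loaded):
*Main> [1..5]
[1,2,3,4,5]
*Main> [1,2] ++ [3,4]
[1,2,3,4]
*Main> [1..0]
[]
The last line also shows that a range whose upper bound lies below its lower bound produces the empty list, which is exactly what the recursion above relies on in its very first step, where i is 0.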
The big difference between this first attempt in Haskell and the original Java solution is that it is immediately obvious what the code does, provided you’ve gotten used to the syntax. The code is easier to understand, faster to write, and also much more fun to write.
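You can convince yourself that the recursive version behaves as advertised by loading it into GHCi:
*Main> seriesUp 3
[1,1,2,1,2,3]
*Main> seriesUp 0
[]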
The fun doesn’t end here, though, as the following simplifications show:
seriesUp :: Int -> [Int]
seriesUp = seriesUp' 0
  where seriesUp' i n | i <= n    = [1..i] ++ seriesUp' (i+1) n
                      | otherwise = []
This looks quite a bit nicer already! Three changes were made. First, I removed the argument “n” from both sides of the definition of seriesUp. This is called “eta reduction”, and it is one example of removing superfluous variables in the lambda calculus. It’s probably clear from looking at the code how it works.
Second, the helper function seriesUp' was moved into a local definition in a where-clause. It’s similar to a let-expression, but a bit more concise. Keeping seriesUp' as a locally defined function makes sense, since it isn’t used anywhere else in the program. Lastly, the order of the guards was switched. Putting the base case of the recursion at the bottom leads to a small performance improvement (not that it would matter in this toy example), since the guards are checked in order from top to bottom.
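Coming back to the first of those changes: if eta reduction looks unfamiliar, a smaller, made-up example may help. The two definitions below are equivalent; the second simply drops the argument that appears in the same trailing position on both sides of the equation.
addOne :: Int -> Int
addOne x = (+) 1 x

addOne' :: Int -> Int
addOne' = (+) 1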
This isn’t all that can be done. Assume you make it a bit further in your functional programming textbook and discover higher-order functions. The relevant higher-order function in this example is map, which applies a function to every element of a list. For instance, type “map sqrt [1..5]” into the Haskell REPL, and you’ll end up with a list of the square roots of the integers 1 to 5.
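Here is roughly what that looks like; the exact floating-point digits may vary, but the idea should be clear:
*Main> map sqrt [1..5]
[1.0,1.4142135623730951,1.7320508075688772,2.0,2.23606797749979]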
Instead of built-in functions like sqrt, you can also pass any other function to map, including one you define yourself. This turns the previous piece of code into:
seriesUp :: Int -> [Int]
seriesUp n = map oneToN [1..n]
  where oneToN n = [1..n]
This is a significant improvement, but there is an error in it. If you run this code in the REPL, the compiler will notify you that the type signature doesn’t match the function body. To see what is wrong, just comment out the signature and run the example again. It turns out that the result isn’t a list but a list of lists:
*Main> seriesUp 3
[[1],[1,2],[1,2,3]]
This isn’t quite what we want. However, the error can be corrected easily by calling the built-in function concat, which concatenates a list of lists into a single list.
seriesUp :: Int -> [Int]
seriesUp n = concat (map oneToN [1..n])
  where oneToN n = [1..n]
This leads to the expected result again:
*Main> seriesUp 3
[1,1,2,1,2,3]
At this point the Haskell code is already very concise, clean and highly readable. I’d say it is much superior to the initial version that uses explicit recursion.
The position of the function in map can also be taken by a so-called anonymous function, also referred to as a lambda function. On a side note, James Iry made the humorous remark that Java made lambda expressions popular by not having them. Lambdas are tremendously powerful, and once you’ve gotten used to them, you may find programming in a language that does not have them painful. This might remind you of Paul Graham’s essay Beating the Averages:
As long as our hypothetical Blub programmer is looking down the power continuum, he knows he’s looking down. Languages less powerful than Blub are obviously less powerful, because they’re missing some feature he’s used to. But when our hypothetical Blub programmer looks in the other direction, up the power continuum, he doesn’t realize he’s looking up. What he sees are merely weird languages. He probably considers them about equivalent in power to Blub, but with all this other hairy stuff thrown in as well. Blub is good enough for him, because he thinks in Blub.
When we switch to the point of view of a programmer using any of the languages higher up the power continuum, however, we find that he in turn looks down upon Blub. How can you get anything done in Blub? It doesn’t even have y.
By induction, the only programmers in a position to see all the differences in power between the various languages are those who understand the most powerful one. (This is probably what Eric Raymond meant about Lisp making you a better programmer.) You can’t trust the opinions of the others, because of the Blub paradox: they’re satisfied with whatever language they happen to use, because it dictates the way they think about programs.
Quite recently, though, lambda expressions have finally gained a foothold in the mainstream, with Visual Basic 9.0 (2007), C# 3.0 (2007), C++11 (2011) and Java 8 (2014) supporting them.
After this diversion, let’s see how lambdas can be used in our code. Here’s a first attempt:
seriesUp :: Int -> [Int]
seriesUp n = concat (map (\x -> [1..x]) [1..n])
This is not bad at all! But if this piece of code doesn’t look nice enough yet, you can trade the parentheses for some syntactic sugar.
seriesUp :: Int -> [Int]
seriesUp n = concat $ map (\x -> [1..x]) [1..n]
The dollar sign might look strange at first, but it is common in Haskell programs. The result is one less non-whitespace character in your code, so this tends to be seen as more aesthetically pleasing. In many cases it also makes the code easier to read, since you will perceive the code to the right of the dollar sign as one block. Too many parentheses, on the other hand, can easily look messy.
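For the record, the dollar sign is not special syntax but an ordinary operator, defined essentially as f $ x = f x, with very low precedence, so everything to its right is evaluated first. The following throwaway example shows the two equivalent styles:
*Main> length (filter even [1..10])
5
*Main> length $ filter even [1..10]
5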
We’re still not done yet. A common theme in functional languages is abstraction. Many frequent operations don’t have to be explicitly programmed. Instead, there is often a standard function that can take care of it. This is true for concatenating the lists that result from our call to ‘map’ as well. That function is called ‘concatMap’.
seriesUp :: Int -> [Int]
seriesUp n = concatMap (\x -> [1..x]) [1..n]
If you’re a seasoned Haskell programmer, then this would probably be the first and last version you write, thus turning the somewhat tedious Java exercise into something rather trivial. Just look at it! Isn’t it beautiful?
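And just to confirm that nothing was lost along the way, the one-liner still produces the expected output:
*Main> seriesUp 3
[1,1,2,1,2,3]
*Main> seriesUp 5
[1,1,2,1,2,3,1,2,3,4,1,2,3,4,5]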
With this cute one-liner, the series of transformations of the original Haskell solution comes to an end. The purpose of starting with explicit recursion and then successively moving on to ever higher levels of abstraction was also to illustrate how the code you write will change as you gain more familiarity with functional programming. When starting out, recursion is difficult enough for many people to understand, but it is merely the beginning.
Let’s now have another look at the piece of Java code we started out with. Of course, it is missing the enclosing class declaration and a main method, which would add a few more lines of boilerplate to the code fragment.
public int[] seriesUp(int n) {
    int[] result = new int[n * (n + 1) / 2];
    int pos = 0;
    int i = 1;
    while (i <= n + 1) {
        for (int j = 1; j < i; j++) result[pos++] = j;
        i++;
    }
    return result;
}
Putting the Haskell and the Java versions right next to each other, it is probably obvious why, given the choice, one might prefer to program in a language that allows for higher levels of abstraction. Indeed, I find it peculiar when people call functional programming languages “scary” or complain about their allegedly “weird” syntax. Curly-bracket programmers used to ridicule Lisp for its parentheses, which is an objection I can somewhat understand, even though Lisp code still looks a lot nicer than Java or C. Yet I’m utterly baffled when Haskell syntax is called “weird”, since it is so incredibly clean. Compared to the Haskell one-liner we ended up with, I’m tempted to say that the original Java version is the scary one.