Category Archives: Education

Review: Principles of Computing — Coursera

I’ve been eagerly awaiting the follow-up course to Rice University’s An Introduction to Interactive Programming in Python. While that course offered a playful introduction to computer science and programming through building a number of successively more complicated games, Principles of Computing attempted to improve your programming skills by embedding discrete mathematics problems and basic algorithms into assignments that, once again, consisted of writing a number of games. Unlike in the predecessor course, you didn’t have to bother with GUI code, but could instead fully focus on the underlying game logic.

On the mathematical side, Principles of Computing covered arithmetic sums, basic function classes for describing growth rates, basic probability, and a bit of combinatorics. Some assignments focused on algorithms, including standard topics like searching and sorting. Breadth-first search was covered in a rather amusing assignment in which you had to write logic for ‘humans’ that would make them avoid slowly moving ‘zombies’. Minimax, on the other hand, was covered in a disappointingly short assignment that was merely a variation of a previous one, an AI for Tic-Tac-Toe. The assignments themselves were the highlight of the course, though, covering the logic behind games such as 2048, Cookie Clicker, Yahtzee, the aforementioned Tic-Tac-Toe (Monte Carlo and tree search), and the 15-puzzle. Overall, they helped solidify the concepts.
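The breadth-first search idea behind the zombie assignment can be sketched roughly as follows; the grid representation and function names here are my own, not the course’s:

```python
from collections import deque

def bfs_distances(grid, sources):
    """Compute the distance from every passable cell to the nearest
    source cell (e.g. a zombie) using breadth-first search.
    grid[r][c] is True if the cell is passable; sources is a list
    of (row, col) tuples."""
    rows, cols = len(grid), len(grid[0])
    dist = {src: 0 for src in sources}
    queue = deque(sources)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist
```

With such a distance field in hand, a ‘human’ simply moves to the neighboring cell that maximizes its distance to the nearest ‘zombie’.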

While I thought that An Introduction to Interactive Programming in Python, which I took when it was offered for the first time, was very polished, the same unfortunately can’t be said for Principles of Computing. Let me preface my subsequent criticism by stating that Principles of Computing is an excellent course, easily comparable to the kind of course you would take at a brick-and-mortar university. While it isn’t as polished as its predecessor, it is nonetheless, in my opinion, one of the best-designed MOOCs in computer science you can take.

The video presentations are engaging. Production values seem to have increased over An Introduction to Interactive Programming in Python, whose videos were seemingly captured by a webcam. Overall, the course material is presented very well. Some may find that the videos move a bit too slowly at times, but I prefer that approach over lecturers who rush through the material or even omit key concepts. Further, a lot of support was available, for example in the form of relatively extensive notes on key topics. The discussion forums were not overly active, but it seemed that basically any question got answered quickly, either by other students or by the instructors themselves.

What stood out negatively, though, was a certain sloppiness in the presentation. In particular, the descriptions of the assignments were often lacking. There were several instances where parts of the assignment texts had to be revised by the instructors after students had pointed out that they were ambiguous or left out important information. One assignment contained several typos. In others, important information was missing, or given in a rather misleading way. This was arguably one of the reasons for the tremendous rate of attrition. To quantify this impression: the first two assignments led to 19 pages of discussions, while the last few only had five or six pages.

The teaching language of this course was Python. Unfortunately, almost all starting templates prescribed OOP, and due to the way your programs were tested, you couldn’t deviate from that paradigm. In many cases a functional solution would have been much more straightforward, but that’s probably a general point about OOP vs. FP. While I think that Python is a fine language for budding programmers, I don’t think it’s the best choice for writing more complex programs. For Principles of Computing, Python worked well, apart from the very last assignment, a solver for the 15-puzzle, which was several hundred lines long.

Due to Python’s lack of a static type system, that task felt like building a house of cards. You’re forced to write assertions and tests for basically everything. In a statically typed language you could have made do with a fraction of the tests, and even with a rudimentary type system like Java’s you would have been better off, with the downside that Java’s lack of tuple assignment would have blown up the code by a factor of two or so. I’m curious to find out how long an implementation in Haskell would be. You could probably do it in half as many lines, and with just a few property-based tests instead of a myriad of unit tests.
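To illustrate the kind of defensive code the 15-puzzle solver called for, here is a sketch modeled loosely on the solver’s lower-row invariant — my own reconstruction for illustration, not the course’s reference code. You would assert a check like this before and after every solver phase, precisely because no type system catches a corrupted board for you:

```python
def lower_row_invariant(board, target_row, target_col):
    """Check a simplified solver invariant: the blank (0) sits at
    (target_row, target_col), and every tile to the right of it in
    that row, as well as every tile in the rows below, is already
    in its solved position. In an m x n board, tile t is solved at
    (t // n, t % n)."""
    rows, cols = len(board), len(board[0])
    if board[target_row][target_col] != 0:
        return False
    # tiles to the right of the blank in the target row
    for col in range(target_col + 1, cols):
        if board[target_row][col] != target_row * cols + col:
            return False
    # all tiles in the rows below must be solved
    for row in range(target_row + 1, rows):
        for col in range(cols):
            if board[row][col] != row * cols + col:
                return False
    return True
```

In the solver proper, each move sequence would be bracketed by `assert lower_row_invariant(...)` calls so that a bug surfaces where it happens, not hundreds of lines later.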

One last problem, which was voiced repeatedly on the forum, was that the assignment description repeatedly told students, “once you’re confident that your methods are correct, use [the online unit test suite for this course] to confirm that they are correct.” Unfortunately, the provided unit tests did not cover all corner cases, so you certainly couldn’t confirm that your code was correct. If you were unlucky, you discovered that the very first method you had written, a check of a basic invariant, was actually incorrect only while you were working on the very last method, which called the various solvers depending on the state of the board. Some students made the spurious argument that those missing unit tests were “probably intended to make us write our own tests”. I didn’t find this very convincing. If that had really been the case, then state that the provided unit tests are lacking, but don’t tell students that they can use the provided unit tests to confirm the correctness (!) of their code when that’s simply not possible.

Principles of Computing offers two kinds of statements: a regular one, and one “with distinction”. For the latter, you would have to average 90% over the entire course, and get 70% on the last assignment. As indicated by the description above, that assignment, the solver for the 15-puzzle, was rather complex, and a worthy criterion for a statement of achievement with distinction. Please note that your program is expected to solve this problem in full generality, meaning that it should be able to process any m × n puzzle. It is indeed a lot more involved than it might look.

Apart from a few minor problems, I found Principles of Computing to be tremendously enjoyable, and I warmly recommend it. The few issues I pointed out will most likely be eradicated in the second iteration, assuming that the same assignments will be used.

Replicating a BSc in Computer Science through MOOCs

Introduction

Two years into the MOOC revolution we’re now at a stage where the content of entire degree-programs can be found online. Of course, going through MOOCs won’t get you any paper certificates, even though you can certainly pay for virtual proofs of your identity if you are so inclined. Coursera offers a “Signature Track”, and EdX recently added a similar option to a number of courses. You can acquire the knowledge, however.

I’m not suggesting that MOOCs can be a replacement for the entire college experience, though. You’ll miss out on a few aspects, such as interacting with your professors, but in any large university you won’t interact much with your professors either. Further, there is no equivalent of writing a thesis, or getting involved with research. Apart from those limitations MOOCs are a dream come true for an autodidact.

I was wondering whether it was possible to follow a typical CS curriculum solely with (free) online courses, and it turned out that it is largely possible. There is also a paid-for option through the University of London, but their Computing and Information Systems BSc, taught by Goldsmiths, comes with a sticker price of close to GBP 5,000. So, let’s say you don’t want to spend that much money, or much more on a traditional college degree in the UK or US. How far could you get?

Please note that I won’t bother with any “general education” requirements. In Europe 3-year BSc programs are the norm, and you primarily focus on your major subject. In the US, though, you’ve got to endure roughly a year’s worth of courses that are entirely unrelated to the subject you’re actually studying. They are proclaimed to turn you into a well-rounded person. This is pretty much just marketing speak. What most commonly happens is that people pick courses they hope to get an easy “A” in, and most students don’t take them seriously anyway. A case in point is the Harvard cheating scandal of 2012. Oh, and I certainly would feel ripped off if I were forced to pay several thousand dollars for the doubtful privilege of taking a survey course on popular music. This isn’t an essay on the shady sides of US higher education, though, so I won’t dwell on that topic any longer.

Computer science historically grew out of either engineering or mathematics, and therefore there are differences between programs. Some have a stronger focus on CS theory and mathematics; others include many courses related to electrical engineering in their curriculum. I’ll instead focus on what I consider to be “core CS” courses, and I will highlight some specializations that are well-represented online. CS is not applied IT, though, so anything in the vein of Codecademy doesn’t quite qualify for this article. Some typical courses you won’t yet find on Coursera et al.; in those cases, I’ll point to alternative resources.

Introductory Programming and CS

There is a wealth of introductory programming courses. I think there are benefits to beginning with a functional programming language: it entails much less artificial complexity, and it’s much easier to reason about programs without mutation. A very good first course would therefore be Introduction to Systematic Program Design – Part 1 by Gregor Kiczales, which uses the Lisp dialect Racket; Part II is planned. Those courses are based on the classic introductory CS textbook How to Design Programs by Felleisen et al. I don’t like that book due to its slow pacing, but Prof. Kiczales’ course is much more digestible.

You’ll probably want to pick up a more mainstream language such as Python or Java. For Python, I would have recommended Udacity’s CS 101 a year ago. A problem with the Udacity platform, however, is that the forums are a wasteland. I don’t like some of their forced attempts at involving students, either. For instance, in CS 101 they asked students to submit exercise questions themselves, and probably in a misguided attempt to be “inclusive” they put exercises online that show very poor style, such as printing a value instead of returning it. One student-contributed exercise, likewise part of a set of optional problems for CS 101, has a bug I reported over half a year ago in the forum. There has been no response to it at all. However, there are now about a dozen threads from students who are confused by the original problem because their (correct) solution does not take that particular bug into account.

Udacity also offers an introductory Programming in Java course, which I haven’t gone through. It’s probably okay. If you can motivate yourself, I’d recommend Stanford’s CS106A: Programming Methodology for self-study. Mehran Sahami is a fabulous lecturer, and his course is very thorough. I taught myself Java with it, and Allen Downey’s free textbook Think Java.

If Udacity’s CS 101 is not so great, then what’s the alternative if you want to learn Python? I think it’s Rice University’s An Introduction to Interactive Programming in Python. You’ll build increasingly complex games, and by the end of the class you’ll have written between 1,000 and 2,000 lines of code, which will give you a lot of practice. It’s an entertaining class, which I’d recommend even for programmers with some years of experience, particularly if you’ve never built simple games.

Those courses will teach you enough programming skills. They should be followed by a course on data structures and algorithms, i.e. a traditional second course in CS. Unfortunately, the most thorough treatment of that topic, Algorithms (Part I, Part II) by Kevin Wayne and Robert Sedgewick, is taught in Java. It would be preferable if an algorithms course were taught either in a language-agnostic way or in a more expressive language. Tim Roughgarden’s excellent Algorithms: Design and Analysis (Part 1, Part 2) is language-agnostic, and also includes a review of important data structures. However, to get the most out of these courses you’ll need some mathematical maturity.

Mathematics

It seems that there is relatively little use for continuous mathematics within CS. Calculus is nonetheless commonly taught, which is arguably due to historic reasons. I don’t think you could make a good utilitarian argument for studying calculus within CS. However, you could easily argue that a certain degree of mathematical maturity makes you a lot smarter. You’ll certainly be less impressed by mainstream media information if you know a bit of statistics and probability.

If you didn’t take calculus in high school, then I’d recommend two fun introductory classes from Ohio State University: Calculus One and Calculus Two. Calculus One is an introductory course on single-variable calculus. The presentation of the material is a bit different from what you might be used to in mathematics, but I don’t mind the popularization of the material. For instance, I had no idea what a grain elevator was when I first encountered that term in calculus problems, so I appreciate that Jim Fowler and Bart Snapp use examples you can more easily relate to. Calculus Two covers series. If you like mathematics, you’ll probably enjoy it, but I view the material as mostly optional. You’ll come across telescoping series in some analyses of algorithms, though.
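As a tiny example of the telescoping trick that resurfaces in algorithm analysis: in a telescoping sum, adjacent terms cancel, so only the first and last survive. For instance,

```latex
\sum_{k=1}^{n}\left(\frac{1}{k} - \frac{1}{k+1}\right)
= \left(1 - \tfrac{1}{2}\right) + \left(\tfrac{1}{2} - \tfrac{1}{3}\right)
  + \dots + \left(\tfrac{1}{n} - \tfrac{1}{n+1}\right)
= 1 - \frac{1}{n+1}.
```

The same cancellation pattern is what collapses the level-by-level cost sums in some recurrence analyses.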

A much more important topic is discrete mathematics. Unfortunately, a MOOC on that topic is still missing. Thankfully, MIT OCW offers a great course in Mathematics for Computer Science, with fabulous lecture notes. There are several versions available. The Fall 2010 version has video lectures, while the Spring 2010 version comes with a very good set of lecture notes.

Lastly, there is linear algebra. I had high hopes for Coding the Matrix: Linear Algebra through Computer Science Applications by Philip Klein. Unfortunately, this course was an utter disappointment. There were countless technical problems, and occasionally poorly worded questions in the assignments. It was not uncommon for me to spend more time trying to please the autograder than actually solving the programming exercises. I also remember one particularly unpleasant Saturday afternoon puzzling over autograder feedback, only to later learn that there was a problem with the grading script that rejected correct solutions. I hope that those issues will eventually get sorted out. An even bigger problem, though, was that the lectures weren’t very good. Philip Klein literally read the dense text on the slides to you, line by line. This was arguably the worst presentation of the roughly two dozen MOOCs I’ve either audited or completed. (I did earn a statement of accomplishment, in case you are wondering, but it was a real drag.)

The big draw of Coding the Matrix, computer science applications, turned out to be much less exciting in practice. You’d work on toy problems that illustrate, say, Hamming codes or image transformations, but the scale was so small that you walked away thoroughly unimpressed — unsurprisingly, given a slow interpreted language like Python and such small problems. I would have much preferred to be properly exposed to linear algebra first and then shown realistic applications. Alternatively, one could have used highly performant libraries so that moderately sized problems could have been solved.

EdX has an upcoming course that seems to move more in that direction, though: Linear Algebra – Foundations to Frontiers. Then there is also Linear Algebra on MIT OCW, with fabulous lectures by Gilbert Strang. He is an enthusiastic lecturer, and he develops the material properly, which makes it easy to follow. A further bonus is that he made the linear algebra textbook he wrote freely available as a PDF. However, going through a course on MIT OCW might require more motivation and determination, since there are no fixed deadlines, and no automatically graded exercises or exams.

If you’re fine with a less traditional way of teaching mathematics, you could also make use of Khan Academy, which covers calculus, statistics and probability, as well as linear algebra. There is currently very little discrete mathematics on offer, though.

Now we’ve got basic CS, programming, data structures and algorithms covered. There is only one course missing to complete the content of a typical CS minor.

Systems

To round off your basic CS education, one course in “systems” should be added to the mix. Such courses seem to be much more common in traditional CS programs that grew out of engineering departments, while they are either electives or wholly absent in other CS programs.

I’ll admit that I have a poor background in systems, with only one project-based university course under my belt that I didn’t consider thorough enough. This is therefore an area I intend to explore further. The two most interesting options seem to be by the University of Washington and the University of Texas, Austin. I didn’t have enough spare time when it was first offered, but I had a look at the materials, and I got a very good first impression of the University of Washington course The Hardware/Software Interface. A related course is the upcoming EdX offering Embedded Systems – Shape The World.

Specializations

With the requirements of a CS minor out of the way, what would you want to go on to study? I’m quite amazed at the wealth of offerings. Of course you won’t find any cutting edge research seminars online. If you’re serious about CS research, then MOOCs are only a poor substitute, but most of what you’d find in a typical taught BSc or MSc program, as opposed to a research-based one, you can find online as well.

If you’re interested in knowledge that is more immediately useful, pick Jennifer Widom’s thorough Intro to Databases course. It covers theory as well as a lot of practice. For anyone only wanting to learn a bit of SQL it’s overkill, though.

If networks are what interests you, then you can start by taking the University of Washington’s Computer Networks, followed by Georgia Tech’s Software Defined Networking.

Are you interested in learning more about programming languages and their implementations? In this case, there is a wealth of resources available, too. Peter Van Roy of the University of Louvain, author of Concepts, Techniques, and Models of Computer Programming, is going to offer a course on programming paradigms on EdX. You could follow this up with Dan Grossman’s fabulous course on Programming Languages. That course focuses on the elements of programming languages. A good complement to Dan Grossman’s course is Wesley Weimer’s Programming Languages: Building a Web Browser, which gives you a good foundation for a course in compilers. Wesley Weimer is another one of my favorite lecturers, by the way.

Computer science legend Jeff Ullman is about to offer his course on automata theory for the second time on Coursera. His colleague Alex Aiken teaches one on Compilers. This is another one of the courses I have not taken yet. The syllabus looks quite intimidating, though. It has one of the highest estimates for weekly workload of any course on Coursera, 10 to 20 hours, and judging from feedback on the web, it’s pretty accurate.

A hot topic at the moment is Machine Learning. Coursera lists courses from Stanford, the University of Toronto, and the University of Washington. EdX offers the Caltech course Learning from Data, which has a reputation for being the most rigorous online course in ML.

Traditional AI seems to have taken a backseat compared to Machine Learning in recent years, but it nonetheless has a strong online representation. Udacity now hosts the seminal “AI Class”, Introduction to Artificial Intelligence, taught by Peter Norvig and Sebastian Thrun. Sebastian Thrun also teaches a more advanced class on self-driving cars: Artificial Intelligence for Robotics. Alternatively, you could take the UC Berkeley course Artificial Intelligence on EdX.

Conclusion

While I’ve given an overview of core CS courses and a few specializations, there is a lot more you could learn. There are courses on scientific computing, computer architecture, cryptography, computer graphics, computational investing, parallel programming, and even quantum computing. The list goes on and on. I do miss a course on operating systems, though.

I merely wanted to highlight some of the larger areas of computer science, and show that they are already well-represented online. The selection is largely based on my personal interests. Still, I think my presentation convincingly conveyed that there is, a mere two years after the MOOC revolution started, an absolutely staggering number of MOOCs available. Just look at the number of CS courses, generously interpreted, that are currently listed on the websites of the major providers! EdX counts a total of 18, and so, incidentally, does Udacity. Coursera, on the other hand, lists 91. That is a total of 127 courses in CS, not taking into account the many courses that are tangentially related to CS, like mathematics, or statistics and data analysis.

Review: Algorithms: Design and Analysis, Part 1 — Coursera

Udacity’s Algorithms: Crunching Social Networks is a neat course, but, as the title suggests, it focuses heavily on graphs. I was therefore looking for a more thorough treatment of algorithms, and Tim Roughgarden’s Coursera course Algorithms: Design and Analysis, Part 1 provided exactly that. I originally intended to write a review after finishing Part 2, but there was so much content in the first part alone that I dropped that idea.

Algorithms: Design and Analysis consisted of, as Prof. Roughgarden put it, “a selection of greatest hits of computer science.” It’s material any programmer or computer scientist should be familiar with. It is relevant whenever you work with algorithms and data structures, and also satisfying to study.

Here is the entire curriculum:

I. INTRODUCTION (Week 1)
II. ASYMPTOTIC ANALYSIS (Week 1)
III. DIVIDE & CONQUER ALGORITHMS (Week 1)
IV. THE MASTER METHOD (Week 2)
V. QUICKSORT – ALGORITHM (Week 2)
VI. QUICKSORT – ANALYSIS (Week 2)
VII. PROBABILITY REVIEW (Weeks 2-3)
VIII. LINEAR-TIME SELECTION (Week 3)
IX. GRAPHS AND THE CONTRACTION ALGORITHM (Week 3)
X. GRAPH SEARCH AND CONNECTIVITY (Week 4)
XI. DIJKSTRA’S SHORTEST-PATH ALGORITHM (Week 5)
XII. HEAPS (Week 5)
XIII. BALANCED BINARY SEARCH TREES (Week 5)
XIV. HASHING: THE BASICS (Week 6)
XV. UNIVERSAL HASHING (Week 6)
XVI. BLOOM FILTERS (Week 6)

I particularly enjoyed the lectures on the master method, since they cleared up a few things for me. Some time ago I was working with matrix multiplication. From linear algebra I knew that this should lead to cubic runtime. However, my impression was that the algorithm ran faster than that, so I did some research and found out about Strassen’s algorithm, which was covered in the lectures as well. What I viewed as mysterious back then was not only how Strassen came up with it in the first place — those strokes of genius are rarely explained — but also how one could make a statement as precise as that the algorithm was in O(n^2.8074). Well, thanks to the master method I know now.
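For the curious, the master-method calculation behind that exponent goes as follows. Strassen’s algorithm performs seven recursive multiplications on half-sized matrices plus quadratic combination work:

```latex
T(n) = 7\,T(n/2) + O(n^2), \qquad a = 7,\; b = 2,\; d = 2.
```

Since $a = 7 > b^d = 4$, the recursion is dominated by the leaves, and the master method gives

```latex
T(n) = O\!\left(n^{\log_b a}\right) = O\!\left(n^{\log_2 7}\right) \approx O\!\left(n^{2.8074}\right).
```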

All the topics listed above can be found in your typical algorithms textbook. What you don’t get when working through a textbook, however, is the fabulous presentation of the material. The lectures are full of proofs and serious discussions, but Prof. Roughgarden knows how to keep your attention with a dry remark, or by quickly traversing language levels. One second he speaks of the raison d’être of an algorithm, and the next he advises you “to bust out the Pythagorean Theorem” for one part of a proof. At that point I did rewind the video because I thought there was no way he could have said that.

The lectures overall were surprisingly entertaining. This was particularly the case whenever Prof. Roughgarden was discussing implications of analyses or the “big picture”. Here is a good example, taken from a brief lecture that discussed the necessity of knowing your data structures and their respective features well:

Levels of knowledge regarding data structures

Level 1 was, according to Prof. Roughgarden, “cocktail party conversation competence, but of course I am only talking of the nerdiest of cocktail parties”. The sense of humor may not be to your liking, but if it is, you’ll be in for a treat. I don’t think I ever enjoyed listening to a lecturer in a technical subject that much.

Let’s talk some more about the presentation. I have gone through courses, both online and on-campus, that presented the material as if it had been received from God in its final form, with hardly any motivation or explanation, just a litany of formulae and definitions. Prof. Roughgarden eventually also ends up with lengthy equations and formal definitions, but he develops the material as he goes along, not unlike Salman Khan in his mathematics videos at Khan Academy. Here is a slide from one of the lectures on Dijkstra’s algorithm to illustrate this:

Dijkstra in color

The many colors give a hint at the stages this drawing went through, but if this is too hectic for you, you can download the lecture slides as PDFs as well. Even typed versions were provided. Occasionally, there were optional PDFs with lengthy formal proofs for those with a greater interest in theory.

If you now said that this sounds as if the presentation was a bit whimsical, then I would agree with you. However, this does not mean that Algorithms: Design and Analysis wasn’t a thorough course. The problem sets and programming assignments required you to have a solid grasp of the material from the lectures. In particular, the programming assignments required a good understanding not only of your programming language of choice but also of the algorithms and their supporting data structures. The difference between a good and a poor implementation could easily amount to several orders of magnitude: in one case you get the answer almost instantly, and in the other your program might run for hours, if not an entire day. Just imagine using repeated linear search over an array when the problem called for a hash table instead!
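The 2-sum problem from the last week of the course is a good illustration of that gap. Here is a simplified sketch of the hash-based approach — my own code on in-memory lists, not the course’s reference solution, which read targets and numbers from a large input file:

```python
def two_sum_targets(numbers, targets):
    """Count how many target values t can be written as t = x + y
    with x and y *distinct* values from `numbers`. Membership tests
    against a set are O(1) on average, versus an O(n) scan per test
    with repeated linear search over a list."""
    seen = set(numbers)
    count = 0
    for t in targets:
        if any(t - x in seen and t - x != x for x in seen):
            count += 1
    return count
```

With a list instead of a set, each `t - x in seen` test degrades from constant to linear time, and on the course’s input sizes that is exactly the difference between seconds and hours.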

Overall, the programming assignments were a highlight of the course. Here’s a complete list, arranged by weeks:

  • Counting inversions
  • Quicksort
  • Karger’s algorithm
  • Computing strongly connected components
  • Dijkstra’s algorithm
  • 2 sum problem & Median maintenance

The first problem piggybacks on mergesort. The others were normally as straightforward as they sound. However, the files that had to be processed were often quite large, which required some care when implementing the algorithms. Weeks 3 and 4 posed the most challenging and also the most interesting problems. How well you’ll fare with the assignments may also depend on the language you choose. The grader only checks the numerical answer; how you get there is entirely your problem. You can choose any language you want, but some may be better suited than others.
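To make the mergesort connection concrete, here is a minimal sketch of the counting-inversions approach — my own code, not the course’s:

```python
def sort_and_count(arr):
    """Count inversions (pairs i < j with arr[i] > arr[j]) by
    piggybacking on mergesort: inversions split across the two
    halves are counted during the merge, for O(n log n) overall."""
    if len(arr) <= 1:
        return list(arr), 0
    mid = len(arr) // 2
    left, inv_left = sort_and_count(arr[:mid])
    right, inv_right = sort_and_count(arr[mid:])
    merged, i, j, split_inv = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] precedes all remaining left elements,
            # so it forms an inversion with each of them
            split_inv += len(left) - i
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return merged, inv_left + inv_right + split_inv
```

The naive double loop would be quadratic, which on the assignment’s 100,000-element input file is precisely the kind of implementation that runs for hours instead of seconds.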

The theory questions normally related to the homework in some way. I therefore found it helpful to tackle them only after I had successfully submitted the programming assignments. For some questions it may help to consult the optional textbooks. Prof. Roughgarden gives references for four: Cormen et al., Introduction to Algorithms; Dasgupta et al., Algorithms; Kleinberg and Tardos, Algorithm Design; and Sedgewick and Wayne, Algorithms. I found the freely available one, Dasgupta, sufficient for the course.

In this offering, the problem sets and programming assignments counted 30% each toward the final grade, and the final exam made up the remaining 40%. The final exam was fair, but not necessarily easy. However, it felt somewhat superfluous, since the problem sets already covered the material, and some of the questions in the problem sets were more challenging. Partly, though, this could be because some topics were new to me back then, and when encountering them again in the final exam I was already familiar with them.

While I tremendously enjoyed the course, there were some problems, too, and most are related to the forums. Algorithms: Design and Analysis is not an introductory course. If you’ve gone through a solid CS101 course and a survey course in data structures, maybe like Stanford’s free CS106A and CS106B, you should be well-prepared. If you lack this knowledge, then do yourself the favor and work on the basics first. However, quite a few people seemed not to have read the course description, according to which this course is appropriate for junior- or senior-level CS students. As a consequence, I mostly avoided the forums in the beginning because they were filled to the brim with basic or superfluous questions, often with poor grammar and vocabulary. I couldn’t help but wonder why it was almost always people using Java who didn’t have a proper grasp of programming, the course content, and the English language. As the course progressed, fewer and fewer of those people were left, which substantially increased the value the forums provided.

Weak forum moderation is by far my biggest gripe with MOOCs. I think a sub-forum for beginners would have been a good idea; forum moderators could then simply move all distracting threads there. An even better idea would have been some kind of entrance exam that checked basic competency. There certainly were some good discussions in the forum. Yet the signal-to-noise ratio was at times as bad as on any “social media” site. In other words: the few good contributions were drowned out by a myriad of barely legible or superfluous “me too” posts from people too lazy to check whether a similar issue had already been discussed.

Speaking of the course material, one negative aspect was that no test cases for the programming assignments were provided, and the text files we were given were normally so large that a manual check was not feasible. This was where the forum shined, as some people posted test cases to work with. This was a good alternative to testing your algorithm on a 70 MB text file. I’m sure many would have appreciated it if the course staff had provided a number of test cases, as this would have ensured a smoother experience.

Further, some people managed to implement algorithms that were able to process the small test cases, but which choked on larger files. When I took an algorithms course at university, we were given two or three input files per assignment, and to pass it was sufficient if your algorithm could process the smallest input size. This was a fair approach, since you showed that you could implement the algorithm. In Algorithms: Design and Analysis, on the other hand, your only choice was to correctly process the often very large input file. Therefore, I think that people were unfairly punished for implementing an inefficient but in principle correct algorithm. They were no better off than somebody who didn’t manage to implement the algorithm at all. But shouldn’t the former group at least have had a chance to earn partial credit?

While there were some minor issues with this course, I nonetheless think that Algorithms: Design and Analysis, Part 1 was great. Especially for an autodidact, it's probably a better use of your time to go through this course than to work through any of the standard textbooks on your own. I highly recommend it, and I'm eagerly waiting for part 2.

Review: Introduction to Systematic Program Design – Part 1 — Coursera

Some years ago I tried reading How to Design Programs (HtDP) by Felleisen et al., which is an introductory computer science textbook. It focuses on a special “design recipe” for designing programs and teaches programming in a Lisp variant. The authors claim that students who start with HtDP and then proceed to study Java for one term tend to become better Java programmers than those who have been taught Java for two terms.

The premise of the book did sound tempting. Sadly, I found the pace way too slow to hold my attention. How to Design Programs was tedious to read because it was rather verbose, and, for my taste, there were too many comparatively insignificant exercises. I also didn't like that the authors went out of their way to avoid mathematical examples, which occasionally led to programs that were more complex than they otherwise would have been. My memory is a bit foggy, but I vaguely remember an exercise in which you had to process a list of images, even though the underlying principle could just as well have been illustrated through some basic mathematical operations on a list of integers.

I eventually put HtDP on the shelf and forgot about it, but I was reminded of it again when I learnt that Prof. Gregor Kiczales of the University of British Columbia was about to teach a course based on HtDP on Coursera, called Introduction to Systematic Program Design. I signed up because I still had an interest in learning the design recipe, and was curious whether it really helped turn you into a better programmer. The original announcement promised a course that would teach you the design recipe in a special “student language”, a variant of Lisp; once you had made it through that sequence, you could pick a different language, tackle the same material again, and see how the design recipe helped you become productive more quickly in a possibly unfamiliar language.

Coursera logo of How to Design Programs

Before I go on, I should probably explain what the design recipe is. In short, it's a systematic approach to designing programs. There are recipes for designing functions, data definitions, and even interactive programs that use Racket. Further recipes cover, for instance, function composition, backtracking search, and generative recursion. You shouldn't think of those recipes as straitjackets but as guides. They may be unnecessary for simple programs, but with more complex ones, they allow you to reason about the problem you want to solve in a more disciplined way.

Here is the recipe for designing functions, which I’ve taken from the course site:

;; Number -> Number
;; produces n times 2
(check-expect (double 0) (* 0 2))
(check-expect (double 1) (* 1 2))
(check-expect (double 3) (* 3 2))

;(define (double n) 0) ; this is the stub

;(define (double n)    ; this is the template
;  (... n))

(define (double n)
  (* n 2))

The first line describes the data types involved: to the left of the arrow are the input types, to the right the type the function returns. The next line is the purpose, which describes the function to the reader in English. The “check-expects” are tests for the function, but before you write those, you are supposed to write the stub. With the stub in place, you can run the tests and check whether they are well-formed.

The templates were recommended for beginners, and were dropped in the latter half of the course. As the name implies, they give you a template for the actual function you’re going to write. Now, in the given example, this may look a bit silly, but when you’re working on a more complex function that requires, for instance, generative recursion, the templates are quite useful. There is a bit more to the design recipe as it extends to data definitions and recommendations for function decomposition, but their general outline should now be clear.
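The recipe isn't tied to the student language, either. As a rough sketch of how the same steps might be transcribed in Python (my own translation, not course material), the signature and purpose become comments, and the check-expects become assertions:

```python
# Signature: Number -> Number  (kept as a comment, since Python won't check it)
# Purpose: produce n times 2

def double(n):
    return n * 2

# The analogue of check-expect; run against a stub like "return 0",
# these would fail, confirming that the tests themselves are well-formed.
assert double(0) == 0 * 2
assert double(1) == 1 * 2
assert double(3) == 3 * 2
```

The order is the same as in the recipe: types and purpose first, tests second, stub, template, and only then the actual function body.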

Introduction to Systematic Program Design was originally planned as a single course spanning 14 weeks, if I recall correctly. However, this was changed and the course was split into two parts. A starting date for part 2 has not been announced yet; Prof. Gregor Kiczales wrote that they might offer part 1 again, aimed at students at the University of British Columbia, before hosting part 2 on Coursera.

The course itself targets both beginners and experienced programmers. According to Prof. Kiczales, more experienced programmers who took his course said that it helped them improve their craft. I'm in the latter camp, and I subscribe to that notion. Introduction to Systematic Program Design took me a few dozen hours to complete, but I've probably already saved half of that amount of time on various programming projects thanks to following the design recipe at times.

The course takes eight weeks and consists of eight units, with eight weekly quizzes, two programming projects, and one final exam. To pass the course and earn a regular certificate of accomplishment you need a score of 50%; a certificate with distinction requires 80%. Homework counts for 40% of the grade, the projects for 30%, and the final exam for 30%. I'll say more about grading and the projects further down.

The first few weeks of the course are relatively slow-paced. If you've got no experience with programming, you should pay close attention and only proceed once you really understand all the concepts that are taught. The videos offer excellent support, as they not only cover the necessary theoretical foundations but also walk you through many examples. Experienced programmers can probably go through the first three weeks rather quickly; they cover data types, function composition, and how to model data. The course uses a Lisp variant as its teaching language, though, so if you come from an imperative background, you may find some of the assignments challenging. Each week, however, numerous practice exercises of varying difficulty were provided that enabled you to smoothly work your way up from the lecture material to the homework problems.

Week 4 introduced interactive programs. I was quite surprised how easy it was to animate some geometric shapes on screen. As the programs became more complex, I was pleased by the user-friendliness of the DrRacket environment, which was a real joy to use. Week 4 also brought a noticeable increase in difficulty. It should probably be pointed out that some of the practice exercises tended to be more challenging than the homework problems, which were assessed in the quizzes. One of the exercises had you code up a rolling lambda symbol, the DrRacket logo. It was supposed to change direction when you clicked the mouse or when it hit the left or right border of the screen. I'm not sure the following still image can convey it clearly, but I hope it'll do:

Rolling lambda, moving from right to left

This problem involved rotation in positive and negative directions, translation on the x-axis, collision detection, and user interaction through capturing mouse input. I noticed many complaints in the forum about that exercise. When I compared my solution with the one that was provided, I could see what the issue was: the sample solution was tersely written and needlessly complicated, and therefore difficult to follow.
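For readers unfamiliar with this style of interactive program, its heart is a pure function that maps the current state to the next one on every clock tick. The following Python sketch of the border-bouncing part uses my own names and numbers, not the course's:

```python
WIDTH = 300   # hypothetical screen width in pixels
SPEED = 5     # pixels travelled per clock tick

def next_state(x, direction):
    """Advance the lambda's x-position by one tick; flip direction at a border."""
    x += SPEED * direction
    if x <= 0 or x >= WIDTH:           # collision with the left or right border
        direction = -direction
        x = max(0, min(x, WIDTH))      # clamp the position back onto the screen
    return x, direction

def on_mouse(x, direction):
    """A mouse click simply reverses the rolling direction."""
    return x, -direction
```

The rotation angle would be updated analogously each tick; keeping every handler a small pure function of the state is exactly what the design recipe encourages here.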

However, I was pleasantly surprised that the people involved with the course reacted quickly to the, in my opinion, justified complaints and revised their sample solution. Some days later an additional practice problem was added that slightly simplified the original problem. I found this attitude most impressive. From university I am used to a “sink or swim” approach, a stance you can encounter in some online courses as well. For instance, in Martin Odersky's Functional Programming Principles in Scala there was an enormous discrepancy between the material in the lectures and the exercises, but the TAs saw no problem at all and only reiterated that the course was basically identical to the one at EPFL, and that they had no intention of making the material more accessible. What the TAs didn't tell you, or didn't want to draw attention to, was that at universities you don't just have lectures and hand-ins, but also tutorial sessions or “labs” and office hours. In one of the announcements Prof. Kiczales specifically made the point that the course, when taught at UBC, has a weekly three-hour lab session, which obviously can't be replicated online, and that as a result the online offering may be more challenging. However, instead of merely putting their course material online, UBC did their best to provide a supportive infrastructure without watering down the course.

That this course wasn't “dumbed down” became clear in the second half, which covered recursion and mutually referential types, eventually leading to generative recursion. Week 8 had, fittingly, some of the most challenging exercises, which asked you to write programs that generate various fractals. The final project applied the backtracking search introduced in the lectures on solving Sudoku puzzles; the variation was to write a solver for the 4-queens problem, with the option of generalizing the solution to the n-queens problem. By week 8 I saw the value of the “design recipe”. While it looks like overkill in the first few weeks, which is a point Prof. Kiczales acknowledged himself repeatedly, it does make your life easier if you stick to it.

Judging from the discussions in the forum, a vocal minority seemed either unprepared for a programming course or unwilling to set aside “object-oriented programming”, adopt the mindset of a beginner, and just learn the basics of Lisp. There were threads with literally hundreds of replies in which “seasoned software architects” tried to argue that Lisp was the wrong choice for the course and that it should be taught in Java or Python instead. This was quite tragic to see, since Lisp, and especially the “student language” we used, can be described very concisely. I wonder what would have happened if those people had just sat down with an open mind and tried to learn the few language rules of Lisp instead of posting about how “useless” and “weird” the language was.

Lastly, a word about attrition: I had the impression that the rate of attrition throughout the course was dramatic. While the forum seemed lively during the first few weeks, activity slowed down noticeably as the course went on. Yet I would assume that concepts like “closure” or the traversal of arbitrary-arity trees would generate more questions than, say, concatenating two strings or changing the color of a rectangle. I'd be interested to see some retention statistics. In the later weeks, the forums seemed populated almost exclusively by people with significant previous experience; one poster, for instance, shared personal anecdotes from meeting John McCarthy, the creator of Lisp.

There were quite a few discussions in which Prof. Kiczales contributed, too. One thread I particularly enjoyed was devoted to optimization strategies for the Sudoku solver. While I avoided the forums in the first half of the course, I saw them becoming increasingly valuable in the second half. However, if my perception is correct, then the other side of the coin is that courses like this only widen the “digital divide”. One of the promises of MOOCs was to bring higher education to people who would otherwise not have access to it, but I have the impression that it is mostly those who are already advantaged who make use of them. This is a more philosophical issue, but one that shouldn't be ignored. I noticed that Udacity has made great strides in offering preparatory courses that try to bridge the gap between high school and university-level courses, so there is hope that MOOCs may one day indeed educate the masses and not just further the education of an intellectually curious minority. Making college-level courses more accessible by providing preparatory and remedial courses is a challenge for Coursera rather than UBC, though, so this is in no way intended as a criticism of Introduction to Systematic Program Design.

After this diversion, let me now make some remarks about the grading. The homework problems are not automatically graded but evaluated through quizzes. Peer review might be a better solution. In their current form the homework problems may be difficult to grade automatically, since most of the exercises admitted more than one possible solution. I'm not sure a satisfying solution to this problem exists. For instance, MIT's Introduction to Computer Science and Programming on edX split up the more elaborate assignments and had students submit their solutions in multiple parts. This was a tedious process, not least because there were plenty of instances where the autograder expected one particular approach while rejecting a perfectly valid alternative.

Speaking of projects, the first project was to write a simple text editor similar to what is found on old mobile phones:

Screenshot of the editor

This is a neat little interactive application. The program rejects invalid input, the cursor can be freely moved around to change the insertion point, and the backspace key deletes the character to the left of the cursor if there is any.

The second project was, as I said above, a solver for the 4-queens problem based on a backtracking search. This was a very enjoyable exercise too, and one where it quickly paid off to follow the design recipe. There was a change in grading this last exercise, though, which I suspect had at least something to do with the rate of attrition being, or at least seeming to be, very high. The deadline for the project was pushed back by four weeks to give students ample time to finish the course material and the project. However, the final project was self-assessed, meaning that in theory you could give yourself the highest score without doing any work and thus get 15% of the course grade for free. I found this change of course policy unsatisfactory. On the other hand, most people probably enrolled in this course to learn something and not just to weasel their way to a PDF certificate.
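To give an idea of how compact such a backtracking search can be, here is an illustrative n-queens solver in Python. It is my own sketch, not the course's Racket solution, but it follows the same place-and-backtrack idea:

```python
def solve_queens(n):
    """Return one placement as a list of column indices, one queen per row,
    or None if no solution exists. Plain depth-first backtracking."""

    def safe(cols, col):
        # A new queen in the next row is safe if it shares no column
        # and no diagonal with any queen placed so far.
        row = len(cols)
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def place(cols):
        if len(cols) == n:          # all rows filled: a complete solution
            return cols
        for col in range(n):
            if safe(cols, col):
                result = place(cols + [col])
                if result is not None:
                    return result
        return None                 # dead end: backtrack to the caller

    return place([])
```

Calling solve_queens(4) yields the placement [1, 3, 0, 2], while boards of size 2 and 3, which have no solution, correctly come back with None.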

The final exam was worth 30%. It was designed as the equivalent of a 24-hour take-home exam and covered material drawn from all eight weeks of the course. The exam was not particularly challenging, but this may just be testament to the high quality of the teaching. Overall, Introduction to Systematic Program Design – Part 1 was a fantastic course, and I'm eagerly awaiting the second part.

Addendum:
After I wrote this review, Prof. Kiczales announced that Introduction to Systematic Program Design – Part 1 will be offered again on the 4th of September. Changes include an increase in the duration of the course from eight to ten weeks to even out the workload, the addition of a third project, and a change in the assessment of the homework problems.

No, Java is not a good first programming language

Recently someone reached this site through the search phrase “is java a good first language 2013”, so I felt tempted to answer this question from my own highly subjective point of view. The short answer: no, it's not. In fact, Java is one of the worst first languages you could pick. In the following I'll highlight two main issues. One, the complexity of Java, will make life more difficult for you immediately; the other, the danger of developing myopia regarding programming languages and their capabilities, has the potential to hurt you for many years to come, possibly your entire career.

For a beginner it can be exhilarating to make a computer bend to your will. However, the path to even a low level of proficiency is paved with copious amounts of frustration. Writing code demands a level of attention to detail and a degree of logical thinking above what the average person can muster. Gaping holes in your logic are par for the course in, say, mainstream media or normal conversations, but a computer is not so forgiving. Just learning to proceed in a clear, logical way is difficult enough for a novice. On top of that, Java and other “industry-strength” languages burden you with their complexity and therefore make it even harder to learn programming.

One concept you'll learn in CS101 is what computer scientists call “types” or “data types”. The declared type indicates how data is to be interpreted and determines how it can be manipulated. For instance, an integer with the value 42 is not the same as a string with the value “42”, even though they look the same when printed to the screen. You can multiply an integer by 2, but not a string. On the other hand, you can append another string to a string. All of this is probably not overly exciting. In Java, you have to declare variables with their type. In a more modern language like Python, which has been around for over two decades, you simply initialize the variable, since the language is dynamically typed. Here is an example in case this sounds confusing. [Note: Static typing isn't necessarily bad. However, the type system of Java doesn't buy you much, unlike the type inference in Standard ML and other functional languages. But if this is an objection you wanted to raise, then you aren't a beginner and shouldn't consider yourself part of the target audience of this article.]
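To make the difference concrete, here is a small Python sketch; the variable names are mine, and the same idea carries over to other dynamically typed languages:

```python
n = 42       # an integer; no type declaration required
s = "42"     # a string that merely looks the same when printed

doubled = n * 2     # 84: arithmetic works on integers
shouted = s + "!"   # "42!": strings support concatenation instead
repeated = s * 2    # "4242": "multiplying" a string repeats it
# n + "!" would raise a TypeError: it is the value's type, not a
# declaration, that determines which operations are allowed.
```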

Let’s say you’ve got a list of integers and want to double each entry of the list. If you want to use Java for that, you’ll end up writing a nice little piece of code like this:

package demo;

public class RangeDemo {

	public static void main(String[] args) {
		
		int[] myArray = { 1, 2, 3, 4, 5 };
		
		for (int i = 0; i < myArray.length; i++) {
			myArray[i] *= 2;
		}
		
		System.out.println(myArray);
	}
}

If you’ve never seen code before in your life, this might look quite intimidating. You’ve got all this scaffolding, and plenty of keywords which represent concepts you have not encountered yet. You know nothing about packages, or classes. The syntax is baroque. What are all those curly braces about? What are the semicolons good for?

In fact, for a beginner writing something as simple as those few lines of code can pose a significant challenge in Java. This is not because the operations that are involved are so difficult to grasp. It’s basic arithmetic, after all. Instead, the reason is that Java throws so much unnecessary complexity at you. By the way, I’ve hidden an easter egg in that code. Try to run the example for yourself, and you’ll see what I mean. Come on, try it! I’ll wait for you.

Strange, isn't it? The console gave you an output like “[I@4b71bbc9”. That is not the content of the array but Java's default string representation of it: the array's type followed by its hash code. To actually print the contents, you have to import the Arrays class and convert the array to a string representation:

package demo;

import java.util.Arrays;

public class RangeDemo {

	public static void main(String[] args) {
		
		int[] myArray = { 1, 2, 3, 4, 5 };
		
		for (int i = 0; i < myArray.length; i++) {
			myArray[i] *= 2;
		}
		
		System.out.println(Arrays.toString(myArray));
	}
}

As you see, this is a lot of code for something very simple. Sadly, this verbosity scales: while your first programs will be a few dozen lines where one or two should suffice, your larger programs will be several hundred lines instead of a few dozen, and in even more unfortunate cases several thousand instead of a few hundred. At this point, you'll have heard of “patterns”, which were developed as a consequence of the deficiencies of the language and are largely superfluous in more expressive languages. [Peter Norvig discusses this problem in greater detail in Design Patterns in Dynamic Languages, but it won't make much sense to a beginner. Maybe bookmark the link and come back after some months or a year.]

There is also the notion of “boilerplate code”, which is a problem in Java and other equally verbose programming languages. You might think that printing “hello world” should be one or two lines. Not in Java. Or you might think that a common operation like opening a file should be a line or two and consist of specifying the file name and the mode, i.e. whether the file is supposed to be only read or read and written to. In Java this isn’t quite so simple. Here is an example I’ve taken from Hackerrank:

import java.io.*;
import java.lang.*;

public class Solution
{
    public static void main( String[] args )
    {
        File fileName = new File( "myfile.txt" );
        if( !fileName.exists() )
        {
            System.out.println( "this file doesn't exist " );
            try
            {
                fileName.createNewFile();
                FileWriter fileWrite = new FileWriter( fileName );
                BufferedWriter bufferedWriter = new BufferedWriter( fileWrite );
                //bufferedWriter.write( "write something here " );
                bufferedWriter.close();
            } catch ( IOException e )
            {
                //catch exception
            }
        }
        else
        {
            //System.out.println( "this file exists " );
            try
            {
                byte[] buffer = new byte[ 100 ];
                FileInputStream inputStream  = new FileInputStream( fileName );
                int readLines = -1;
                while( ( readLines = inputStream.read( buffer ) ) != -1 )
                {
                    //System.out.println( new String( buffer ) );
                }
                inputStream.close();
            } catch ( IOException e )
            {
                //catch exception
            }
        }
    }
}

Yes, all this just to open and read a file! Also note that the use of the “BufferedWriter” in the first try block constitutes what I've once been told was a “pattern”. Now if you look at the try block in the else branch, you may wonder why there is no analogue like a “BufferedReader”. In this example, the file is read as an input stream. Last time I checked, the recommendation was to use “BufferedReader”, since it provides better performance when reading large files, but this is nothing you have to worry about in CS101. Let's not get lost in details, though. The point I'm trying to make is that code like the above is, well, utter madness.

By the way, this is the equivalent in Python:

#!/usr/bin/python

filename = "myfile.txt"
with open( filename ) as f:
    # file read can happen here
    # print "file exists"
    print f.readlines()

with open( filename, "w") as f:
    # print "file write happening here"
    f.write("write something here ")

Again, the example has been taken from Hackerrank. The pound sign (#) indicates a comment, which is not necessary for program execution. Thus, you're looking at a total of five relevant lines of source code. Yes, five.

Those were just two random examples, but they hopefully illustrate why choosing Java as a first language is a rather bad idea. I’m proficient to various degrees in about half a dozen languages. However, I don’t think I’ve ever found myself in a situation where I was writing code in, say, Python or Erlang and said to myself, “I wish I could do this in Java instead.” When I have to work in Java, though, I tend to feel reminded of the increased expressiveness of other programming languages.

Instead of Java, I'd recommend Python as a first language. Python code can be very clean and readable, and dynamic typing makes it more fun, too. I haven't yet given you the equivalent of the very first Java example, doubling the numbers in an array of integers. The Java code at the beginning of this article can be expressed in Python like this:

my_array = [1, 2, 3, 4, 5]
my_array = [ number * 2 for number in my_array ]
print my_array

If this is too verbose, you can shorten it a bit:

print [ number * 2 for number in [1, 2, 3, 4, 5] ]

This is what people mean when they say that one line of Python can replace ten lines of Java. As a beginner, you’ll probably find this code easy to understand, even if you’ve never seen code in your life.

Apart from all the unnecessary complexity Java throws at you, there is another aspect I'd like to highlight: programming language myopia. Many people find it very difficult to liberate themselves from the mental models and habits they acquired in their first programming course. Instead of adopting a beginner's mindset when picking up a different language and learning what it offers, they view everything from the one perspective they know, and happily write Java in Scala, Java in Python, or Java in any other language that lets them get away with it. They declare that the differences between programming languages are “just syntax” and never realize that they aren't making use of the more advanced features of the languages they use.

It's too easy to be complacent when your code seemingly works. Your overly verbose Java in Scala does the job, so why bother stretching yourself a little? This attitude is common, and you'll have no problem bumping into people who reject more powerful languages because of their “weird syntax”. For more on this, read what Paul Graham has to say about Blub programmers; it may come in handy at a nerdy cocktail party. Variations of what Paul Graham calls the Blub paradox can happen at all levels of competency. People think what they are doing is good enough and don't make an additional effort. Yet it is a missed opportunity to view a more expressive language through the lens of a less expressive one and never familiarize yourself with the possibilities more powerful languages offer.

For a possibly surprising example, let me tell you about one of the MOOCs I sampled earlier this year, MIT's Introduction to Computer Science and Programming. There are many conveniences in Python, for instance an immediate variable swap: swapping the contents of the variables a and b requires nothing more than writing “a, b = b, a”. In Java and many other programming languages, on the other hand, you would have to use a temporary variable and express the same in three lines. Of course, you can write the same three lines in Python and use the temporary variable you're used to. It works, but it's considerably less elegant. This is a relatively harmless example. More severe was that the introduction to object-oriented programming in that course was taught from a clearly Java-inspired perspective. In Java it is customary to write so-called getter and setter methods, sometimes called accessor and mutator methods. Yet this is superfluous in Python, where you have direct access to the attributes you want to read or modify. Now, if this happened in a CS course taught by MIT, it's probably safe to assume that a college freshman starting out with Java isn't immune to that kind of programming language myopia either, so do yourself a favor and learn one or two other programming languages first.
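Both conveniences look like this in Python; the class and attribute names are my own illustration:

```python
# Immediate swap, no temporary variable needed:
a, b = 1, 2
a, b = b, a          # a is now 2, b is now 1

# Direct attribute access, no getters or setters required:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)
p.x = 10             # mutate directly instead of p.setX(10)
total = p.x + p.y    # read directly instead of p.getX() + p.getY()
```

Should you later need computed attributes, Python lets you retrofit them without changing the call sites, which is why the Java-style accessor boilerplate buys you nothing here.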