Building a Budget PC

Introduction

I’ve been using a laptop as my main and only machine for many years now. My current laptop, a trusty aluminum MacBook, turned five recently, and is still going strong. Yet, I thought it would be nice to have a backup machine at hand, for the sake of convenience, and in case my laptop breaks down, which is only becoming more likely.

I ended up with a reasonably powerful machine that cost me around 240 GBP, which was much less than I initially anticipated. The last time I built a computer was in early 2000. The most expensive component back then was the NVidia GeForce 256 graphics card. Taking inflation into account, the PC I built was cheaper than the GeForce 256.

Usage Scenario

I’m not a PC gamer, and therefore I wasn’t concerned with building a gaming machine, even though I will point out a relatively cheap upgrade later on that would turn this build into a decent mid-range gaming machine. Instead, my intention was to build a Linux workstation. I spend most of my time in gedit and Emacs, depending on the programming language, and Firefox. Furthermore, the little browsing I do is rather text-heavy. Probably the most taxing thing I did on my laptop this year was running algorithms I had yet to optimize for Tim Roughgarden’s Design and Analysis of Algorithms online course.

My goal was to set up a Xubuntu environment. Xfce is lightweight and fast, and Ubuntu enjoys great support. I’m currently running Xubuntu 13.10 without any problems. The next version is Xubuntu 14.04 LTS, which I intend to upgrade to, and then keep using until the next LTS release is out.

Components

With the preliminaries out of the way, I’ll now discuss the parts I bought. Here is a picture of all the boxes:
All components

I sourced all components from Amazon.co.uk. They offer free delivery, and if there ever is anything wrong, their customer service is great to deal with. Unfortunately, I’ve occasionally had a less than stellar experience with some of the third-party sellers on Amazon, so I prefer buying directly from Amazon, even if some other seller offers the same product for a few pounds less. The headache is just not worth it.

These are all the components, with prices as of 2 December 2013:

– Case: Fractal Design Core 1000 Micro ATX (26 GBP)
– PSU: Corsair CXM 430W Modular (38 GBP)
– Mainboard: ASRock H61M-VG3 (34 GBP)
– CPU: Intel Celeron G1620 (2.7 GHz) (34 GBP)
– RAM: Kingston HyperX BLU 4GB 1600MHz DDR3 (32 GBP)
– SSD: Samsung 840 EVO 120GB 2.5-inch (74 GBP)

In total, the cost was slightly below 240 GBP, which strikes me as astoundingly cheap. Some of the components are maybe a bit extravagant for a budget PC; if you chose a basic PSU and a regular HDD instead, you could shave off 50 GBP or so, which would result in a highly affordable machine.

You might wonder why I didn’t buy a graphics card. Well, the Intel CPU comes with “Intel HD Graphics”, making a dedicated graphics card superfluous for non-gamers. The integrated Intel GPU is probably sufficient for casual gamers, too. I’d be very surprised if Minecraft, or some older games like Team Fortress, didn’t run perfectly fluidly. I won’t make a guess about the performance of Crysis, though. On a side note, the Celeron G1620 is able to power up to three screens, though not with the budget mainboard I got.

The CPU

When Intel released the first Celeron chips well over a decade ago, they had a bad reputation due to their poor performance. The current processors in this series, though, are basically slightly crippled i3 processors, and therefore reasonably powerful. That’s certainly true under Linux; it may be different under Windows or Mac OS X, if you’re thinking about building a ‘Hackintosh’. I was certain that even the weakest current Intel CPU would be good enough, based on my experience of running Linux for years on a machine with 256 MB of RAM. In fact, that machine was more responsive than any Apple or Windows machine I have ever used.

Granted, I don’t really need a lot of CPU power. I’d say that if you don’t play games or dabble in multimedia, then a more powerful CPU is simply a waste of money. Frankly, for any non-gamer, a ten-year-old machine with Linux and 1 GB of RAM would suffice. We’ve long reached the point where hardware is really fast enough for typical usage scenarios.

The fact that hardware is fast enough is one of the better-kept secrets in the technology press. Sure, new hardware is more power-efficient, but if you can resist the Microsoft marketing machine and go for Linux instead, you will find very little reason to upgrade your machine at all. I came across an interesting article on Tom’s Hardware: Is This Even Fair? Budget Ivy Bridge Takes On Core 2 Duo And Quad.

Here is an excerpt from p. 20 of that article:

Was anyone surprised by the potency of the little Celeron G1610? It beat out the Core 2 Duo E8400 by 8% in both games and applications, placing it directly between the Core 2 Duo E8500 and E8600! (…)

Equally impressive is how well the Pentium G2020 performs, outpacing the E8400 by 21%, despite a 100 MHz-lower clock rate. That’s a solid showing for a $50-65 chip, even if the lack of overclocking support means you’ll never get anything extra out of it.

In short, the cheapest CPUs Intel sells today can compete with high-end CPUs from some years ago.

The low price is not the only advantage of that CPU. Power consumption is also very low, so the stock fan Intel puts in the box is sufficient. Long gone are the days when your PC would heat the room.

The Rest

While the CPU might be underwhelming, the PSU probably seems a bit extravagant. The reason for picking that particular Corsair model was that it is modular, which means that you can remove all power supply cables you don’t need. This leads to the innards of the PC looking nice and tidy, as the following picture of the almost fully assembled PC shows:

The innards

The mainboard is once again a product from the very low end. It offers two USB 3.0 ports, but only supports SATA II. Further, there are only two RAM slots available. I figured that this should still be plenty. I’m using only one of them, which holds a 4 GB Kingston module. While Apple OS X users and Windows aficionados may now shake their heads in light of my supposed ignorance, it really is the case that a typical Linux installation is very resource-friendly. OS X seems to require over 2 GB of RAM just for itself, and once you open a few applications, you’re using four or five gigs. In comparison, after booting, memory consumption of my Linux setup, using Xubuntu, is at around 220 MB. We’re talking about a staggering difference of one order of magnitude.

Keeping a careful eye on the resource monitor, I’m pleased to say that I have yet to use more than 2 GB of RAM when doing work comparable to what I do on OS X. RAM consumption normally hovers around 1 GB. I wish OS X were that resource-friendly! By the way, a downside of the GPU integrated in the Intel CPU is that it reserves 256 MB of RAM for itself. I thought that this might lead to problems, but Xubuntu uses so little memory that the 4 GB, minus the fraction reserved for video, still seems rather excessive.

The most expensive component is the SSD. Sure, you could save quite a bit of money by choosing a regular HDD, or get a 2 TB drive for a comparable price. I have no use for that much space, though, and on the other hand I do value the advantages of an SSD. It just makes working on a computer so much more enjoyable. After you’ve gotten used to your system booting as well as shutting down in 15 seconds or less, and applications launching anywhere between instantly and a few seconds, it’s hard to go back to a regular HDD.

Performance

The last paragraph leads to my subjective experiences. After putting all the parts together, I installed Xubuntu from a USB stick (in case you noticed the absence of a DVD or BD drive). Within a few minutes I had a fully functional operating system, and after a few more minutes I had installed all the other programs I use. This was more convenient than installing OS X or Windows, and not just because it took a lot less time.

This PC won’t impress with its benchmark results, but let me just take a page out of the Apple PR handbook and state that while the benchmarks may not be great, the entire system nonetheless “feels” fast and highly responsive. Heck, I’m quite certain it feels much more responsive than a Mac Mini that costs three times as much. As an illustration, my screen takes a couple of seconds to warm up from standby, and when it’s ready, the first thing I see is the login screen.

Noise is pretty much a non-issue. My Xbox 360 Slim, by no means a loud machine, is noisier. The PC is humming along nicely. Amusingly enough, if the fans in my Apple laptop spin up, they drown out the noise of the PC. By the way, the case comes with a fan in the front, which could probably be disabled, at least in my setup.

Conclusion

So, there you have it: a quality PC for less than 240 GBP. For anyone wanting to use a computer not primarily as an entertainment device, the PC I built would be wholly sufficient. If you want to play games, you can throw in an AMD Radeon HD7750 for 70 GBP. This card is even available in a passive version with a heat sink instead of a fan. With this upgrade you’d have a decent mid-range gaming system, presumably blowing the PlayStation 3 or Xbox 360 out of the water. If cost is an issue, pick a cheaper hard drive and PSU, and you’ll get a gaming PC with a decent graphics card at a great price.

Noppoo.eu looks like a Scam to me

I was recently looking into buying a mechanical keyboard, since I was growing rather tired of Apple rubber dome products. After plenty of research, I ended up with a Filco Majestouch-2 and a Noppoo Choc Mini 84 on my shortlist. They both received good reviews from enthusiasts, but I couldn’t see myself spending 120 pounds sterling on a Filco, and the Noppoo Choc Mini 84 seemed a bit cramped. It was still quite pricey at 84 pounds on Amazon.co.uk, though. Then I learnt about the Noppoo Choc Mid 87, which has a somewhat more comfortable layout but still no numeric keypad.

Still, all things considered it seemed like a good compromise, since I was interested in a mechanical keyboard with blue Cherry switches. I don’t play games on my computer, so the many available mechanical keyboards with black “gaming switches”, which lack the actuation point of blue switches, were not attractive to me. So, the Noppoo Choc Mid 87 it would be. Looking around for a good offer, I eventually stumbled upon Noppoo.eu, which looked legit enough at first sight. Their pricing was very attractive, too: they sell all keyboards for EUR 89.99, which is about 75 pounds. They offer free shipping as well, and they claim to ship from within the European Union, so delivery should only have taken a few days.

I happily picked a Noppoo Choc Mid 87 with blue switches and placed my order. The shop seemed a bit amateurish, but that keyboard is only sold by third parties on Amazon.co.uk, and I’ve had some pretty bad experiences with those, too, so paying more for a possibly even worse experience wasn’t such an enticing option either. Fortunately, Noppoo.eu accepts payment via credit card, so I knew that I could always cancel the payment in case they tried screwing me over. Had they only offered PayPal, I would have avoided them altogether.

The red flags came in rapid succession after my order, though. First, the order confirmation didn’t come from an “@noppoo.eu” email address, but instead from “[email protected]”, and I was told that I could contact them at that address in case I had any questions about my order. So, Noppoo.eu was just a front for Mister Deal, whatever that is. Looking up the address, I found that they sell trinkets online:

So, it’s a trinket store with a side business selling high-quality keyboards?

Very shortly after I had placed my order, they took the money off my credit card, and the transaction was done by “Florent Pialot T/A”. The email referenced two URLs, one being noppoo.eu, the other kbtpure.com. This is what their website looks like:

A bad omen.

This didn’t look good at all. At that point I made a note in my calendar to review my VISA statements in two weeks. Then things took a rather surprising turn, and I found myself in the middle of the typical scam playbook:

Who knows, maybe he thinks I’d be interested in buying properties on the moon, too?

I placed my order at the beginning of November. Does this ring a bell already?

Okay, let me tell you what a reputable seller does: they keep the stock information on their website up to date, and they take your money not when you order but the moment they ship the product to you. Scammers on eBay do the following: they take your money, often via PayPal, and then tell you of “unexpected difficulties”, which are of course expected to be resolved within six weeks or two months. As it just so happens, after that much time has passed, you can’t dispute your PayPal transactions any more, which means that they now have your money, while you have no way to get it back. PayPal will then refer to their terms and conditions, if they respond to you at all. You could of course take legal action, but good luck chasing after some scammer in a foreign country.

I was not at all interested in getting scammed by some guy allegedly residing in Hong Kong, so I asked them to refund my money. Note that they never offered this option themselves. Instead, their proposition amounted to basically saying, “Hey, mind if we keep your money? We’ll send you that product in about two months, but then it’s Christmas and New Year’s, so it might take even longer. But don’t worry, dude, you can trust us!”

After about a week I got my money back, but they just had to flip me the bird by refunding one cent less than I had paid. When questioned, they claimed it “was not on purpose” and that I should contact their payment processor. This doesn’t make any sense, since the company handling payments only transfers the amount of money it was instructed to transfer.

Let me digress for a moment. Those things never happen “by accident”. At the very least, they are the result of structural stupidity, i.e. of someone manually refunding money instead of referencing a particular payment, and that is by design as well. Further, I certainly never had it happen that a company accidentally refunded me more money than I was owed. It was always the opposite. There are even “reputable” institutions like banks that unfortunately experience some problems with their systems and charge you fees for free services. If you question them on that, they throw their hands up in the air and claim it was all a computer mistake. This must be the most convenient excuse of all time; it’s the business equivalent of the high school student claiming that his dog ate his homework. To add insult to injury, you then have to jump through hoops to get your money back, and you may even encounter people playing dumb, hoping that you’ll give up. I guess some businesses really do hire morons for that task, though.

In the case of banks, it’s of course part of the business model to try to charge all, or at the very least a large subset of, their customers unjustified fees, hoping that not enough of them will dispute the charges. Hey, what’s EUR 15 once every three months, or EUR 20 once a year for a report you never ordered? Multiplied by their number of customers, it’s a ton of free money.

I have no patience with incompetent and dishonest businesses, and I certainly have no sympathy for companies that act in downright fraudulent ways. In that regard, it is quite amazing how cooperative third-party sellers become if you tell them that you’ll inform Amazon about their business practices. For instance, I had it happen more than once that I was sent a re-sealed used video game when I had ordered a new one, and there were scratches as well as dirt underneath the shrink-wrap.

In the case of Noppoo.eu, disputing the credit card charges would, if enough duped customers did the same, increase their cost of operation, since VISA would raise the rate charged for processing their payments and, given too many chargebacks, eventually just close their account. My bank, however, would have required a form for this in writing, sent via snail mail. Since it had been a while since my last blog post, I decided that my time was better spent voicing my discontent online and warning other customers about Noppoo.eu.

Please note that I am not making the claim that the people behind Noppoo.eu (it could well be just a one-man operation) are scammers. I could only have made that claim had they actually scammed me. However, think of the duck test: “If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.” To me, Noppoo.eu shows all the signs of a scam operation. Of course, I could just have waited for two months, and then I might have received my keyboard. Maybe I wouldn’t. If you’re about to take a shortcut through a dark alley and notice two suspicious characters, one of them wielding a baseball bat, you might not want to try your luck either.

Lastly, let me point out that I find it highly dubious that Noppoo.eu has not changed the stock status of any of the products on their website. If the information in their email is correct and their keyboards are sold out, then why on Earth didn’t they update their website to reflect this? I checked their site after I got their email, and, out of curiosity, for seven workdays in a row, until I decided that they deserved some public shaming for their actions. As I’m typing this, they still list all keyboard models, in all colors and with all switches, as “in stock”:

Order now if you feel like gambling.

Feels like a scam, looks like a scam, acts like a scam. Probably a scam. But feel free to risk 90 euros for absolute certainty.

Replicating a BSc in Computer Science through MOOCs

Introduction

Two years into the MOOC revolution, we’re now at a stage where the content of entire degree programs can be found online. Of course, going through MOOCs won’t get you any paper certificates, even though you can certainly pay for virtual proof of your identity if you are so inclined: Coursera offers a “Signature Track”, and EdX recently added a similar option to a number of courses. You can acquire the knowledge, however.

I’m not suggesting that MOOCs can be a replacement for the entire college experience, though. You’ll miss out on a few aspects, such as interacting with your professors, though at any large university you won’t interact much with your professors either. Further, there is no equivalent of writing a thesis or getting involved with research. Apart from those limitations, MOOCs are a dream come true for an autodidact.

I was wondering whether it was possible to follow a typical CS curriculum solely with (free) online courses, and it turned out to be largely possible. There is also a paid-for option through the University of London, but their Computing and Information Systems BSc, taught by Goldsmiths, comes with a sticker price of close to GBP 5,000. So, let’s say you don’t want to spend that much money, or much more on a traditional college degree in the UK or US. How far could you get?

Please note that I won’t bother with any “general education” requirements. In Europe, 3-year BSc programs are the norm, and you primarily focus on your major subject. In the US, though, you’ve got to endure roughly a year’s worth of courses that are entirely unrelated to the subject you’re actually studying. They are proclaimed to turn you into a well-rounded person, which is pretty much just marketing speak. What most commonly happens is that people pick courses they hope to get an easy “A” in, and most students don’t take them seriously anyway. A case in point is the Harvard cheating scandal of 2012. Oh, and I certainly would feel ripped off if I were forced to pay several thousand dollars for the doubtful privilege of taking a survey course on popular music. This isn’t an essay on the shady sides of US higher education, though, so I won’t dwell on that topic any longer.

Computer science historically grew out of either engineering or mathematics, and therefore there are differences between programs: some have a stronger focus on CS theory and mathematics, while others include many courses related to electrical engineering in their curriculum. I’ll focus on what I consider to be “core CS” courses, and I will highlight some specializations that are well-represented online. CS is not applied IT, though, so anything in the vein of Codecademy doesn’t quite qualify for this article. Some typical courses you won’t yet find on Coursera et al.; in those cases, I’ll point to alternative resources.

Introductory Programming and CS

There is a wealth of introductory programming courses. I think there are benefits in beginning with a functional programming language: there is much less artificial complexity, and it’s much easier to reason about programs without mutation. A very good first course would therefore be Introduction to Systematic Program Design – Part 1 by Gregor Kiczales, which uses the Lisp dialect Racket. Part II is planned. Those courses are based on the classic introductory CS textbook How to Design Programs by Felleisen et al. I don’t like that book due to its slow pacing; Prof. Kiczales’ course is much more digestible, though.

You’ll probably want to pick up a more mainstream language such as Python or Java. For Python, I would have recommended Udacity’s CS 101 a year ago. A problem with the Udacity platform, however, is that the forums are a wasteland. I don’t like some of their forced attempts at involving students, either. For instance, in CS 101 they asked students to submit exercise questions themselves, and, probably in a misguided attempt to be “inclusive”, they put exercises online that show very poor style, such as printing a value instead of returning it. Another student’s exercise, likewise part of a set of optional problems for CS 101, has a bug I reported in the forum over half a year ago. There has been no response to it at all. However, there are now about a dozen threads from students who are confused by the original problem because their (correct) solution does not take that particular bug into account.

Udacity also offers an introductory Programming in Java course, which I haven’t gone through. It’s probably okay. If you can motivate yourself, I’d recommend Stanford’s CS106A: Programming Methodology for self-study. Mehran Sahami is a fabulous lecturer, and his course is very thorough. I taught myself Java with it, and Allen Downey’s free textbook Think Java.

If Udacity’s CS 101 is not so great, then what’s the alternative if you want to learn Python? I think it’s Rice University’s An Introduction to Interactive Programming in Python. You’ll build increasingly complex games, and by the end of the class you’ll have written between 1,000 and 2,000 lines of code, which will get you a lot of practice. It’s an entertaining class, which I’d recommend even for guys with some years of experience, particularly if you’ve never built simple games.

Those courses will teach you enough programming skills. They should be followed by a course on data structures and algorithms, i.e. a traditional second course in CS. Unfortunately, the most thorough treatment of that topic is taught in Java: Algorithms (Part I, Part II) by Kevin Wayne and Robert Sedgewick. It would be preferable if an algorithms course were taught either in a language-agnostic way or in a more expressive language. Tim Roughgarden’s excellent Algorithms: Design and Analysis (Part 1, Part 2) is language-agnostic, and it also includes a review of important data structures. However, to get the most out of these courses you’ll need some mathematical maturity.

Mathematics

It seems that there is relatively little use for continuous mathematics within CS. Calculus is nonetheless commonly taught, which is arguably due to historic reasons. I don’t think you could make a good utilitarian argument for studying calculus within CS. However, you could easily argue that a certain degree of mathematical maturity makes you a lot smarter. You’ll certainly be less impressed by mainstream media information if you know a bit of statistics and probability.

If you didn’t take calculus in high school, then I’d recommend two fun introductory classes from Ohio State University: Calculus One and Calculus Two. Calculus One is an introductory course on single-variable calculus. The presentation of the material is a bit different from what you might be used to in mathematics, but I don’t mind the popularization of the material. For instance, I had no idea what a grain elevator was when I first encountered that term in calculus problems, so I appreciate that Jim Fowler and Bart Snapp use examples you can more easily relate to. Calculus Two covers series. If you like mathematics, you’ll probably enjoy it, but I view the material as mostly optional. You’ll come across telescoping series in some analyses of algorithms, though.

A much more important topic is discrete mathematics. Unfortunately, a MOOC on that topic is still missing. Thankfully, MIT OCW offers a great course in Mathematics for Computer Science, with fabulous lecture notes. There are several versions available. The Fall 2010 version has video lectures, while the Spring 2010 version comes with a very good set of lecture notes.

Lastly, there is linear algebra. I did have high hopes for Coding the Matrix: Linear Algebra through Computer Science Applications by Philip Klein. Unfortunately, this course was an utter disappointment. There were countless technical problems, and occasionally poorly worded questions in the assignments. It was not uncommon that I spent more time trying to please the autograder than actually solving the programming exercises. I also remember one particularly unpleasant Saturday afternoon where I was puzzling over autograder feedback, only to later learn that there was a problem with the grading script that rejected correct solutions. I hope that those issues will eventually get sorted out. An even bigger problem, though, was that the lectures weren’t very good. Philip Klein literally read the dense text on the slides to you, line by line. This was arguably the worst presentation of the roughly two dozen MOOCs I’ve either audited or completed. (I did earn a statement of accomplishment, in case you are wondering, but it was a real drag.)

The big draw of Coding the Matrix, computer science applications, turned out to be much less exciting in practice. You’d work on toy problems that illustrate, say, Hamming codes or image transformations, but the scale was so small that you walked away thoroughly unimpressed. Of course, we were using a slow interpreted language like Python, and working on small problems. I would have much preferred to be properly exposed to linear algebra first, and then shown realistic applications. Alternatively, one could have used highly performant libraries so that you could have solved moderately sized problems.

EdX has an upcoming course that seems to move more in that direction, though: Linear Algebra – Foundations to Frontiers. Then there is also Linear Algebra on MIT OCW, with fabulous lectures by Gilbert Strang. He is an enthusiastic lecturer, and he develops the material properly, which makes it easy to follow. A further bonus is that he has made the linear algebra textbook he wrote freely available as a PDF. However, going through a course on MIT OCW might require more motivation and determination, since there are no fixed deadlines, and no automatically graded exercises or exams.

If you’re fine with a less traditional way of teaching mathematics, you could also make use of Khan Academy, which covers calculus, statistics and probability, as well as linear algebra. There is currently very little discrete mathematics on offer, though.

Now we’ve got basic CS, programming, data structures and algorithms covered. There is only one course missing to complete the content of a typical CS minor.

Systems

To round off your basic CS education, one course in “systems” should be added to the mix. Such courses seem to be much more common in traditional CS programs that grew out of engineering departments, while they are either electives or wholly absent in other CS programs.

I’ll admit that I have a poor background in systems, with only one project-based university course under my belt, and that one I didn’t consider thorough enough. This is therefore an area I intend to explore further. The two most interesting options seem to be from the University of Washington and the University of Texas at Austin. I didn’t have enough spare time when the University of Washington course The Hardware/Software Interface was first offered, but I had a look at the materials and got a very good first impression. A related course is the upcoming EdX offering Embedded Systems – Shape The World.

Specializations

With the requirements of a CS minor out of the way, what would you want to go on to study? I’m quite amazed at the wealth of offerings. Of course, you won’t find any cutting-edge research seminars online. If you’re serious about CS research, then MOOCs are only a poor substitute, but most of what you’d find in a typical taught BSc or MSc program, as opposed to a research-based one, you can find online as well.

If you’re interested in knowledge that is more immediately useful, pick Jennifer Widom’s thorough Intro to Databases course. It covers theory as well as a lot of practice. For anyone who only wants to learn a bit of SQL it’s overkill, though.

If networks are what interests you, then you can start by taking the University of Washington’s Computer Networks, followed by Georgia Tech’s Software Defined Networking.

Are you interested in learning more about programming languages and their implementations? In this case, there is a wealth of resources available, too. Peter Van Roy of the University of Louvain, author of Concepts, Techniques, and Models of Computer Programming, is going to offer a course on programming paradigms on EdX. You could follow this up with Dan Grossman’s fabulous course on Programming Languages. That course focuses on the elements of programming languages. A good complement to Dan Grossman’s course is Wesley Weimer’s Programming Languages: Building a Web Browser, which gives you a good foundation for a course in compilers. Wesley Weimer is another one of my favorite lecturers, by the way.

Computer science legend Jeff Ullman is about to offer his course on automata theory for the second time on Coursera. His colleague Alex Aiken teaches one on Compilers. This is another one of the courses I have not taken yet. The syllabus looks quite intimidating, though. It has one of the highest estimates for weekly workload of any course on Coursera, 10 to 20 hours, and judging from feedback on the web, it’s pretty accurate.

A hot topic at the moment is Machine Learning. Coursera lists courses from Stanford, the University of Toronto, and the University of Washington. EdX offers the Caltech course Learning from Data, which has a reputation for being the most rigorous online course in ML.

Traditional AI seems to have taken a backseat compared to Machine Learning in recent years, but it nonetheless has a strong online representation. Udacity now hosts the seminal “AI Class”, Introduction to Artificial Intelligence, taught by Peter Norvig and Sebastian Thrun. Sebastian Thrun also teaches a more advanced class on self-driving cars: Artificial Intelligence for Robotics. Alternatively, you could take the UC Berkeley course Artificial Intelligence on EdX.

Conclusion

While I’ve given an overview of core CS courses and a few specializations, there is a lot more you could learn. There are courses on scientific computing, computer architecture, cryptography, computer graphics, computational investing, parallel programming, and even quantum computing. The list goes on and on. I do miss a course on operating systems, though.

I merely wanted to highlight some of the larger areas of computer science, and show that they are already well-represented online. The selection is largely based on my personal interests. Still, I think my presentation convincingly conveys that there is, a mere two years after the MOOC revolution started, an absolutely staggering number of MOOCs available. Just look at the number of CS courses, generously interpreted, that are currently listed on the websites of the major providers: EdX counts a total of 18, and so, incidentally, does Udacity. Coursera, on the other hand, lists 91. That is a total of 127 courses in CS, and this is not taking into account the many courses that are tangentially related to CS, like mathematics, or statistics and data analysis.

Poor Treatment of Recursion in Introductory Textbooks, and a Counterexample

In some computer science or software engineering degree programs, functional programming plays a very minor role. There may be an elective course, or there may just be a short segment on recursion in CS101. I went through an introductory sequence that was taught in Java, which doesn’t lend itself to recursion. Probably as a direct consequence of the limitations of Java, recursion was only briefly covered as some kind of novelty concept in programming, and some students were quick to dismiss recursion as “useless” or “academic”.

If recursion is only treated at a surface level, it does look like little more than a novelty concept of little use, with the added disadvantage that it “makes code more difficult to understand.” Sure, the unfamiliar is often difficult to grasp, but the elegance of recursive functions should be self-evident. If you think about it, loops are much less natural: instead of taking one step after another, you first measure how many steps you are going to take, then process the data, hoping that the number of steps you determined up front equals the number of steps the data actually requires. This is why there are off-by-one errors in iterative programs but not in functional ones.
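To make the contrast concrete, here is a toy comparison in Python (my own illustration, not an example from the textbooks discussed below): summing a list with manual index bookkeeping versus a recursive definition that simply follows the structure of the data.

def sum_iterative(xs):
    total = 0
    i = 0
    # The bound has to be exactly right: 'i <= len(xs)' or
    # 'len(xs) - 1' would be classic off-by-one mistakes.
    while i < len(xs):
        total = total + xs[i]
        i = i + 1
    return total

def sum_recursive(xs):
    # The definition mirrors the data: a list is either empty,
    # or a first element followed by the rest of the list.
    if not xs:
        return 0
    return xs[0] + sum_recursive(xs[1:])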

Many freshmen have a hard enough time understanding loops. Eventually they will view them as reliable, possibly because they don’t notice half their off-by-one errors, i.e. those that don’t throw an exception due to an out-of-bounds access. I think if you taught people functional programming first and only afterwards introduced them to Java or C++, they might view it as a step back. On the other hand, if you let them experiment with recursion in a language that is not suited for it, they learn that it’s easy to blow up the stack, which leads to the formation of beliefs like the two following, which were made by top posters on Stack Overflow. The context was that one guy posted about an issue he was having with Ruby, a multi-paradigm programming language that is hardly ideal if you want to program in a functional style because it can’t handle deep recursive calls well.

If you’re intentionally recursing 10,000 times, you’re doing it horribly wrong and abusing recursion.

In general, unbounded recursion is bad. Most recursion can be rewritten to be iterative.

The cause of the problem, though, was that Ruby lacks tail call elimination. Recursion itself is not the problem: a programming language with support for tail call elimination should handle recursion just fine. Both statements are therefore misguided. I’ve found that supposedly good programmers, and even university lecturers, not uncommonly hold similar views. Sometimes you really wonder what a CS or software engineering degree is really good for. By the way, recursion and iteration are equivalent, which means that you can theoretically rewrite any recursive function as an iterative one, and vice versa.
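To illustrate the point in Python, which, like Ruby, does not eliminate tail calls (a sketch of my own, unrelated to the Stack Overflow thread):

def countdown(n):
    # A tail call: nothing remains to be done after the recursive
    # call, so a language with tail call elimination could simply
    # reuse the current stack frame.
    if n == 0:
        return "done"
    return countdown(n - 1)

def countdown_iter(n):
    # The mechanical iterative rewrite of the same function.
    while n != 0:
        n = n - 1
    return "done"

On a stock CPython, countdown(10000) dies with “maximum recursion depth exceeded”, since the default limit is around 1,000 frames (see sys.getrecursionlimit()), while countdown_iter(10000) runs fine. The problem lies with the implementation, not with recursion.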

I’ve met quite a few people who view recursion as an obscure alternative to simple loops and “something you don’t do in production code”. I was therefore wondering where this view might come from, and skimmed some introductory textbooks. The index of Bjarne Stroustrup’s Programming: Principles and Practice Using C++ doesn’t mention recursion at all. Y. Daniel Liang’s Introduction to Java Programming has a short chapter on recursion, but you’ll only find very basic examples there. Python is getting more popular as an introductory language, so I had a look at John M. Zelle’s Python Programming: An Introduction to Computer Science. In that book, recursion is covered in the very last chapter, which doubles as an introduction to the analysis of algorithms.

Zelle uses well-worn examples like computing factorials and Fibonacci numbers or checking whether a string is an anagram. Towers of Hanoi is in there, too. Recursion is also mentioned when describing sorting algorithms like merge sort. The problem I see with this is that the simpler examples don’t necessarily show any advantage of recursion. Towers of Hanoi is too specialized, and it’s not quite clear how that knowledge will transfer to any program you might write yourself. Lastly, sorting algorithms are normally presented as pseudo-code, and recursion is just part of some of them. In other words, this approach doesn’t necessarily invite exploration.

What’s missing are more elaborate and somewhat realistic problems that show how recursion can lead to cleaner code, and code that is much easier to reason about. Typical textbook examples normally don’t provide this. I’m not sure that fractals make a good example either. Sure, if you’re going to write a fractal generator, you might revisit your textbook and check how recursion is used there. However, there isn’t much of an application for it.

In short, none of the introductory textbooks mentioned gives the impression that recursion may be a viable programming concept, and that it’s worth thinking about whether a certain problem might be better solved recursively, instead of making iteration the default choice. I was pleasantly surprised, though, to find a few neat problems in Allen B. Downey’s Think Python: How to Think Like a Computer Scientist. This book is quite astounding: it’s free, short, and content-wise it runs circles around your typical 1000+ page tome that is full of filler material.

Here is exercise 12.6, taken from version 2.0.10 (May 2013) of Think Python. Prof. Downey frequently updates that book, so the exercise may be modified, moved, or even removed from the book in the future. Therefore I’ll describe it fully.

The input is a list containing a good 100,000 words of the English language, and the desired output is the longest word that can be completely reduced. In this context, “reducing a word” means that a given word remains a valid word, i.e. a word contained in the word list, after removing a letter. You can remove a letter from anywhere in the word, but you cannot rearrange letters. The goal is to eventually end up with a one-letter word.

The example in the book is sprite, which leads to the following reduction:

sprite -> spite -> spit -> pit -> it -> i

So, what is the longest word in the English language that can be reduced in this manner? Feel free to grab the text file and try to figure it out yourself.

Prof. Downey gives the following hints:

Write a program to find all words that can be reduced in this way, and then find the longest one. This exercise is a little more challenging than most, so here are some suggestions:

1. You might want to write a function that takes a word and computes a list of all the words that can be formed by removing one letter. These are the “children” of the word.
2. Recursively, a word is reducible if any of its children are reducible. As a base case, you can consider the empty string reducible.
3. The wordlist I provided, words.txt, doesn’t contain single letter words. So you might want to add “I”, “a”, and the empty string.
4. To improve the performance of your program, you might want to memoize the words that are known to be reducible.

So, how would you do it? Give it a try, and you might find that recursion allows you to write the most important function in a very succinct and elegant way, even though a large part of the program is iterative. Thus, it shows that recursion can be a useful addition to your toolbox, which is an impression you probably won’t get from computing factorials.

In my solution, which is reproduced below, I print out a list of the 20 longest words, but of course you could also just return the first entry of the list.

def reduce_words():

    def create_dict():
        # Map every word in the list to an (initially empty)
        # list of children.
        fin = open('words.txt')
        res = {}

        for line in fin:
            word = line.strip()
            res[word] = []

        fin.close()

        # The word list contains no one-letter words, so add
        # "a", "i", and the empty string by hand (hint 3).
        for element in ["a", "i", ""]:
            res[element] = []

        return res

    def add_children(d):
        # A child is any valid word formed by deleting exactly
        # one letter (hint 1).
        for key in d.keys():
            children = []
            for i in range(len(key)):
                candidate = key[:i] + key[i+1:]
                if candidate in d and candidate not in children:
                    children.append(candidate)
            d[key] = children

        return d

    def recursive_trails(d):
        # Collect reduction trails as (length of original word,
        # trail) tuples.
        res = []

        def helper(key, result):
            if d[key] == []:
                return  # dead end: no valid children
            if key in ["a", "i", ""]:
                res.append((len(result[0]), result))
            else:
                # Follow the first child only; a full search
                # would try every child.
                for entry in d[key]:
                    return helper(entry, result + [entry])

        for key in d.keys():
            helper(key, [key])

        return res

    def top_n(results, n):
        # Sort by word length, longest first, and print the top n.
        results.sort(reverse=True)
        for i in range(n):
            print results[i]

    d = create_dict()
    add_children(d)
    trails = recursive_trails(d)
    top_n(trails, 20)

reduce_words()
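My solution gets away without Prof. Downey’s fourth hint. If you wanted to follow it, a small cache is all it takes; here is a minimal sketch of my own (not code from the book) of a memoized reducibility test, reusing the children dictionary built by add_children:

# Per hint 2, the empty string counts as reducible by definition.
known = {"": True}

def is_reducible(word, d):
    # 'd' maps each word to its children, as built by add_children.
    # Computed answers are cached in 'known' (hint 4), so every word
    # is examined at most once.
    if word not in known:
        known[word] = any(is_reducible(child, d) for child in d[word])
    return known[word]

With such a predicate, finding the answer is a single pass over the dictionary: filter the keys through is_reducible and keep the longest survivor.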

I thought that this was a really neat exercise that shows that you can use recursion on a fairly realistic problem, and end up with pretty elegant code. If only examples like this were included in standard introductory textbooks, or if instructors would adopt Prof. Downey’s book. But the textbook industry is shady (here is just one example), and the higher education industry is complicit, so it may require another two or three financial crises until we see substantial rates of adoption of free high-quality textbooks.

Metacritic and the Legitimacy of Video Game Journalism, Part II

In the last post I wrote about my perception that video game journalism is little more than advertising in disguise. In this post, I’ll discuss numbers that may support my opinion. I use data gathered by the website Metacritic. For those who are not familiar with it: Metacritic collects “expert” reviews of various entertainment media, including video games, normalizes their scores, and posts an average. Users are also able to rate games, which then leads to the games in their database having both an expert and a user average.

One objection you could voice is that user opinion may be unreliable, because it might primarily be people who either dislike a game and give bad reviews, or who love a game and like to see it having many recommendations. Sure, both camps may have their agenda. On top of that, there are PR people who pose as users and try to inflate scores, as happened with the recent Star Trek game. I don’t think any company would pay a PR agency to plant negative reviews, so while you might question the legitimacy of user scores, it’s probably better to view them as slightly inflated due to corporate meddling. “True” user scores for big-budget games would therefore probably be slightly lower than they are. I do agree that the metrics are hardly reliable, and that it’s questionable whether you could objectively grade a game on a ten-point scale anyway. People have different tastes, and some people are more forgiving of gameplay flaws than others. Still, as a voice of “the people”, it may be interesting to see how they differ from the alleged experts.

I’ve conducted my analysis with data that was current as of August 2, 2013. Games that were released since then are not considered. Further, there may have been slight changes in some of the user scores, simply due to ratings that were submitted to Metacritic in the meantime. I only focus on Xbox 360 games, but I don’t think there would be a fundamental difference when analyzing ratings for PlayStation 3 or PC games.

Metacritic holds records for 1505 Xbox 360 games. I started with some rather basic analysis; my aim was to uncover discrepancies between users and “experts”. The first few examples may not be overly interesting, but there is a climax towards the end, so please keep reading.

Here’s a top-level overview of the data:

  • user rating < expert rating: 760 games (50.5 %)
  • user rating > expert rating: 683 games (45.4 %)
  • user rating = expert rating: 63 games (4.1 %)

This doesn’t look like much of a discrepancy, though, so I adjusted my parameters to check how the picture changes if I define “equal” to be a difference of +/- 5 %:

  • user rating < expert rating: 494 games (32.8 %)
  • user rating > expert rating: 443 games (29.4 %)
  • user rating = expert rating: 568 games (37.8 %)

Since this wasn’t particularly revealing either, I then moved on to have a look at the average user and “expert” ratings, considering all 1505 games:

  • expert average: 69.0
  • user average: 68.3

The difference looks minuscule. However, the video game industry is focusing on “AAA games”, i.e. games that cost a lot of money to make, and whose advertising budget is even higher than their astronomical production costs. Since I had the impression that it is primarily this category of games that receives inflated ratings, I changed my script to only consider games the “experts” rated 90 or higher. This applies to just 33 games. Suddenly, the picture gets much more interesting:

  • expert average: 93.3
  • user average: 81.1

I’ll let you draw your own conclusions about that, considering that across all games the difference between user and expert averages is a mere 0.7 points. It seems that as soon as big advertising money comes into play, the “experts” identify greatness where users see just another okay game. A difference of 12.2 points is quite staggering.
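In case you want to replicate this: once the scores are scraped, the analysis itself is just bucketing, filtering, and averaging. A minimal sketch, assuming the data is available as a list of (expert, user) tuples on a common 100-point scale; my actual script may have read the “+/- 5 %” threshold slightly differently:

def classify(scores, tolerance=0.0):
    # Bucket games by comparing user and expert scores; two scores
    # within 'tolerance' percent of the expert score count as equal.
    lower = higher = equal = 0
    for (e, u) in scores:
        if abs(u - e) <= e * tolerance / 100.0:
            equal = equal + 1
        elif u < e:
            lower = lower + 1
        else:
            higher = higher + 1
    return lower, higher, equal

def averages(scores, min_expert=0):
    # Keep only the games whose expert score reaches the threshold,
    # then average both columns over the remaining games.
    subset = [(e, u) for (e, u) in scores if e >= min_expert]
    n = float(len(subset))
    return sum(e for (e, u) in subset) / n, sum(u for (e, u) in subset) / n

classify(scores) and classify(scores, 5) correspond to the two sets of counts above; averages(scores, 90) would give the 33-game subset.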

If I run the same script but limit the scope to games the “experts” rated 85 or higher, a total of 137 games, the discrepancy is still quite startling:

  • expert average: 89.3
  • user average: 79.5

A difference of about 10 points is not to be ignored either, though it’s not quite as damning as in the previous subset of games. Let’s now have a look at those fabulous 33 games that the “experts” thought were marvels of digital entertainment. The columns are to be read as follows:

  • column 1: user rating minus expert rating
  • column 2: expert rating
  • column 3: user rating
  • column 4: game title

To make it perfectly clear what the data means, look at this line: “-38 93 55 Mass Effect 3”. This means that the users gave the game a 55, the experts a 93, and the user score is thus 38 points lower than the expert score.

Here is the entire list:

-38 93 55 Mass Effect 3
-33 94 61 Call of Duty: Modern Warfare 2
-20 93 73 Street Fighter IV
-19 98 79 Grand Theft Auto IV
-18 94 76 Halo 3
-17 93 76 Gears of War 2
-15 91 76 Gears of War 3
-15 91 76 Super Street Fighter IV
-14 91 77 Halo: Reach
-13 92 79 Forza Motorsport 3
-12 93 81 Pac-Man Championship Edition DX
-12 93 81 Rock Band 3
-12 96 84 The Elder Scrolls V: Skyrim
-11 91 80 Forza Motorsport 4
-11 92 81 Guitar Hero II
-11 92 81 Rock Band
-11 94 83 Gears of War
-11 95 84 Portal 2
-10 94 84 Call of Duty 4: Modern Warfare
-9 92 83 Rock Band 2
-9 94 85 Batman: Arkham City
-8 92 84 The Walking Dead: A Telltale Games Series
-8 93 85 BioShock Infinite
-8 93 85 Fallout 3
-8 96 88 BioShock
-7 93 86 Braid
-7 94 87 The Elder Scrolls IV: Oblivion
-7 96 89 Mass Effect 2
-7 96 89 The Orange Box
-6 91 85 Far Cry 3
-6 95 89 Red Dead Redemption
-5 92 87 Batman: Arkham Asylum
-4 91 87 Mass Effect

It turned out that across the board users were less impressed than professional reviewers, oftentimes dramatically so. Maybe you have played some of those games and asked yourself why they don’t live up to their glowing reviews. For instance, I was thoroughly unimpressed by Halo 3, and Grand Theft Auto IV I view as a game with many issues, but with plenty of fun moments nonetheless. The “experts” gave the game a 98, the users a 79; the user score is closer to my perception of that game’s quality. The Gears of War games have a ham-fisted plot and okay gunplay, and I don’t think they deserved the praise they got. The third part had some really great moments, though, but also much more ham. By far the best third-person shooter I played on the Xbox 360 was Binary Domain. The critics gave it a 74, the users an 81.

If you wonder whether you should trust video game reviews, then you now have some good justification why you shouldn’t. Nowadays I ignore mainstream reviews and look for the consensus in niche communities, like on message boards for people who play STGs. This is much more helpful than the writeup of some “expert reviewer” who is forcing himself to type a few hundred words about a game he may not even have finished or hasn’t even understood the controls of. My personal consequence is that I buy fewer games, and if I want to play some “AAA title”, I wait until I can pick it up for cheap to minimize the amount of money I might potentially waste. If there are more people like me, then the money hatting strategy of the big publishers might not be as effective as the MBA types who dreamt it up think. The rise of indie games might suggest that people are getting fed up with “AAA gaming”.

For a more uplifting conclusion, one that supports the view that fan feedback can be very valuable for games that are under-appreciated by professional reviewers, let’s now look at the titles fans enjoyed but critics despised:

33 48 81 Otomedius Excellent
33 44 77 Warriors Orochi 2
31 52 83 Samurai Warriors 2
30 57 87 World of Outlaws: Sprint Cars
29 53 82 Tetris Splash
29 42 71 Venetica
27 49 76 Dynasty Warriors: Gundam 2
26 63 89 Triggerheart Exelica
26 53 79 Samurai Warriors 2 Empires
26 49 75 Puzzle Arcade
25 58 83 Vigilante 8: Arcade
25 53 78 Ecco the Dolphin
24 64 88 Project Sylpheed: Arc of Deception
24 61 85 Dynasty Warriors 5 Empires
24 60 84 Sonic Adventure 2
23 63 86 Serious Sam 3: BFE
23 62 85 WRC 3
23 54 77 Rise of Nightmares
22 53 75 Warriors Orochi

I recognize two STGs, Otomedius Excellent and Triggerheart Exelica, plenty of samurai games, and a couple of classic games, or reinterpretations of classic games. People seem to have a lot of fun with those, even though the “experts” don’t get them. As a cynic, I’m tempted to conclude from those data that it doesn’t make a lot of sense to let hats full of money go around when the game you’re about to release only appeals to a more “hardcore” audience and its sales potential is comparatively low. On the other hand, if you are marketing Halo, then you can be a bit more generous and discredit video game journalism even more. Just look at Geoff Keighley:

Geoff Keighley, a paragon of integrity in video games journalism

But, hey, the man’s got to eat, too! Who knows, maybe he’ll one day figure out how to play a video game despite having one hand buried in a bowl full of Doritos, and holding a Mountain Dew in the other.