Cryptographic Block Ciphers in Functional Programming: A Case Study on Feldspar and AES

Last semester I had the opportunity to do an independent research project at Chalmers. Abstract and link to the repository as well as the final report are below.

Title: Cryptographic Block Ciphers in Functional Programming: A Case Study on Feldspar and AES

Supervisor: Michal Palka

Examiner: Mary Sheeran

Cryptographic block ciphers are defined as mathematical functions, and thus a prime candidate for implementation in functional programming languages. We implement the block cipher Rijndael, which was selected as the block cipher used for the Advanced Encryption Standard (AES), in Cryptol, Haskell, Feldspar, and C. We analyze how well Feldspar, a language originally designed for digital signal processing, is suited to this task. We highlight relative strengths and weaknesses of Feldspar for implementing cryptographic block ciphers, and suggest possible improvements to this language.



The Value Proposition of the Georgia Tech-Udacity Online Master of Science in Computer Science (OMS CS)

I used to be bullish about MOOCs, i.e. massive open online courses. That was until they stopped being open by erecting paywalls, and instead of striving towards replicating university-level education online instead watered down the material, chopped up single courses into entire course sequences, and put a highly disingenious marketing spiel on them. The Georgia Tech-Udacity Online Master of Science in Computer Science (OMS CS) seems to share many of the same downsides of MOOCs, while the upsides are largely imaginary. Let me elaborate.

First, I want to be clear that this article is based on a Central European background, where education works a bit differently than in the US or UK. Attending university is not nearly as common as in the US. On the other hand, we also lack thousands of insitutions of questionable reputation. This is sometimes flippantly summarized in the statement that the US may well have most of the 50 best universities in the world, but they also have a lot of the 5000 worst. In Europe, a decent Master’s degree is supposed to introduce you to research. At the very least, you will produce a lengthy research-oriented thesis, which may very well be a stepping stone to a future PhD. Coursework-only MSc degree programs, like the OMS CS, are rather unheard of.

In fact, just taking a certain number of taught courses and receiving a piece of paper in return seems like a rather dubious value proposition. After all, the value of a degree program is hardly in the sum of the courses you have taken. Particularly, if those courses only cover standard material, which you could easily have picked up yourself from a textbook, or MIT OCW. Instead, I see most of the value in getting at least some exposure to ongoing research, and maybe even having the chance of participating in research. On-campus students at Georgia Tech certainly have this possibility. However, students in the OMS CS do not. Thus, they are clearly second-rate citizens.

Obviously, research is constantly progressing. This has the consequence that beyond the undergraduate level courses are hardly static. If you taught a course on automata theory in 2016, you probably could have used the same material in 1996, and even in 2036 you or your successor would do fine with it. The reason is that this particular field is bascially a dead end for research, if I am not mistaken. In any case, there is certainly not a lot of funding in this area. To make it simpler, just take a basic course on data structures: linked lists, stacks, queues, trees and so on will be required basics for computer science majors until kingdom come.

On the other hand, at the graduate-level you are getting closer to the research frontier. In several of the courses in my CS MSc program we at the very least briefly touched upon ongoing research problems. In fact, it was not at all uncommon that professors would use at least one lecture to present their own research, which was, unsurprisingly, rather typical in courses that were advertised as joint MSc/Phd courses. Those normally included local and external guest lecturers as well. Furthermore, at least some recent papers are included in a typical graduate-level course. Sometimes you may even get access to papers that are under submission. As a consequene, the content of a graduate-level course is hardly static but will instead, albeit to varying degrees, depending on how far a particular field moves, change year after year.

In contrast, let us now look at what the OMS CS does. Frankly, it is rather off-putting, and, at least from my point of view, contradicts what graduate-level education should stand for. Instead of leading you closer to the research frontier, they instead record courses, conserve them, and broadcast them in frequent intervals, apparently with only little changes. Professors receive a one-off compensation for creating a course, and a fixed fee for each run. This alone should be reason enough to turn every intellectually curious student off. Besides, the Udacity courses I took were among the weaker MOOCs I have encountered, so the thought of having Udacity-style MOOCs as the main teaching vehicle makes me raise an eyebrow.

There are other downsides, the most important one being the lack of personal interaction. Sure, you may be able to post on a course forum, but this is worlds apart from meeting with a faculty member for a face-to-face dicussion. In fact, persoal interactions may be one of the most important aspects of a college education. No, I am not going to pretend that every student you will meet at the graduate level is highly motivated and has an IQ that is off the charts, or that every member of faculty will be a great teacher. However, you will interact with some ambitious and knowledgeable people, and you will learn a tremendous amount from them. This may be in the form of group or partner projects in regular courses, or in research projects, such as your thesis project, which will happen under the supervision of a faculty member, who may be a world-class expert in his field. A side effect is that you build your personal network. In that area, the OMS CS seems to offer very little.

During the OMS CS you will not get the chance to make many personal contacts. Consequently, you will most likely not be able to get decent references, and you won’t produce a thesis. Thus, the OMS CS will be of rather dubious value if you consider pursing a PhD at some point. Your coursework will not count for all that much in your application. Instead, demonstrated research experience is much more important, and good recommendations.

The saving grace for the Georgia Tech-Udacity Online Master of Science in Computer Science therefore seems to be that it allegedly boosts your credentials for work in the industry. Frankly, I am not even convinced that this is the case. If you want to work in a big company, then not having a degree will count heavily against you. However, the difference between an MSc and a BSc is rather negligible. In that regard, I am tempted to say that a coursework-only MSc ranks below a BSc that includes a thesis project. Further, an MSc with a clear research component surely ranks much higher than a coursework-only MSc.

Terminal Master’s programs are simply cash-cows for universities. If you are from Europe and consider it, I would wonder why you just don’t get an MSc at a decent local universtiy for free. In the US, though, the stigma is clearly that a terminal Master’s program is not leading towards research and simply repackages undergraduate courses at a steep price, even though that price tag is reduced for the OSM CS. Normally, the MSc is a consolation price if you decide to, for whatever reason, not continue with a PhD. You were still paid for it. On the other hand, a terminal MSc, particularly a coursework-only degree, signals that you had to pay for what someone else was able to get for free.

To wrap this up, let’s just look at the numbers. Of 8,000 applicants, 3,000 were admitted. Admitting such a high percentage of applicants does not look good. Sure, waive the “widening access” flag all you want. However, the problem remains that at the graduate level you need to have a higher bar, because just a few poor students will worsen the experience for the other students. For an extreme example of this, look at the discussion forums of a typical MOOC! There is so much noise that it’s next to impossible to have a decent discussion. However, if you look at the fact that Georgia Tech charges not only for each coures, but also for each semester you are enrolled, it makes perfect fiscal sense to admit even students who are not likely to graduate. In summary, the OMS CS is certaily great for Georgia Tech and Udacity, but I am not at all convinced that it is a smart choice for students. If you consider this program, then don’t be blinded by the Georgia Tech logo, but instead consider the financial cost of the program in relation to what you will gain. To me it seems like a rather bad deal.

Greed has ended the promised MOOC revolution


In this post, I will critically reflect on the MOOC phenomenon. I will briefly retell the recent history, discuss their shift towards monetization as well as the cynical behavior that emerged in this context, and conclude with my view on the value proposition of MOOC providers. I will start with my personal background, as it relates to MOOCs.

I have completed over 40 MOOCs within roughly the last three years. My main focus was on computer science and related subjects, but I also took a substantial number of courses in other disciplines. Those courses were a great complement to the software engineering Bachelor’s program I was studying during that time. Initially, I was a great supporter of MOOCs, despite quickly noticing their shortcomings, such as automatic grading and peer reviews. I even readily agreed to be a Community TA for one course at Coursera in early 2014, in order to give something back to the community. These days, though, I would no longer volunteer for any for-profit MOOC provider, after seeing what the major players Coursera, edX, and Udacity have turned into.

A brief history

Stanford offered three university courses online in 2011, including a very popular Artificial Intelligence class by Peter Norvig and Sebastian Thrun. The astounding public response was arguably fueled by the prospect of being able to get access to the same materials as Stanford students, sans personal interaction with teachers and teaching assistants, and earn a PDF certificate as proof of your attendance. Online courses had been available for a long time, though. The most prominent provider was MIT with their OpenCourseWare (MIT OCW) platform. The downside of it is that if you wanted to recreate the learning experience of an MIT OCW course, you would have to grade your own assignments and exams, which is of course a ludicrous proposition. Thus, as an autodidact, you were hardly better off compared to working through a textbook at your own pace. At least in computer science and mathematics there are a lot of high-quality textbooks available, which often include answers to at least some exercises.

MOOCs seemed to want to improve the lives of autodidacts. Indeed, a very large part of early adopters were people who had already completed college degrees. A lot of the earlier courses were furthermore reasonably rigorous. There were shortcomings with regards to the assignments, since those tended to rely a lot on multiple choice questions, which entails simplifying the assessment. More extensive projects were graded by your peers, which is a far cry from having a teaching assistant provide personal feedback, like it is the case at university. Interactive exercises are great to have, as you get immediate feedback. In short, if you went through a MOOC, you got a pretty decent experience, as long as you took courses that followed textbooks, provided extensive lecture notes, and are in subjects that do not require specialized equipment.

Going sour

In 2012 MOOCs seemed to be too good to be true. Sure, there were hardly perfect. The limitations seemed rather substantial. All drawbacks can be solved, and have been solved. Particularly edX seemed to focus on high-quality courses, aiming to get as close to the class room experience as possible. In some of the courses I took, even course books were made available for free, albeit in a cumbersome interface. If you find an edX course offered by MIT that has a counterpart on MIT OCW, you may find that the content on edX is in no way watered down. I probably should not make a hasty generalization. Thus, I’d like to modify the previous statement to say that the few courses I compared on both platforms seemed more or less equivalent.

I was particularly impressed by some of the innovations of edX. For instance, some MITx courses have the option of providing feedback from an MIT teaching assistant — for a substantial fee, of course. In some of the computer science courses, the assignments are fairly substantial, going beyond what I have seen at Coursera or Udacity. I have not encountered any complex projects, as they would need more guidance than current MOOC interfaces can provide. There are open challenges, like administering group projects, which are a common in CS education. I am not aware of any attempt of replicating this in a MOOC.

Of the three MOOC providers I am most familiar with, Coursera, Udacity, and edX, it seems that only edX is interested in reaching university standards, with the aforementioned limitations, of course. Udacity has been going the trade-school way, offering “nano degrees” in specialized subjects like mobile app development or web development. The Udacity courses I have taken were okay, but I would not at all be interested in paying for them, as they had too many drawbacks, and seemed very much focussed on applying knowledge instead of fundamental understanding.

Coursera has been undergoing a rather dramatic shift. At first, it seemed that they were aiming for academic excellence. I liked their focus on the sciences, and in particular their broad offering in computer science. It was unfortunate that many courses would only run once a year. This problem is currently being resolved, considering that a lot of courses on Coursera have been remodeled into self-paced versions. This was the only positive change I can see on that platform. Otherwise, I am deeply disappointed. While there used to be a fair number of standard-length courses that were clearly based on their campus counterparts, the current situation is much different. There were instances were single courses were broken up into several much shorter ones and repackaged as a “specialization”.

Of course innovation comes with a price tag at Coursera. I will talk about the value proposition of MOOCs a little later. For now, let me only state that Coursera seems to have taken some inspiration from the kind of sites that pop up on your screen if you don’t suppress ads in your web browser. You know, those with long and fancy sales letters and bullet points that reiterate what a great deal their dubious product is. Coursera’s Data Science specialization is currently available for $470, and consists of not one but a staggering ten courses. However, those ten courses do in no way correspond to ten proper university courses. In fact, they used to be one or two longer courses. This reminds me of bloggers who want to sell you a bundle of ebooks, where each “book” contains 20 to 30 pages with just a few dozen words.

Lastly, edX has not been inactive on the monetization front either. While it at first seemed there was a focus on standard-length courses and adding a price tag to them, you nowadays do see quite a few Coursera-style offerings as well, with “XSeries” programs consisting of courses that each are four weeks long. Again, like we have seen it with Coursera, splitting a 10 or 12 week course into three courses seems to have been done with an eye towards profit maximization and possibly the intention to deceive customers. Thus, keep in mind that most if not all course programs do in no way reflect the pacing of a typical university course.

The value proposition

An appealing aspect of MOOCs was that you were rewarded with a PDF certificate that indicated completion of the course. This was hardly comparable to getting a new entry on your university transcript after passing a proctored exam at a brick-and-mortar institution. Still, it provides some evidence that you have studied the course materials. Cheating happens at universities, too, after all.

I never saw any value in paying for a “verified” certificate. Apparently, this sentiment was shared by a high enough number of other MOOC users. Apart from the dubious value proposition, it was also an aesthetic one, as I considered the design of the non-free certificates at both Udacity and edX, with their tacky ribbons and huge logos, far less aesthetically pleasing than the much simpler free certificates. Demand for verified certificates apparently wasn’t what it should have been, considering that the free alternative was infinitely better value for money. Thus, the three major MOOC providers no longer offer free certificates. Udacity was the first entity that did so, in early 2014, because they wanted to “maximize the learning outcome for our students“. Their first instructor, Dave Evans, whose gave the Introduction to CS class, disagreed with that decision, for a while, apparently issued certificates himself, until Udacity told him to stop. Coursera followed suit in October 2015, but everything is fine because they “remain fully committed to [their] financial aid program“. Apparently, just stopping issuing free certificates did not have the desired effect on autodidacts, so nowadays you cannot even access assignments without paying their “small fee”. This makes their computer science courses now useless for anybody who does not want to part with his money. In December, edX apparently did not want to feel left out, so they had to “ensure that edX certificates carry the merit our learners deserve“, and axed the free honor certificates, which were previously praised as a great motivation on their homepage. Why can’t they all just openly say that they want to improve their bottom line, maybe not in those words, instead of making such bullshit statements?

Now that you have to pay if you want a certificate, one has to ask what the value proposition of MOOCs is. It is of course very cynical that the big providers still claim they offer “free” education, when, particularly in the case of Coursera, the freely available content is heavily restricted. Improving the value proposition by eliminating the free option, and still claiming that you offer “free” education is quite something, though. Further, edX amuses me by asking for donations and pestering me to pay for certificates. At brick-and-mortar universities they at least let you graduate before you get calls, emails, and letters from their fundraising office.

From my perspective, as a European who is used to high quality education for a very low cost or entirely for free, the suggestion of paying for a lesser product is dubious at best. Also, I don’t look particularly favorably at the deceptions by Coursera and edX who split up regular courses into several smaller ones, which primarily serves to mislead the customer. Sock puppets would of course respond that caveat emptor applies. Scammers use the same defense. So, let’s do some basic math! As of today, I have completed 43 MOOCs. Some of those courses have been, in the meantime, split into several smaller ones. So, let’s say my 43 courses are equivalent to 50 “courses” as they are currently offered. Those courses are by no means equivalent to a college degree, but I would not be surprised if you could get a BA in an information technology degree at a second-tier university for less effort. Coursera and edX have several price levels, ranging from $49 at the low end, all the way to $99 and $149 offerings for courses. Let’s be generous and set an average price of $89 for a verified certificate. This is about $4,500 in total — for a bunch of PDFs! To put this sum into perspective: for a comparable amount of money you can cover the cost of a high-quality and respected distance-learning degree from the University of London.

In the end, greed won. Of course you can turn this around and call me “greedy” for taking so many MOOCs. Well, here is the kicker: any non-technical “certificate” has very little value to begin with, apart from possibly indicating that you have an interest in a particular subject. Thus, their value is approximately zero. The value of an arts degree from a brand-name university comes primarily from the brand-name of the university, and to a much lesser degree from what they teach you. Further, the value of a technical education, even at a no-name institution, is primarily due to the knowledge you acquire. With a technical degree from a second-rate institution you will have a hard time getting into Wall Street or this year’s hot Internet companies, but you will find employment relatively easily, unlike people with non-technical background who have far worse career prospects. This means that a certificate of a MOOC even for marketable skills is relatively useless since the skill is more important than the certificate.

The fact that the big three MOOC providers abolished free certificates seems to clearly indicate that they were unable to differentiate free and paid-for certificates. Thus, by reducing the options of their users, they hope to increase their revenue. As I wrote before, the marketing and presentation is becoming misleading if not shady, for instance when repackaging one course as many, suggesting greater value than is delivered. The new target customers is far from the old one. No longer are autodidacts an interesting clientele. Instead, the ill-informed are the new target who may salivate at the thought of getting Yale or Harvard credential for a relatively modest price, possibly not realizing that they are going to acquire a useless product.

A Simpler Solution to the Dining Philosophers Problem


When I was first introduced to the Dining Philosophers problem, my initial reaction was that it is a silly problem with a touch of obscurantism. As I read, it was originally created by Dijkstra as an exam problem in the 1960s. Some teachers do indeed have a predilection for whimsical problem formulations, which more often than not, though, are distracting more than they are illuminating. Later on the Dining Philosophers problem was promoted to a canonical problem in computer science, arguably after Tony Hoare discussed it in his book Communicating Sequential Processes. Now it is a standard problem in any course on Operating Systems or Concurrent Programming.

In this article, I will first highlight the issue that the original problem is poorly formulated. Second, I will propose a solution that is simpler than the three typical solutions that are commonly taught, and similar in spirit to a fourth but non-canonical solution.

First of all, what bugs me the most is that this problem appears to be much deeper than it actually is, by giving it this utterly pretentious name, Dining Philosophers. It is as follows: there are five philosophers, five forks, and an infinite amount of spaghetti. For some odd reason, philosophers need two forks to eat. They cannot eat and think at the same time, and they all want to both eat and think. Since nobody eats with two forks, I’ll replace the forks with chopsticks, and the spaghetti with rice. I will still refer to them as philosophers, to retain the pretentious nature of this problem. “Guys with chopsticks” just does not have the same ring as “Dining Philosophers”.

Let me now run through the typical issues discussed in the context of the Dining Philosophers problem:

1) Deadlock: if every philosopher grabs just one chopstick, they all wait forever until a second chopstick becomes available.

2) Livelock: if every philosopher can only hold on to chopsticks only a given amount of time, and they manage to perfectly synchronize this, you end up with the situation that they all pick up and release one chopstick at the same time, over and over.

Canonical Solutions

The three canonical solutions are:

1) Resource hierarchy (Dijkstra): Number the chopsticks from 1 to 5; introduce the convention that each philosopher has to pick up a chopstick with a lower number first. This prevents deadlock because, after allocating four chopsticks to four philosophers, the last philosopher cannot pick up chopstick number 5, since 5 is the highest number.

2) Create an arbitrator: implemented as a mutex, an arbitrator gives permission to philosophers to pick up two chopsticks at a time.

3) Introduce the notion of cleanliness (Chandy/Misra): label philosophers with an id ranging from 1 to n. For each pair of philosophers fighting over one particular chopstick, give it to the philosopher with the lower id. Each chopstick can be dirty or clean, and initially they are all dirty. When a philosopher wants to eat, he needs to request chopsticks. Any philosopher who receives a request keeps a chopstick if it is clean, but if it is dirty, he relinquishes it. Before handing over a chopstick, he cleans it. After a philosopher is done eating, all chopsticks are marked as dirty.

Those are the three canonical solutions to the Dining Philosophers problem, but I came across a fourth one:

4) Remove one chair (Stallings): Take n philosophers and n chopsticks. Now, remove one chair so that only n-1 philosophers can take a seat.

A different, and simple, solution

I thought about the dining philosophers problem a bit, and came up with a solution that retains the original conventions. In that regard, it is unlike the fourth approach, even though it is similar in spirit, as it equally aims for a rather straightforward solution. Obviously, chopsticks can only be used in pairs. Thus, the solution is to simply let philosophers only take two chopsticks at a time — and we are done! Concretely, put all n philosophers in a queue P, and all n chopsticks in a separate queue C (a stack would work just as well). While C contains at least two chopsticks, the philosopher at the top of P takes two chopsticks from C and eats, and so on.

It should be obvious why this approach works. If not: The philosopher at the top of P takes two chopsticks from C, and starts eating. When he is done (feel free to have him eat until he receives a request to release his chopsticks; it does not matter), he adds both chopsticks to C again. Continue until all chopsticks have been distributed. If there are no two chopsticks available, the remaining philosophers wait for their turn. Thus, the first philosopher takes two chopsticks, and eats. Then the second philosopher takes the next two chopsticks, and eats. The third philosopher would only get one chopstick, for the time being, so he waits. Eventually, though, the first philosopher will be done eating, and subsequently put his two chopsticks back to C. Because you always take two chopsticks from C, and philosophers only eat for a finite amount of time, there is neither deadlock nor starvation.

[EDIT: On Reddit is was pointed out that the stack is the arbitrator (waiter) in disguise. This is correct. There was also an issue that the chopsticks are treated as commodities, while they should be distinguishable. For instance, philosopher 1 can only take chopsticks 1 and 2, philosopher 2 only chopstick 2 and 3, and so on. To salvage my approach, which now makes it even clearer that C is just the waiter in disguise, have it store pairs of neighboring chopsticks. A more complicated bookkeeping mechanism is needed for the chopsticks, though. Anyway, using that approach, philosopher 1 takes the pair (1, 2), and leaves the waiting queue. Philosopher 2 would like to take pair (2, 3), but chopstick 2 is taken, so he has to move to the end of the waiting queue. Philosopher 3 can take chopsticks (3, 4), though, and so on. Once philosopher 1 is done, he returns the pair of chopsticks, and moves to the back of the waiting queue, etc.]

Lastly, note that there is no need to have exactly five philosophers. Presumably, this number is only used due to the historical accident of Dijkstra using that particular number. You need at least two. Some of the solutions we have seen have an administrative overhead, so it will only get more cumbersome the more philosophers we add to the table. However, by using a queue, you could feed an arbitrarily large number of philosophers without any overhead. Also note that whenever there is an odd number of chopsticks, there is one superfluous chopstick, since you cannot properly eat rice with just one chopstick. Thus, there are at least three aspects in the original problem statement that obscure the problem: the number 5, the strange convention of using two forks, and the fact that the fifth fork is superfluous. Lastly, calling the diners philosophers is potentially distracting.

Retrospective: Software Engineering Summer Internship (2015) at Jeppesen Systems AB in Gothenburg


Jeppesen is a fully owned subsidiary of Boeing, and part of their Digital Aviation business. Concretely, they provide navigational information, crew and fleet management solutions, and offer an industry-leading optimization product. The Gothenburg office started originally as Carmen Systems, which was spun off Chalmers University of Technology. Ties between Chalmers and Jeppesen are still strong. The Gothenburg office of Jeppesen employs around 350 people, of which around 300 are technical staff. Further, there are around 60 contractors in the building.

Jeppesen was not an entirely new entity to me. I had been mentored by one of their managers for about a year, though whom I met several other Jeppesen employees, including a veteran software architect with a very strong technical background, and an experienced software developer. Further, I was one of the winners of a coding competition they held this spring, and one of five students who were invited to what turned out to be an intimate recruiting event, where we were outnumbered by Jeppesen managers and employees. Several managers represented their division, the challenges they are working on, and, of course, the kind of skills they are looking for in future employees. Alas, back then they did not have any openings, but we were told that a number of full-time positions would open up in autumn, and possibly some summer internships as well. Both turned out to be the case.

The interview

I applied for a data processing project with a parallel programming component, which I will detail further below. But before I got to do any work, I had to make it through the interview. Job interviews are often cringe-worthy, particularly the “your greatest weakness”-kind. The interview experience at Jeppesen was distinctly positive, though. The hiring manager briefly came in to shake hands and exchange some niceties, but otherwise, the interview was conducted by two senior engineers. They described the project, how they work, and the technologies they use.

What I greatly appreciated was that they were looking for competence instead of buzzwords. We even talked a bit about the necessary computer science background for their project. It was helpful that I had heard of recursive descent parsing, had written a parser for a subset of C++, or implemented a compiler for a C-like language. Normally, a coding test is also part of the interview process, but I presume this was skipped due to my performance on their recent coding challenge. Still, they asked me a few ‘design’ questions, like how I would approach this or that problem, or pros and cons of concrete approaches to given problems. Thankfully, it was more about how to break down a complex task into manageable subproblems, instead of the dreaded UML and design patterns interview.

The project

Of course I signed and NDA can therefore not be too specific, so I am only going to describe the problem in general terms. Let’s say your software runs on dozens or possibly even hundreds of machines at a customer site. Each copy of the program produces partly unstructured log files, recording all kinds of things. In order to extract useful information from those log files, you have to run a parser that understands the structure of the files, and extracts relevant information, which may be processed further. To make this even more fun, let the client machines run around the clock, and update log files continually. This sounds fairly complicated already. However, keep in mind that you have more than one customer, many more than one parser, and currently you only execute one parser after another in a sequential order via a cron job once a day, for historical reasons, maybe because the programs predate the many-core CPU era.

The goal of the project I was given was to write a framework that parses all log files — think in orders of magnitudes of tens of thousands of files — in pseudo-real time, so that you can extract more useful information from the existing data. For instance, if you take a metric like ‘active instances’ once a day, you’ll learn a lot less than if you are able to update this metric every few minutes.

Concretely, the framework I developed unified the existing parsers for several Jeppesen products. Understanding and modifying existing parsers required a little bit of background in programming languages. This was pretty fun, but the most interesting aspect was trying to find ways to effectively parallelize the execution of the parsers. The hardware at our disposal was pretty decent. Thus, running a single-threaded program on a many-core CPU is a bit of a waste, to put it mildly. My approach was to avoid all shared state. Once this problem was solved, the actual parallelization was easy to achieve. The machine I was mostly working on was an Intel Xeon with 16 cores and over 100 GB RAM, which was far from being the most powerful machine I had access to. Even on that one, we easily hit our performance target.

In my last two weeks, I had plenty of time to ensure the documentation was comprehensive. At the same time, my main supervisor put the project successfully into production at two customer sites, both major European airlines. At both sites we encountered minimal issues that were easily resolved. In one case, the volume was much greater than the fairly large test data directory I had been working with, so I needed to use a more efficient data structure, and make use of memoization at one point in order to resolve a bottleneck. At the other customer site, a slightly different naming convention for some files was used, which was an even easier fix. My last update was that this product has been successfully rolled out to two more customers.

The software development process

While Jeppesen is officially committed to “Agile”, they nonetheless have a strong engineering culture, which entails that they make use of software processes not for their own sake but when there is a clear benefit. The first few days on the job, I was expected to do the daily “stand up” song and dance, and report on everything I’ve done, what I intended to do that day, and give various estimates about the next few days. Personally, I found this more annoying than helpful. After I showed a proof of concept of this product after a few days, instead of the planned two weeks, my supervisors concluded that I can apparently work well on my own, and scrapped daily meetings. Instead, we had the occasional demo session, or just casually discussed progress during company breakfast.

I did like that, after proving myself, I was given relatively free reign with regards to the development, of course while still conforming to the overall vision of the product, which I received at the beginning in the form of a succinct document. The eventual software development process was somewhat reminiscent of Iterative and incremental development.

The work environemnt

Based on my experience at a previous employer, and from various company visits, I would say that Jeppesen has a clearly above average office environment. It’s not uncommon that three or four employees share their own, reasonably spacious office. However, there are also several open-plan office areas, the kind that is claimed to foster cooperation and communication, even though they are “damaging to the workers’ attention spans, productivity, creative thinking, and satisfaction”. Well, Joe Programmer, no matter where on the world he works, sadly does not command enough status to get his own office. Thus, companies are, quite literally, squandering money by not giving people access to (genuinely) quiet working environments.

Still, thanks to artificial dividers, plants, and a reasonable amount of space, the open-space areas at Jeppesen are quite okay. I was sitting in one of the larger ones in the building, among around 15 employees, all fairly well spread-out. The main office hours are a bit busy, with infrequent distractions. However, if you come in very early or stay late, you can enjoy a quiet working environment and get a lot done. During the busier hours of the day, you may notice decreased productivity, compared to, say, intently working on something in the privacy of your home.

A few perks are offered, too. Every morning, there is free basic breakfast, which is also a great opportunity to talk to people outside of your department. Further, bowls of fruit are placed all over the building. This being Sweden, there is also a strong tradition of having ‘fika’, i.e. a short break for consuming coffee and pastries. Noteworthy is that a masseuse visits the office in regular intervals, for which you can sign up at reception, and get the cost deducted from your pay. Regular employees are even entitled to some free money (friskvård) for that purpose, or other health-related expenditures.

Working hours are quite flexible. You are expected to put in 40 hours per week. As a rank-and-file employee you have to be present during core business hours (9:00 to 15:00), but you do have some flexibility otherwise. For instance, taking the occasional long lunch break, and staying longer some other days is certainly something you can discuss with your manager, just like taking half a day off and putting in a few more hours on some other days to make up for it. Overall, they try to achieve a good work/life balance.


This was a very enjoyable internship, and a summer well-spent. The end result was that I wrote, relatively independently, a medium-sized application weighing in at thousands of lines of code, all tested and well-documented. It was a good mixture of intellectual challenge and more mundane practical aspects. In the beginning, intellectual challenges were more dominant, like the entire design of the product, or figuring out effective parallelization. Later on, the focus was much more on polishing the code base, and eventually deploying it. The latter was not quite as mentally stimulating, but on the other hand it was immensely gratifying to see the project hitting or exceeding all performance targets, as well as its successful deployment.