By today, I believe most people in the IT world have at least heard about the Heartbleed bug. While the fix for this bug is not super difficult, the situation became messy because it affected a huge number of Internet services, and each of those services must fix its own servers independently. Naturally, there is no way to “force” them to complete it within our expected timeline. Some companies announced that they have fixed the bug. Some others claimed their services were never affected. However, there are still a large number of services that choose to keep quiet. Sure, there are some online tools to test whether a site is vulnerable or not. But those tools are mostly useful only for tech-savvy people. The majority of “ordinary” people out there will remain in the dark unless a company itself announces something.

Then what is the reason behind this massive security problem? Who is to blame?
In my personal opinion, the Heartbleed bug is a real example of the limitations of the Open Source model.

OpenSSL is open source software, and everyone is free to contribute to improve it. For each improvement, a coder will add or change some part of the code and submit it back. Ideally there is supposed to be some sort of checking mechanism, but we all know that people in the Open Source community are mostly coders. And coders enjoy coding more than they enjoy “testing”.

So the person who coded one particular change in OpenSSL made the change out of good will. His name is Dr. Robin Seggelmann. He’s not paid by anyone. He did it for free. He submitted his code hoping that it would improve a widely popular Open Source project. And his change got approved, so it became part of the next version of OpenSSL.

In his code, there’s a validation mistake. If you understand programming, you will recognize it as a very common mistake, one that almost every programmer has made at least a few times.

Who is responsible? It’s not productive to point fingers and search for someone to blame in this situation, because it’s an Open Source project. People use it of their own free will. And since it’s free, the users are not “customers”. Companies that chose a non-free SSL implementation instead of OpenSSL are not affected by the Heartbleed bug. Companies that use OpenSSL (especially some particular versions) know and understand the risks of using open source software, and they choose to use it because it’s free. If a company is handling highly sensitive data (like banking) and it chooses to use it, that company IS responsible for its own decision.

If there is something to blame, it would be the lack of testing. Despite being super critical, software testing has been severely under-appreciated in the world of software development for many years, and that is no secret.

When I first wrote this opinion on an online site, someone who loves Open Source felt deeply offended and attacked me. After a few posts of arguments, I concluded that in his ideal mind, anything that’s called “open source” is like nirvana. Pure and holy. And any attempt to point out a limitation or something to improve is a direct insult to his precious belief. In his unchangeable delusion, there is no way that something “open source” can be worse than anything “closed source”. So while I kept giving him arguments on how the mechanism of open source projects can be improved (focused on improvement), he kept replying with comparisons against closed source bugs (focused on denial, as if there is no need to improve).

I’m not against open source projects. In fact, I have contributed to a few projects myself in the past. So I can put myself in the position of a contributor rather than just a user who merely wants free software.

In my personal observation, most of these people (who hate closed source software very much and would argue with anyone who dares to touch their sacred open source) are people who have never actually contributed to any open source project. They simply like to have free software, and fool themselves with the idea that every open source project is improved by the brightest minds in the world, so it must be far better in everything. In reality, SOME open source projects indeed have great minds contributing to their improvement, but certainly not all. To make things worse, for some projects the ratio between contributors and users is far too imbalanced. OpenSSL is one clear example of an open source project where the number of contributors is way too small compared to its massive popularity. And I’m still talking about the total number of contributors here, not even touching on the actual number of ACTIVE testers.

To be fair, this particular trend is NOT limited to Open Source projects. Some companies adopting closed source development have also been exposed for releasing buggy products. Some bugs were minor and harmless, others were severe and serious. It is rather difficult to actually “judge” closed source products as a unified entity because every company has different procedures for testing its software, with different priorities and budgets. However, it is my general impression that most (not all) open source projects tend to make it easier for new code to get accepted (with the exception of a few good open source projects that I know). Ideally, every piece of new code is subject to rigorous phases of testing and might need to go back for a few cycles of fixes before it finally makes it into a release.

Sadly, many people in the software industry still assume that testing software is equal to running it, opening a few modules and doing a few actions. Software testing is a very complicated sub-branch of software development research. A proper set of software tests needs serious preparation, mathematical analysis and logical evaluation. To make it worse, it is well established that exhaustively testing every possible input against every module and feature is simply not feasible in terms of required time and resources, so some calculated prioritization needs to be done.

Now let’s go back to the technical discussion of the Heartbleed bug. The actual bug sits around a call to memcpy in OpenSSL’s heartbeat code. As the name suggests, memcpy copies data. The call passes three arguments: bp, pl and payload. Essentially they are the data destination, the data source, and the size of the data. Copying data from computer memory is not as simple as some might think, because at all times there will be some “bits” of data in any location of computer memory. So memcpy can’t detect where the real data at a particular address ends; it simply copies as many bytes as it is told to.

Ideally, if bp, pl and payload contain valid information, memcpy will copy a block of data from pl to bp without any problem. The tricky situation arises if payload is a wrong number (not the actual size of the data in pl). The payload value comes from a length field that the client writes into its own request, and the code did not check it against the amount of data actually received. Suppose the client sends almost no data (say, a single byte) but the payload value says it’s 32 KB. A reply buffer bp of that claimed size is allocated, and memcpy copies that many bytes starting at pl. Only the first byte is the client’s data. The rest is whatever happens to sit in the server’s memory after pl. The computer recycles memory addresses because it’s not feasible to assign a fixed address to every piece of data, so those neighboring bytes can be leftovers from other requests or other users.

After the memcpy execution, ideally bp contains only the data block from pl, which came from the client (the Internet user whose request triggered the memcpy). So it’s OK to pass the data back to the client, because it’s his own data. When payload is larger than the data the client really sent, bp also contains that neighboring server memory, and all of it gets passed back to the client. This means the client receives a block of data that it should NOT have. It can be garbage data, or it can be sensitive information including usernames or passwords. THIS is the Heartbleed bug.
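
To make the over-read concrete, here is a tiny simulation in C. It is only a sketch and not OpenSSL’s code: the names server_memory, ACTUAL_PAYLOAD_LEN and echo_trusting, as well as the memory contents, are made up for illustration. The server’s memory is modeled as one array in which the client’s 4-byte payload happens to sit right next to unrelated data, so the demo itself never reads outside its own array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simulated server memory (made-up layout): the client's payload "PING"
   is followed directly by data that belongs to someone else. */
static const unsigned char server_memory[] = "PINGuser=alice;pass=hunter2";

#define ACTUAL_PAYLOAD_LEN 4  /* bytes the client really sent ("PING") */

/* Mimics the flawed logic: copies as many bytes as the client CLAIMED,
   never comparing the claim against ACTUAL_PAYLOAD_LEN.
   Returns the number of bytes that would be sent back. */
size_t echo_trusting(unsigned char *bp, size_t claimed_len)
{
    const unsigned char *pl = server_memory;  /* start of client payload */

    /* Keeps the simulation inside the array so the demo is well defined;
       the real bug had no such limit. */
    assert(claimed_len <= sizeof(server_memory) - 1);

    memcpy(bp, pl, claimed_len);
    return claimed_len;
}
```

With an honest claim of 4, the reply holds only PING. With a dishonest claim of 27, the same call also hands back the made-up credentials stored next to the payload.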

From the perspective of software testing, every new module should ideally be tested using every possible input, trying to trigger every branch and loop inside the code and produce every possible output. Since there are far too many possibilities, a selection process must be done. For each new module, a tester should test it with a set of inputs in which each input represents a unique situation. For example, in the case of the code around memcpy, the tester must test it using an empty bp, a normal-sized bp and a large bp. Then he needs to test it using an empty pl, a normal-sized pl and a large pl. Then he must test the module using a zero payload (honest), a non-zero payload (honest), a zero payload (dishonest), a payload greater than the actual size (dishonest) and a payload smaller than the actual size (dishonest). Since there are three parameters, each having a few unique situations, the tester must create test cases that cover every possible combination of those situations.
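
As a rough illustration of what such a selection could look like, here is a small table-driven test in C. Everything in it is hypothetical: copy_payload is a stand-in for the code under test (not an OpenSSL function), and the sizes in the table are arbitrary picks for “empty”, “normal” and “large”. It covers only a handful of the combinations, not all of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical function under test: copies claimed_len bytes from pl to bp
   only if the claim fits both the real data and the destination.
   Returns 0 on success, -1 if the request is rejected. */
static int copy_payload(unsigned char *bp, size_t bp_size,
                        const unsigned char *pl, size_t actual_len,
                        size_t claimed_len)
{
    if (claimed_len > actual_len || claimed_len > bp_size)
        return -1;
    memcpy(bp, pl, claimed_len);
    return 0;
}

struct test_case {
    const char *name;
    size_t bp_size;      /* size of the destination buffer   */
    size_t actual_len;   /* bytes the client really sent      */
    size_t claimed_len;  /* bytes the client says it sent     */
    int expected;        /* 0 = accepted, -1 = rejected       */
};

/* Each row represents one unique situation. */
static const struct test_case cases[] = {
    {"zero payload, honest",        64,   0,   0,  0},
    {"non-zero payload, honest",    64,  16,  16,  0},
    {"large payload, honest",      256, 256, 256,  0},
    {"claim on empty payload",      64,   0,  32, -1},
    {"claim greater than actual",   64,  16,  32, -1},
    {"claim smaller than actual",   64,  16,   8,  0},
    {"destination too small",        8,  16,  16, -1},
};

/* Runs every case and returns how many gave an unexpected result. */
int count_failures(void)
{
    unsigned char src[256];
    unsigned char dst[256];
    size_t i;
    int failures = 0;

    memset(src, 'A', sizeof src);
    for (i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        const struct test_case *tc = &cases[i];
        int got;

        /* Sanity check on the table itself, not on the code under test. */
        assert(tc->bp_size <= sizeof dst && tc->actual_len <= sizeof src);

        got = copy_payload(dst, tc->bp_size, src,
                           tc->actual_len, tc->claimed_len);
        if (got != tc->expected) {
            printf("FAIL: %s\n", tc->name);
            failures++;
        }
    }
    return failures;
}
```

The point of the table is that the dishonest rows sit next to the honest ones. A test suite that only contains the honest rows would pass even against code that trusts the claimed length blindly.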

After dealing with the size of the data, the tester must test the module with data that contains special characters, test the module when the process is interrupted, test the module when the server is under heavy load, and run many more tests. Only after a new module survives all that rigorous testing can it make its way into a release. This is particularly important for modules related to security, and even more important in such a popular Open Source project as OpenSSL.

Clearly this test did not happen.

It could be because the tester(s) were careless. Or there was no test at all. Or there was a test, but it did not include all those possible combinations of unique situations.

The actual bug is the lack of input validation. If you have done some programming in the past, have you ever had someone else (a non-tech-savvy person) use your program in a clumsy way and crash it? Most software developers have had at least one or two experiences where their code failed to handle some type of “wrong” user input. It’s a very common mistake in programming. And it should be the FIRST type of mistake to get caught by good testers.
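
For illustration, here is a minimal sketch in C of the kind of validation that was missing. It assumes the heartbeat message layout described in RFC 6520 (1 byte of type, a 2-byte payload length, the payload, then at least 16 bytes of padding). The function name parse_heartbeat and the two constants are my own, so treat this as a sketch of the idea and not as the actual OpenSSL patch.

```c
#include <assert.h>
#include <stddef.h>

#define HB_HEADER_LEN  3   /* 1 byte type + 2 bytes claimed payload length */
#define HB_MIN_PADDING 16  /* minimum padding per RFC 6520 */

/* Validates a received heartbeat record of rec_len bytes.
   Returns 0 and sets *payload / *payload_len if the claimed length fits
   inside what was actually received, or -1 if the record must be dropped. */
int parse_heartbeat(const unsigned char *rec, size_t rec_len,
                    const unsigned char **payload, size_t *payload_len)
{
    size_t claimed;

    /* Caller contract (programmer error), not client input. */
    assert(rec != NULL && payload != NULL && payload_len != NULL);

    if (rec_len < HB_HEADER_LEN + HB_MIN_PADDING)
        return -1;                          /* too short to be a heartbeat */

    claimed = ((size_t)rec[1] << 8) | rec[2];   /* big-endian 16-bit field */

    if (HB_HEADER_LEN + claimed + HB_MIN_PADDING > rec_len)
        return -1;                          /* the length field lies: drop */

    *payload = rec + HB_HEADER_LEN;
    *payload_len = claimed;
    return 0;
}
```

The design choice here is that the length written by the client is never used for copying until it has been compared against the number of bytes the server really received. A record that fails the comparison is simply dropped, so there is nothing to echo back.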

Can we blame the testers? Not really, because these people are also contributors. They do their job for free, for everyone to enjoy. Based on my personal experience as a software engineer, most programmers enjoy the process of writing code much more than testing it.

What’s lacking here is a STANDARDIZED procedure for doing the tests, and a mutual agreement that those testing steps MUST be enforced at all times without exception. It is a common issue that when the one submitting the code is a senior (or well-known) contributor, some testers jump to the premature conclusion that the code must be of high quality. This is the big NO in software testing. Some Open Source projects might have such a nice standardized procedure. But clearly there are many others that don’t have it, or don’t enforce it hard enough.

Again, I’m not trying to bash open source projects here. This article simply argues that for open source projects to improve, serious steps need to be taken to ensure standardized software testing procedures are followed before accepting any new code. Why would I target this only at open source projects? Because it IS the model in which every one of us can get involved. For company-based products, there’s not much we can do, as they have their own procedures, for better or worse.

So if you love open source software and are reading this, stop taking offense at what I wrote. Start by checking yourself: what have you contributed to the world of open source projects (other than “just” using the products free of charge and bashing people who prefer to use commercial software)?

For everyone else, let’s learn from this massive security bug called Heartbleed and use the lesson wisely to improve any software development project near you. Open or closed source.