Challenging the “Fix Twice” Principle

21 September 2022

By Paul Jansen, CEO of TIOBE Software

More than 15 years ago, Joel Spolsky wrote an excellent blog post on customer service. I have reread it several times over the years because most of his “seven steps to remarkable customer service” are still valid today. It serves as a reminder to me to make sure that our company's customer service is still doing great. One principle stands out in my view, and I promote it to everybody: fix everything two ways. This principle applies to urgent customer problems. The idea is to fix the problem immediately with whatever means you have: duct tape, a quick hack, various reboots, you name it. The second fix is the real fix. By carrying out the quick fix you buy time with the customer to do the real fix.

What I often see is that only the immediate fix is done: the customer is happy, so we can go back to our more important work, such as adding new features to our product. I can accept that because of time pressure. But we all know that this is the highway to technical debt. That’s why I like to quote Joel Spolsky’s fix twice principle whenever applicable. But today I had to question myself. Let me explain.

This morning, a customer submitted a bug report about our C code checker. Before sharing my new insight about the fix twice principle, let’s first have a look at what went wrong. Our customer ran into an “uncaught exception” while running our code checker on his C program, which is compiled with the WindRiver Diab compiler. Such an exception happens only once or twice a year for this code checker, which we run on more than 500 million lines of industrial code per day. Uncaught exceptions should never happen. The code checker might die, but then it should do so gracefully, with a clear error message telling you what went wrong. The problem turned out to have something to do with a very special inline assembly construction followed by a normal C struct. And, Murphy’s law, this “uncaught exception” didn’t make our automated tests fail. So you can imagine that this was a very curious situation.

I could have applied the fix twice principle here. First, find out what caused all this trouble in the C code checker, fix it, merge it, run all the tests and deliver it to the customer as a hot patch. And then, later on, do the real fix. But this is where the problem actually started. Let me explain. There were 3 layers of fixes to be done (in descending order of importance from a customer’s perspective):

  1. The real problem
  2. The fact that the problem caused an exception instead of a human-readable error
  3. The fact that the automated tests didn’t treat this “uncaught exception” as a failure

The only possible quick fix was solving the real problem (issue 1). But by doing this, the other two issues would have appeared to be solved as well: there would have been no “uncaught exception” anymore and no more tests that should have failed but didn’t. The only known input that reproduced issues 2 and 3 would no longer trigger them. As a consequence, solving issues 2 and 3 would hardly have been possible once issue 1 had been solved. But hey, according to the principle we should do the quick fix first. And here I deviated. What I did instead was solve the problems in reverse order of importance. First issue 3, then issue 2, and finally the real problem, issue 1. This meant that the customer had to wait longer, but it seems to me the only right way to perform a fix twice, or actually a fix three times in this case.
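Why the order matters can be illustrated with a minimal Python sketch. It is not our actual code checker or test framework: the names `parse`, `run_checker`, `run_counts_as_pass`, the exit codes and the "asm" trigger are all assumptions made up for illustration. The point is that as long as the real defect (issue 1) is still present, it serves as a reproducer to verify the fixes for issues 2 and 3.

```python
from typing import Tuple


class CheckerError(Exception):
    """Hypothetical: a failure the checker understands and can explain."""


def parse(source: str) -> str:
    # Stand-in for the real defect (issue 1): an unsupported construct
    # makes the parser raise an unexpected internal exception.
    if "asm" in source:
        raise RuntimeError("unexpected token after inline assembly")
    return "parsed"


def run_checker(source: str) -> Tuple[int, str]:
    """Issue 2: report every failure as an exit code plus a readable message."""
    try:
        parse(source)
    except CheckerError as exc:
        return 1, f"error: {exc}"
    except Exception as exc:
        # Anything unexpected still ends gracefully, with a distinct exit code.
        return 2, f"internal error: {type(exc).__name__}: {exc}"
    return 0, "ok"


def run_counts_as_pass(exit_code: int) -> bool:
    """Issue 3: a test run only passes on a clean exit or a known error."""
    return exit_code in (0, 1)


if __name__ == "__main__":
    # While issue 1 is unfixed, this input exercises both safety nets.
    code, message = run_checker("asm volatile(...); struct s { int x; };")
    print(code, message, run_counts_as_pass(code))
```

Once `parse` is repaired, the same input no longer reaches the `except Exception` branch, so neither the graceful error path nor the stricter test check can be verified with it anymore.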

So what did we learn today (Kaizen)? For sure the “fix twice” principle is very important. But keep in mind that things should be fixed in the right order to make sure that fixing twice is indeed possible. And in some cases that right order is the reverse order of importance.

Fun fact: there was actually a fourth issue in this situation:

4. Write a blog about the order of fixing things twice

And since this 4th issue was the least important, I started with that one. If I had delved into the other 3 issues first, I could have fooled myself into believing that I would write this blog sometime later. By putting this blog first, it had to be done, and with urgency, because the customer is currently waiting for the real fix.

 

Paul Jansen

Chief Executive Officer

Paul Jansen (1967) graduated from the University of Amsterdam in computing science and philosophy (both cum laude). At Philips Research he worked as a computer scientist in the field of compiler construction and domain-specific languages. After a brief stay at Atos Origin and QA Systems, he founded TIOBE Software in 2000. Paul Jansen is the driving force behind the definition of the TIOBE Quality Indicator (TQI) and the famous TIOBE index that is published every month.
