TIOBE's definitive guide to quality gating software code
Now that most organizations in the software industry have embraced continuous integration, we are ready for the next challenge: quality gates. Continuous integration has brought us highly automated software deliveries. That is a great thing, unless the quality of these deliveries is insufficient. The most logical next step is to put a gate or a set of gates before these deliveries, which will prevent bad software from being released. In this guide, we will explain how to set up software quality gates in the most efficient way, based on many years of experience with this topic. Bad quality gates can cause a lot of frustration in an organization, so it is advisable to get it right from the start.
The first location, in the development process during a pull request, is the most popular and well-known one. It is very effective because it is early in the development process, but it also has some important caveats. Quality gates as part of a pull request should be fast, robust and easy to fix. Let’s have a look at these 3 constraints in more detail:
The second location is a bit later in the software development process: after delivery, during the nightly build. At this location, there is less time pressure, which means we can focus on the slower metrics. The most important constraint here is that the extra check should have a lot of added value, because if this gate fails, the finding goes back to the engineer one or more days after their delivery. There must be a very good reason to demand repair that late in the game. An example could be a deep flow analysis that identifies a memory leak or a null pointer exception, i.e. fatal errors that are hard to detect with fast tools.
In summary, fast, robust and easy-to-fix metrics should be part of the pull request quality gates and slow but extremely powerful metrics should be part of the nightly build gate.
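The placement rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Check` record, its fields and the 300-second cut-off are hypothetical and are not prescribed by this guide.

```python
from dataclasses import dataclass

# Hypothetical cut-off: anything slower is considered too slow for a pull request.
MAX_PULL_REQUEST_SECONDS = 300


@dataclass(frozen=True)
class Check:
    name: str
    typical_seconds: int  # how long the check usually takes to run
    robust: bool          # gives the same verdict on every run
    easy_to_fix: bool     # findings can be repaired quickly by the author
    high_value: bool      # finds severe defects, e.g. memory leaks


def gate_location(check: Check) -> str:
    """Return 'pull request', 'nightly build' or 'no gate' for a check."""
    fast = check.typical_seconds <= MAX_PULL_REQUEST_SECONDS
    if fast and check.robust and check.easy_to_fix:
        return "pull request"
    if check.high_value:
        # Slow checks only earn a gate if a late failure is worth the disruption.
        return "nightly build"
    return "no gate"


compiler_warnings = Check("compiler warnings", 60, True, True, False)
deep_flow_analysis = Check("deep flow analysis", 7200, True, False, True)

for check in (compiler_warnings, deep_flow_analysis):
    print(f"{check.name} -> {gate_location(check)}")
```

The third outcome, "no gate", covers checks that are slow and add little value. Following the reasoning above, they do not justify sending work back to an engineer days after delivery.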
Quality gating can be done with absolute or relative targets. An absolute target is a fixed threshold that must always be met, e.g. no compiler warnings allowed at all, or code coverage of at least 60%. Absolute targets are tempting to use because they are usually clear and ambitious. However, in most cases they won’t work very well in practice. Suppose you have a lot of legacy code with lots of existing compiler warnings. You would first need to get rid of existing issues that you did not cause before you can deliver. Fixing issues you did not introduce is a risk because you might not know why they are in the code. Another disadvantage is that absolute targets do not necessarily lead to improvement. If your code coverage is 65% and the absolute target is 60%, you can deliver code without decent unit tests and still pass, as long as coverage stays above 60%.
Relative targets, on the other hand, are a blessing for everybody. A relative target only requires that a change does not make a metric worse than it was before the change. Engineers only need to fix issues they introduced themselves, which approximates the famous Boy Scout rule that you should leave the code better than you found it. Management also embraces this approach because it minimizes effort and risk: no unnecessary gold-plating, only fix what has been broken during a change.
In summary, relative targets are the way to go.
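The difference between the two kinds of target can be made concrete with a small Python sketch. The function names are hypothetical, and the figures of 200 legacy warnings and a drop to 62% coverage are made-up illustrations. The 65% and 60% values come from the example above.

```python
def meets_absolute_target(after: float, target: float, higher_is_better: bool) -> bool:
    """Absolute target: the metric must satisfy a fixed threshold."""
    return after >= target if higher_is_better else after <= target


def meets_relative_target(before: float, after: float, higher_is_better: bool) -> bool:
    """Relative target: the change must not make the metric worse than it was."""
    return after >= before if higher_is_better else after <= before


# Legacy code with 200 existing compiler warnings (assumed number).
# The change adds no warnings, yet a zero-warnings absolute target blocks it.
warnings_absolute = meets_absolute_target(after=200, target=0, higher_is_better=False)
warnings_relative = meets_relative_target(before=200, after=200, higher_is_better=False)

# Coverage was 65%; the change drops it to 62% (assumed number).
# The 60% absolute target lets the decline pass, the relative target does not.
coverage_absolute = meets_absolute_target(after=62.0, target=60.0, higher_is_better=True)
coverage_relative = meets_relative_target(before=65.0, after=62.0, higher_is_better=True)

print("warnings: absolute", warnings_absolute, "relative", warnings_relative)
print("coverage: absolute", coverage_absolute, "relative", coverage_relative)
```

In the first case the absolute target blocks a harmless change. In the second it waves a decline through. The relative target gives the intended verdict in both cases.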
Now that we know where and how to quality gate, the remaining question is: what metrics are we going to quality gate? This appears to be quite intricate. Suppose you have complex code and have the brilliant idea to set a relative blocking quality gate on cyclomatic complexity as a nice way to make the code simpler over time. Your idea will probably not turn out as you expected: one day a bug will be detected in the software and the fix is to add an extra “if” statement. Now you are in trouble, because you are not allowed to deliver this change, because it would increase complexity. In other words: you have fixed a bug and the quality gate fails. That is not why we introduced quality gates.
But it becomes even more intricate. Suppose you decide to quality gate code coverage. Every time you deliver changed code, your unit tests must become better. One day you are asked to remove an old unused feature from the code. The result is that you remove a lot of code. If that old code has high code coverage, you are not going to pass the gate because the average dropped from 75% to 74% due to your change, even though you improved the code by removing the old stuff.
Experience has shown that the kind of metrics that really fits the bill are violation-based metrics. Examples of violation-based metrics are coding standard violations, compiler warnings and security issues. The advantage is that if you remove some code, it will not increase the number of violations. If you change some code, you are the only one introducing new ones, so you are fully in control and fully responsible.
In summary, choose violation-based metrics to quality gate.
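The following Python sketch judges one and the same change, the removal of a well-tested unused feature, by both kinds of metric. All numbers, file names and rule names are hypothetical, as is the choice to identify a violation by file, rule and offending code rather than by line number.

```python
from typing import NamedTuple, Set


class Violation(NamedTuple):
    file: str
    rule: str
    code: str  # offending code text, so identity does not depend on line numbers


def coverage_percent(covered_lines: int, total_lines: int) -> float:
    return 100.0 * covered_lines / total_lines


def new_violations(before: Set[Violation], after: Set[Violation]) -> Set[Violation]:
    """Violations present after the change that were not there before it."""
    return after - before


# Assumed code base: 10,000 lines, 7,500 of them covered by unit tests.
coverage_before = coverage_percent(7_500, 10_000)
# The change deletes an unused feature: 400 lines, all of them covered.
coverage_after = coverage_percent(7_500 - 400, 10_000 - 400)

# Assumed findings before the change; deleting the feature removes one of them.
violations_before = {
    Violation("old_feature.c", "unused-variable", "int tmp;"),
    Violation("core.c", "implicit-conversion", "x = y;"),
}
violations_after = {
    Violation("core.c", "implicit-conversion", "x = y;"),
}

introduced = new_violations(violations_before, violations_after)

print(f"coverage: {coverage_before:.1f}% -> {coverage_after:.1f}%")
print(f"new violations: {len(introduced)}")
```

With these numbers, coverage falls from 75.0% to about 74.0% (7,100 of 9,600 lines), so a relative coverage gate would block the clean-up. The set of new violations is empty, so a violation-based gate lets it through.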
Apart from the trade-offs discussed above, there is still one decision to take. Will the quality gates be blocking or only informational? Both kinds of gating have their advantages and disadvantages. Informational or soft quality gates let engineers always pass. Nobody is annoyed or held up. The downside is that decreases in quality will slip through unnoticed. Hard or blocking gates are merciless but they ensure improvement. We see that one should always start with soft gates to get familiar with the concept, but sooner or later the gate should become a hard gate. Otherwise, you are not taking yourself seriously.
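In a build pipeline, the difference between a soft and a hard gate often comes down to whether the gate step fails the job. A minimal Python sketch, assuming the pipeline treats a non-zero exit code as a failure, is shown below. `GateMode` and `gate_exit_code` are hypothetical names.

```python
from enum import Enum


class GateMode(Enum):
    SOFT = "informational"
    HARD = "blocking"


def gate_exit_code(new_violation_count: int, mode: GateMode) -> int:
    """Return the exit code for the gate step: 0 lets the delivery pass."""
    if new_violation_count == 0:
        return 0
    # Both modes report the findings; only the hard gate stops the delivery.
    print(f"quality gate: {new_violation_count} new violation(s) [{mode.value}]")
    return 1 if mode is GateMode.HARD else 0


soft_result = gate_exit_code(3, GateMode.SOFT)
hard_result = gate_exit_code(3, GateMode.HARD)
```

Because the mode is a single setting, a team can start with the soft gate and switch to the hard gate later without changing the check itself.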
Quality gates are a great way of improving software quality. In this guide, we discussed ways to introduce such gates together with some do’s and don’ts. We highly recommend blocking (hard) quality gates for violation-based metrics. If the checks are fast and easy to fix, make them part of your pull request. If they take more time, make them part of the nightly build. Even this proven quality gate strategy has its flaws sometimes. Suppose you encounter a false positive. The blocking gate will remain unforgivingly blocking in such a situation. Luckily, there is a band-aid for that situation as well, a so-called ‘waiver process’, but that is something for another time.