You can't add quality after the fact
Code reviews, testing later, and refactoring later are among the biggest disasters of the software industry
The problems
As Harold F. Dodge, one of the pioneers of statistical quality control, once said:
You cannot inspect quality into a product.
It would be a slight exaggeration to say that this applies to software products word for word, but we are not far from the truth. Many software teams don't realize the real cost of not building quality in from the start. Skipping quality at the beginning always costs much more in the end. Yet teams keep trying to add quality after the fact. They try to inspect quality into their product, with little to no success. We can't start out with quickly produced junk and then fix it later; that only leads to poor practices that duct-tape the code instead of adding real quality. There are many so-called “best practices” in this space that are widely accepted and have become the de facto way to improve software quality in the 21st century. These practices are deeply rooted in our processes. A sad reality. But there is hope.
Pull request (PR) reviews are superficial
PR reviews are one of these duct-taping practices. I'm not saying code reviews are bad per se. They work well when authors who don't know or trust each other collaborate while working in isolation (e.g., in open-source software development). They also work well when team members live in different time zones, though it is fair to ask why those teams were assembled that way in the first place. But when the authors know each other and form a team in a shared virtual or physical environment, PR code reviews are a recipe for disaster in terms of code quality and lead time. Beyond the productivity-killing context switches and inefficient communication patterns they introduce, the main problem with PR reviews is that they happen when it is already too late.
Most of the time, reviewers don't have enough knowledge to do the job properly. Our domain knowledge of a piece of code is freshest while we are writing it; by the time of a PR review, the domain context is neither fresh nor complete. To do a proper code review, the reviewer would need to understand the whole context behind the code: study the requirements in detail, take part in the design sessions, and absorb the domain knowledge and the conclusions gained during customer collaboration. Let's be honest: a reviewer never has all of this knowledge while doing a review. The less we know, the less effective code reviews are, and the higher the risk of mistakes. In modern web development, one critical mistake can shut a company's doors. Since reviewers don't fulfill these “review requirements” (doing so would take them significant time), PR code reviews end up being superficial. No inspection can find all the flaws. And as the number of lines to be reviewed increases, the quality of the review decreases.
The code reviews out there usually discuss code styling, a few renamings here and there, and smaller refactorings. It is rare to see a review where the overall code structure is deeply analyzed. Reviews also tend to skip discussion of corner cases, edge cases, and missing test cases.
By the way, if you use mutation testing to find gaping holes in test suites during a code review, the beer is on me.
Not to mention the security concerns of the given application. If we don't know the context deeply, we can't come up with meaningful improvements to the solution. If we don't know the domain details, we don't know how to abstract them. As always, the domain drives the design. Without it, one can only offer improvements that merely scratch the surface of the application.
Testing later is not sufficient
You cannot test your way to quality. Testing later tells you nothing about how simple your code is. It won't tell you whether your code is readable or easy to change, and tests will not show whether your code hides unnecessary complexity. Testing later is not only inappropriate but also inefficient, and it won't guarantee that your code is testable at all. Furthermore, it is impossible to trust code whose tests were written afterward, because we never saw those tests failing. We never saw them red. If we want to trust our code, we need to be able to trust our tests; if we want to trust our tests, we need to be able to see them fail. A test we have never seen fail tells us nothing about whether it is testing the right thing. Seeing a test failing is as important as seeing it passing. A test failure validates that the test is meaningful and unique.
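To make the red step concrete, here is a minimal sketch in Python with pytest; the apply_discount function and its pricing rule are invented for illustration, not taken from any real codebase.

```python
import pytest

# Step 1 (red): write this test before apply_discount exists.
# Running pytest at this point fails with a NameError, which proves the
# test is capable of detecting a missing or broken implementation.
def test_apply_discount_takes_the_percentage_off():
    assert apply_discount(200.0, 10) == pytest.approx(180.0)

# Step 2 (green): write just enough production code to make the test pass.
def apply_discount(price: float, percent: float) -> float:
    return price * (1 - percent / 100)
```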
Now imagine writing tests after the fact (you have probably already done this). How can you be sure that your tests are actually testing the right thing? How can you be sure that they don't produce false negatives? A false negative is a test that should fail in the presence of a bug but doesn't: an always-passing, invalid test that provides no value, for example a test with no assertions or with incorrect ones. When we don't start with a failing test, we need to take additional steps to be sure that our tests are testing the right thing, which leads to cumbersome workflows.
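Here is a hedged sketch of such a false negative, again in Python with pytest; total_with_tax and the 20% rate are made-up examples.

```python
def total_with_tax(amount: float) -> float:
    return amount * 0.2  # BUG: returns only the tax instead of amount + tax

# False negative: this test passes no matter how broken the code is,
# because it never asserts anything.
def test_total_with_tax():
    total_with_tax(100.0)

# A test we had first seen failing would expose the bug immediately:
def test_total_with_tax_includes_the_base_amount():
    assert total_with_tax(100.0) == 120.0  # fails against the buggy code
```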
Furthermore, how can you be sure that you cover all your business logic with tests? If the answer is code coverage, then I have bad news for you. Code coverage is not a quality metric. You can have 100% coverage without a single meaningful test. Imagine a codebase with 100% code coverage. Cool. Now imagine deleting every assertion from its tests and running the coverage tool again. The result? Still 100%. Do you still trust your tests? It is a hell of a misleading metric. Code coverage can help detect untested areas of the code, but it tells you nothing about the tested areas. One potential remedy is mutation testing, which can help you find the missing tests in your suite. But mutation testing is far from perfect: it is demotivating to use and requires additional effort. Again, reaching for such a tool feels like duct-taping the code instead of building real quality into the product.
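A minimal sketch of how hollow the metric can be (Python; the divide function is invented, while coverage.py and mutmut are two well-known tools mentioned only as examples):

```python
# calculator.py
def divide(a: float, b: float) -> float:
    return a * b  # BUG: multiplies instead of dividing

# test_calculator.py
def test_divide():
    divide(10, 2)  # executes every line, verifies nothing

# Running `coverage run -m pytest` followed by `coverage report` now shows
# 100% line coverage for calculator.py, while the bug sails through
# undetected. A mutation testing tool such as mutmut would flag it:
# mutating `a * b` into `a / b` or `a - b` leaves every test green,
# revealing that the coverage number here is meaningless.
```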
Refactoring later never happens
“We will refactor it later.” “We will clean it up later.” These are the biggest lies in our industry. Refactoring later never happens. Yet teams claim they can add quality after the fact through refactoring. Even when it does happen, it carries a lot of risk and is very expensive. Worse, the domain knowledge will have degraded badly by then, resulting in weak refactoring. We will never refactor the code as effectively as we could have while writing it. How are we going to abstract some logic into a dedicated module half a year after the code was written? By then we will have forgotten the domain knowledge, making proper abstraction nearly impossible.
"Refactoring" is a word that should not be in any plan for making software. It is not something that you plan to do, but instead, it is something that you should always do. It's like washing your hands in the bathroom, you always do it when you need to, not only when it is planned. Refactoring should happen continuously, all the time, without asking any permission for it.
The solution
All of these practices (PR reviews, refactoring later, testing later) just build a false sense of security. They are flawed, inefficient, and superficial, and they lack a shared understanding of domain knowledge. On top of that come all the frustration, waste, and demotivation they bring to the team. We should not rely on them to improve quality. They only duct-tape our software.
The process is deeply rotten, and a bad process needs to be improved. So when should we add quality? From day one. Quality has to be built in, not tacked on afterward. Teams should start out with a high-quality system and then incrementally grow it with quality practices, because once garbage is in the system, it is very difficult to get rid of. The only way to ensure quality is to build it in during the development process, and the only way to do that is to keep refactoring, testing, and reviewing the software along the way. We need to do these every step of the way.
If we want to produce quality software, it has to be created with quality practices, in real time, by quality-minded people. If you want to build quality in as you go, use Test-Driven Development, Trunk-Based Development, and other Extreme Programming practices such as pair and mob programming, with continuous collaboration with your end users. By doing TDD you ensure that you keep testing and refactoring your code all the time, with all the necessary domain knowledge at hand and trustworthy test suites you can rely on. Pair and mob programming ensure that your code is continuously reviewed, in real time, by people with the right domain knowledge to reason about it. These techniques make PR reviews redundant, turn refactoring into a daily habit, and make tests a core part of the design flow.
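For the unfamiliar, a single TDD cycle might look like this minimal sketch (Python with pytest; the leap-year rule is the classic kata, not an example from this article):

```python
# RED: each of these tests was written first and seen failing before the
# corresponding branch of the implementation existed.
def test_every_fourth_year_is_a_leap_year():
    assert is_leap_year(2024) is True

def test_centuries_are_not_leap_years():
    assert is_leap_year(1900) is False

def test_every_fourth_century_is_a_leap_year():
    assert is_leap_year(2000) is True

# GREEN, then REFACTOR: the simplest implementation that keeps all tests
# passing, tidied up immediately while the rules are still fresh.
def is_leap_year(year: int) -> bool:
    if year % 400 == 0:
        return True
    if year % 100 == 0:
        return False
    return year % 4 == 0
```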