Code Review: How to Convince a Skeptic

Published on September 29, 2014, by Brian Butler

Last week's blog covered code review: how it makes your programming better, your products more stable, and your learning faster. But all of that is a little wishy-washy when facing a skeptic who is focused on numbers and wants to know why it's not full steam ahead. This Excel-sheet mentality tends to run headlong into the bigger-picture vision and steamroll right over it with its reliance on numbers, however subjective the measurement criteria might be. In the world of business, numbers triumph over logic every time. Writing 1,200 lines of code this week looks better on paper than writing 800 lines. But, done right, those 800 lines of code may be 150% better, more stable, and more secure than the 1,200 lines.

But how do you put quantifiable numbers behind your argument and build a strong case for using a code review process in your development environment?

Well, luckily, there are some good studies out there that I have been reading for the last week, and I am about to share that information with you.

Findings from A Short History of the Cost Per Defect Metric

There is some good information in this paper regarding the cost-per-defect analysis of code, and the many ways in which the figures can be juiced to make the case for or against implementing such a system. The typical cost-per-defect data varies from study to study, but circa 2012 it resembled the following pattern:

  • More than 75% of the articles and books that use cost per defect metrics do not state explicitly whether preparation and execution costs are included or excluded
  • Cost per defect penalizes quality and is always cheapest where the greatest numbers of bugs are found
  • Even if calculated correctly, cost per defect does not measure the true economic value of improved software quality
  • There can be as much as 500% difference in apparent code size based on whether counts are physical or logical lines
  • Cost per defect has blinded the software industry to the true economic value of software and led to the false assumption that “high quality is expensive.” High quality for software is not expensive; in fact, it is much cheaper and faster to develop high-quality software than buggy software.

Thoughts concerning these costs and expense structures:

  1. The earlier you find bugs, the cheaper they are to fix
  2. If finding and fixing bugs is the biggest expense, then not creating them in the first place makes for major bottom line savings
  3. Measuring the number of lines of code is subjective and not standardized

Defect Removal Efficiency

Defect Removal Efficiency (DRE) is a simple metric used to calculate the percentage of defects removed from the code during development. In simple terms it is:

DRE = (defects removed during development) ÷ (defects removed during development + defects found by customers in the first 3 months) × 100%

Using this metric it is easy to calculate the effectiveness of your testing environment and how many bugs you are really catching before something goes to a user. In addition, the study conducted investigations into different test types and the usefulness of each one. I'm not going to get into that now, as it is a huge section and I invite you to read it yourself if you need the full context, but here are some of the findings.
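As a minimal sketch of the formula above (the defect counts here are invented for illustration, not figures from the study), DRE can be computed like this:

```python
def defect_removal_efficiency(removed_in_dev: int, found_by_customers: int) -> float:
    """Return DRE as a percentage: defects caught before release
    divided by all defects known after 3 months in the field."""
    total = removed_in_dev + found_by_customers
    if total == 0:
        return 100.0  # no defects at all counts as perfect removal
    return 100.0 * removed_in_dev / total

# Example: 95 defects fixed before release, 5 reported by customers
# in the first 3 months -> DRE of 95%.
print(defect_removal_efficiency(95, 5))  # 95.0
```

A DRE of 95 would put this hypothetical project in the "usually on time, usually under budget" band discussed below.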

The following test stages were analysed:

  • Unit Test
  • Function Test
  • Regression Test
  • Performance Test
  • System Test
  • Acceptance Test

Findings for these test cases were based on using the following metrics for analysis:

  • Writing test cases takes 16.5 hours for every test stage
  • Running test cases takes 9.9 hours for every test stage
  • Defect repair takes 5.0 hours for every defect found

These figures may or may not resemble your own experience. But when using function points, instead of cost per bug, it is possible to actually study the economics of software quality in a fashion that matches standard economic assumptions, instead of distorting standard economic assumptions. This is why functional metrics are so powerful: they reveal real economic facts that are hidden by cost per defect, lines of code, and technical debt.

The findings from this investigation concluded that:

  • Defect repairs do not get more expensive over time; they become cheaper over time
  • Projects with DRE below 85% will always run late, will always be over budget, and will never have happy customers
  • Projects with DRE above 95% will usually be on time, usually be under budget and usually have happy customers

Thoughts on these stats:

  1. The main takeaway seems to be: review hard, test hard, and reap the rewards
  2. Apply well thought out logic to how you quantify what you do
  3. Skewed logic and reliance on economically unsound ideas reduces metrics to a game of politics

Findings from Best Kept Secrets of Code Review

While the previous paper made the case for code review and how to quantify it correctly, this paper focuses on how to do code review right, and which methods get the best results. The following are direct and indirect benefits of code review:

Direct benefits

  • Improved code quality
  • Fewer defects in code
  • Improved communication about code content
  • Education of junior programmers

Indirect Benefits

  • Shorter development and test cycles
  • Reduced impact on technical support
  • More customer satisfaction
  • More maintainable code

Code Review Statistics:

  • The yield of the Code Review phase is 50 to 80% better than that of the Test phase
  • 94% of all reviews had a defect rate under 20 defects per hour regardless of review size
  • In one study 25% of the time was spent reading, 75% of the time in meetings, and yet 80% of the defects were found during reading. Reviewers were 12 times more efficient at finding defects by reading than by meeting
  • Meetings contributed only 4% of the defects found in the code inspections as a whole
  • Reviewers slower than 400 lines per hour were above average in their ability to uncover defects, but at rates faster than 450 lines per hour the defect density found was below average in 87% of cases
  • Defects are found at relatively constant rates through the first 60 minutes of inspection. At that point the checklist-style review levels off sharply; the other review styles level off slightly later. In no case is a defect discovered after 90 minutes.
  • Reviews with author preparation have smaller defect densities compared to reviews without.
  • At Cisco, each customer support call cost $33, and the company wanted to reduce its 50,000 support calls per year. Code review was used both to remove defects and to improve usability. Over a few years, support calls dropped to 20,000 even with a 2-fold increase in product sales, representing $2.6 million in savings

Thoughts on these stats:

  1. The best piece of advice is to review between 100 and 300 lines of code at a time and spend 30-60 minutes to review it
  2. Code review tools that allow for reading and collaboration would greatly benefit everyone involved
  3. Here comes the self-promotion - You can achieve all these benefits using RhodeCode Enterprise
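The "100 to 300 lines in 30-60 minutes" guideline above can be turned into a quick planning sketch. The session size and review rate defaults come from the article's rules of thumb (300 lines per session, under 400 lines per hour); the function itself is purely illustrative:

```python
import math

def plan_review(total_loc: int, loc_per_session: int = 300,
                loc_per_hour: int = 400) -> tuple[int, float]:
    """Split a changeset into review sessions within the guideline,
    returning (number of sessions, minutes per session)."""
    sessions = max(1, math.ceil(total_loc / loc_per_session))
    loc_each = total_loc / sessions
    minutes = 60.0 * loc_each / loc_per_hour
    return sessions, minutes

# Example: a 1,000-line change -> 4 sessions of 250 lines,
# about 37.5 minutes each.
print(plan_review(1000))  # (4, 37.5)
```

Keeping each session inside the 60-minute window matters because, as the stats above note, defect discovery levels off sharply after the first hour of inspection.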


Those are some statistics that can back up your claims about the importance of code review, and how best to implement it within your system. Winning over skeptics is important; otherwise you will be held back by them. Luckily, every skeptic can be turned into a believer if you learn how to speak their language and present information in a way that makes sense to them. Almost everything is subjective, but some issues, such as the number of bugs, shrinking profits, and missed deadlines, can only be interpreted one way. At least now you can use figures to win over the number aficionados, so that if you need someone to value a certain method, you can bring the case to them in their language.
