var blog = new ThoughtStream(me); RSS 2.0
 Tuesday, September 09, 2008

I've talked about code coverage and software quality with a number of people on Twitter in the last month or so, including Jimmy Bogard, who has a nicely written post about Quality and code coverage.

It's no secret that I have some fairly radical opinions on code coverage and that I don't agree with Jimmy in these regards. I firmly believe that 100% code coverage, in whatever form we can get, is a reasonable goal and not past the point of diminishing returns. As such, I'd like to respond to some of what Jimmy is saying and expand on it with my own thoughts.

But first - I don't like to draw hard lines between "Unit" tests and "Integration" tests - that just blurs the problem even more because we can argue that you are covering code or not, in "unit" tests all day long, based on this blurry line. This is not the discussion I want to have, so I am lumping "Unit" and "Integration" tests into "Unit" tests as a whole.

Secondly, I'd like to say that code coverage is not always an indication of quality - in poorly written systems, it's merely a measure of code coverage. However, quality code is another subject of much debate.

On to the response!

.............................................

"The idea is that the team, all practicing TDD, should dutifully measure and add unit tests until they reach the assumed pinnacle of unit testing: 100% coverage."

There are several assumptions and statements in this that I don't believe are true.

For one, TDD and Unit Testing are not the same - no where near the same. Unit Testing lets you add tests as you see fit - after you write the code, while you write the code, whenever you want. You can even do test-first unit testing. True TDD, however, is a pure pragmatic approach to software development in that you never write code that does not have proof of need to exist - a test that says it needs to be there. 

The "pinnacle" of TDD and Unit Testing are not the same thing. Unit Testing - the act of adding unit tests when you see the need - may have a goal of 100% coverage. TDD, on the other hand, has a goal of only implementing the code that has been specified as needed. 100% coverage is the default in TDD because you never write code you don't need.

If you are going through the "measure and add unit tests" process, I would put money down to say that you are doing Unit Testing and very little TDD. The process should be something more like "measure to make sure we haven't added anything we don't need yet". 100% coverage is a side effect of TDD.

"The general motivation behind 100% coverage is that 100% coverage equals zero bugs."

Unfortunately, this is the motivation for many people who write unit tests - but it's the wrong motivation. You will never be 100% bug free just because you have 100% code coverage. There will always be some business case or external system factors that cause a bug now and then. That doesn't mean we should be ok with letting bugs into our system. On the contrary, one of our motivations (not the only one) should be to prevent as many bugs as possible from getting into the system.

"NCover is a powerful tool, but it still doesn't support all types of coverage.  Attaining 100% coverage in NCover still means there are paths that we haven't tested yet, which means there are still potential bugs in our code. "

I think the logical conclusion of this argument is that if we don't have a tool that supports true 100% coverage, then we shouldn't use that tool at all. I know I'm inserting my own conclusions into Jimmy's words - that's simply the conclusion that I came to from these statements. And this is a very bad conclusion. Do you walk around naked simply because wearing clothes would get them dirty and you'd have to wash them? Both of these are akin to the broken window syndrome. Use the tools that are available to the extent that they can operate, and find better tools when you can.

"If 100% coverage is a goal,"

100% coverage should NEVER be a goal - it's merely an indication that you are on the right path and working pragmatically.

"In recent projects where we measured coverage several months in to the project, we saw regular numbers of 90% coverage.  This was on a team doing 100% TDD." ... "So are we doing TDD wrong?"

So, are you doing TDD wrong? Yes. You are unit testing in the guise of Test-First-Development and not adhering to the spirit or pragmatism that TDD wants.

"Every test introduced covered behavior we considered interesting.  If behavior isn't interesting, we don't care about it. "

If you have code that is not interesting and you don't care about it - why is it in the system? If you don't care about the code, then it makes no difference whether or not that code works correctly. I would be appalled to hear that this is true. If it is true, then that code should not be in the system to begin with.

"If tests are a description of the behavior of the system, why fill it with all the boring, trivial parts?  The effort required to cover triviality is just too high compared to other ways we can increase value."

I think your definition of "behavior" is wrong. From wordnet.princeton.edu/perl/webwn:

Behavior: "the action or reaction of something (as a machine or substance) under specified circumstances;"

So, how is throwing an exception when a parameter is null, not behavior? If you are doing defensive programming to make sure you have all the data you need, then you are creating behavior - system behavior, not business process behavior. Behavior is still behavior, though. The only possibility of code not being behavior is a simple data point. If you have a data point in your code that is not covered by a unit test of behavior, then you don't need that data point. Yes, yes, I know - "integration", "nhibernate", "not my data source", "it has to be there for ...". If you argue that you need a property for some external portion of the system to operate correctly, then you should be proving that you need the property by covering it with a unit test that specifically says what it is for and shows how it is used. If you don't, then there's no way of knowing why the property exists and it may be deleted because it looks like nothing in the system uses it. Or, possibly worse, you end up with a lot of dead code in your system because you are afraid of deleting anything.

"Missing in the 100% coverage conversation is the effort required to get to 100%.  Attempting to get another 5% takes equal effort of the previous 10%.  The next 2% takes equal effort of the previous 15%, and so on."

I'd like to know where you got the math. If you are talking about unit testing - even test-first unit testing - i can understand the pain you're talking about, here. Why should I bother setting up another test fixture and getting all the state of my objects in place for a simple null reference check? blech - that totally sucks, is boring, is tedious, etc.

However, my typical process of achieving 100% is to only write code that is needed for the system to work, by specifying how the system will work through tests and then coding it. If I have a null reference check somewhere in my code it should be because I already have a test that specifies why it's needed.

"This is called the law of diminishing returns.  As we get closer and closer to 100%, it takes vastly more effort to get there.  At some point, you have to ask yourselves, is there value in this effort? "

Here's the real crux of the problem - you are assuming that you started with, or at some point has less than 100% coverage. It takes zero additional effort to stay at 100% coverage if you start at 100% coverage and maintain it through pragmatic TDD.

"Often, bending code to get 100% can decrease design quality, as you're now twisting the original intent solely for coverage concerns, not usability, readability or other concerns."

If you find yourself bending code to get 100% coverage, you have one of two situations (if not both): 1) poor design in your code, 2) poorly written tests. I would bet that you have both because they are a vicious circle.

"Measuring coverage is an interesting data point, as are other measures such as static analysis.  But in the end, it's only a measure, an indication. "

100% agreed!

"It's still up to the team to decide on the value of addressing missing areas,"

This implies that you don't have 100% coverage, already. In this situation, 100% agreed. I'm in this situation right now - it's hard to know what tests create the most value, at this point. However, any new code - even code changes to existing code - are done TDD style. So, while we don't yet have 100% coverage, we are increasing coverage all the time by never writing code without a test (that's part of the goal, anyway... no one is perfect, and it takes serious discipline to do this)

"with the full knowledge that they are still limited to what the tool measures."

Agreed, again. If you don't know the limitations of the tools you are using, you're in trouble.

.............................................

The boilerplate argument that I am making is that you should never write code that is not required by a customer / consumer. However, as I've Previously Talked About, there's a high likelihood that you have not identified all of your customers / consumers. Take NHibernate, for example. I have a project that needs about 30 different properties for a specific NHibernate query to be executed. The NHibernate query is the only place they are ever used - no code reads them, only a search screen writes to them, then the NHibernate query loads data from the table that they are mapped to. When I first wrote this code, my coverage was at around 30%. This situation is one of many that I've been in recently, that has lead me to believe that other code must be considered a customer / consumer of your code. If I deleted any of these properties, then the NHibernate query would fail. Therefore, the properties are valuable to NHiberate. Since these properties are valuable to at least one of the consumers of my code, I should prove that they need to exist by writing a test that proves they are needed and why.

Tuesday, September 09, 2008 12:39:11 PM (Central Standard Time, UTC-06:00)  #    Comments [0]. Trackback 
Tags: Code Coverage | General | Lean Systems | Management | Philosophy of Software | Principles and Patterns | Refactoring | Test Driven Development | Unit Testing
Tracked by:
"Follow-up to my Re: Quality and code coverage" (ThoughtStream.Create(me);) [Trackback]

Navigation
About Me
View Derick Bailey's profile on LinkedIn

Send mail to the author(s) Contact Me
Archive
<September 2008>
SunMonTueWedThuFriSat
31123456
78910111213
14151617181920
21222324252627
2829301234
567891011
About the author/Disclaimer

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2010
Derick Bailey
Sign In
Statistics
Total Posts: 115
This Year: 0
This Month: 0
This Week: 0
Comments: 44
Themes
Pick a theme:
All Content © 2010, Derick Bailey
DasBlog theme 'Business' created by Christoph De Baene (delarou)