After some additional Twitter discussion with Jimmy last night, I realized that my previous response had neglected to distinguish between two very important contexts - new code and legacy code. Considering a legacy system (as defined by Michael Feathers), I'm 100% on board with everything Jimmy is talking about and all of the techniques, troubles, and diminishing returns that he identifies. I still maintain that 100% coverage should be the default result of true TDD when writing new code - and I apply this philosophy when writing new code inside of a legacy system. Write a test to prove you need the new code that you are writing... unfortunately, the distinction between new and legacy code does get blurry and can be difficult to navigate when working in a legacy system. For more info on legacy code, in this context, and how to deal with it:
I've talked about code coverage and software quality with a number of people on Twitter in the last month or so, including Jimmy Bogard, who has a nicely written post about Quality and code coverage. It's no secret that I have some fairly radical opinions on code coverage and that I don't agree with Jimmy in these regards. I firmly believe that 100% code coverage, in whatever form we can get, is a reasonable goal and not past the point of diminishing returns. As such, I'd like to respond to some of what Jimmy is saying and expand on it with my own thoughts. But first - I don't like to draw hard lines between "Unit" tests and "Integration" tests - that just blurs the problem even more, because we can argue all day long about whether or not you are covering code in "unit" tests, based on this blurry line. This is not the discussion I want to have, so I am lumping "Unit" and "Integration" tests into "Unit" tests as a whole. Secondly, I'd like to say that code coverage is not always an indication of quality - in poorly written systems, it's merely a measure of code coverage. However, quality code is another subject of much debate. On to the response! ............................................. "The idea is that the team, all practicing TDD, should dutifully measure and add unit tests until they reach the assumed pinnacle of unit testing: 100% coverage." There are several assumptions and statements in this that I don't believe are true. For one, TDD and Unit Testing are not the same - nowhere near the same. Unit Testing lets you add tests as you see fit - after you write the code, while you write the code, whenever you want. You can even do test-first unit testing. True TDD, however, is a purely pragmatic approach to software development in that you never write code that does not have proof of its need to exist - a test that says it needs to be there. The "pinnacle" of TDD and the "pinnacle" of Unit Testing are not the same thing. 
Unit Testing - the act of adding unit tests when you see the need - may have a goal of 100% coverage. TDD, on the other hand, has a goal of only implementing the code that has been specified as needed. 100% coverage is the default in TDD because you never write code you don't need. If you are going through the "measure and add unit tests" process, I would put money down to say that you are doing Unit Testing and very little TDD. The process should be something more like "measure to make sure we haven't added anything we don't need yet". 100% coverage is a side effect of TDD. "The general motivation behind 100% coverage is that 100% coverage equals zero bugs." Unfortunately, this is the motivation for many people who write unit tests - but it's the wrong motivation. You will never be 100% bug free just because you have 100% code coverage. There will always be some business case or external system factors that cause a bug now and then. That doesn't mean we should be ok with letting bugs into our system. On the contrary, one of our motivations (not the only one) should be to prevent as many bugs as possible from getting into the system. "NCover is a powerful tool, but it still doesn't support all types of coverage. Attaining 100% coverage in NCover still means there are paths that we haven't tested yet, which means there are still potential bugs in our code. " I think the logical conclusion of this argument is that if we don't have a tool that supports true 100% coverage, then we shouldn't use that tool at all. I know I'm inserting my own conclusions into Jimmy's words - that's simply the conclusion that I came to from these statements. And this is a very bad conclusion. Do you walk around naked simply because wearing clothes would get them dirty and you'd have to wash them? Both of these are akin to the broken window syndrome. Use the tools that are available to the extent that they can operate, and find better tools when you can. 
"If 100% coverage is a goal," 100% coverage should NEVER be a goal - it's merely an indication that you are on the right path and working pragmatically. "In recent projects where we measured coverage several months in to the project, we saw regular numbers of 90% coverage. This was on a team doing 100% TDD." ... "So are we doing TDD wrong?" So, are you doing TDD wrong? Yes. You are unit testing in the guise of Test-First-Development and not adhering to the spirit or pragmatism that TDD wants. "Every test introduced covered behavior we considered interesting. If behavior isn't interesting, we don't care about it. " If you have code that is not interesting and you don't care about it - why is it in the system? If you don't care about the code, then it makes no difference whether or not that code works correctly. I would be appalled to hear that this is true. If it is true, then that code should not be in the system to begin with. "If tests are a description of the behavior of the system, why fill it with all the boring, trivial parts? The effort required to cover triviality is just too high compared to other ways we can increase value." I think your definition of "behavior" is wrong. From wordnet.princeton.edu/perl/webwn: Behavior: "the action or reaction of something (as a machine or substance) under specified circumstances;" So, how is throwing an exception when a parameter is null, not behavior? If you are doing defensive programming to make sure you have all the data you need, then you are creating behavior - system behavior, not business process behavior. Behavior is still behavior, though. The only possibility of code not being behavior is a simple data point. If you have a data point in your code that is not covered by a unit test of behavior, then you don't need that data point. Yes, yes, I know - "integration", "nhibernate", "not my data source", "it has to be there for ...". 
If you argue that you need a property for some external portion of the system to operate correctly, then you should be proving that you need the property by covering it with a unit test that specifically says what it is for and shows how it is used. If you don't, then there's no way of knowing why the property exists and it may be deleted because it looks like nothing in the system uses it. Or, possibly worse, you end up with a lot of dead code in your system because you are afraid of deleting anything. "Missing in the 100% coverage conversation is the effort required to get to 100%. Attempting to get another 5% takes equal effort of the previous 10%. The next 2% takes equal effort of the previous 15%, and so on." I'd like to know where you got the math. If you are talking about unit testing - even test-first unit testing - I can understand the pain you're talking about, here. Why should I bother setting up another test fixture and getting all the state of my objects in place for a simple null reference check? Blech - that totally sucks, is boring, is tedious, etc. However, my typical process of achieving 100% is to only write code that is needed for the system to work, by specifying how the system will work through tests and then coding it. If I have a null reference check somewhere in my code, it should be because I already have a test that specifies why it's needed. "This is called the law of diminishing returns. As we get closer and closer to 100%, it takes vastly more effort to get there. At some point, you have to ask yourselves, is there value in this effort? " Here's the real crux of the problem - you are assuming that you started with, or at some point had, less than 100% coverage. It takes zero additional effort to stay at 100% coverage if you start at 100% coverage and maintain it through pragmatic TDD. 
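To make the null reference check point concrete, here is a minimal Python sketch. Everything in it is hypothetical and only for illustration: the `Mailer` class, its `send` method, and the test names are my assumptions, not code from any project discussed here. The idea is simply that the test is written first and states why the guard exists, and the guard is the only code added to satisfy it.

```python
import unittest


class Mailer:
    """Hypothetical class, used only to illustrate the idea."""

    def send(self, recipient):
        # This guard exists because a test below specifies it,
        # not because it "seemed like a good idea" at the time.
        if recipient is None:
            raise ValueError("recipient is required")
        return f"sent to {recipient}"


class WhenSendingWithoutARecipient(unittest.TestCase):
    def test_it_refuses_to_send(self):
        # The specification for the null check: system behavior,
        # not business process behavior, but behavior all the same.
        with self.assertRaises(ValueError):
            Mailer().send(None)


class WhenSendingToARecipient(unittest.TestCase):
    def test_it_reports_the_delivery(self):
        self.assertEqual(Mailer().send("jimmy"), "sent to jimmy")
```

Saved in a file, the tests can be run with `python -m unittest`. If the guard were written without the first test, a coverage tool would flag it as uncovered, which is the "measure to make sure we haven't added anything we don't need yet" use of coverage.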
"Often, bending code to get 100% can decrease design quality, as you're now twisting the original intent solely for coverage concerns, not usability, readability or other concerns." If you find yourself bending code to get 100% coverage, you have one of two situations (if not both): 1) poor design in your code, 2) poorly written tests. I would bet that you have both because they are a vicious circle. "Measuring coverage is an interesting data point, as are other measures such as static analysis. But in the end, it's only a measure, an indication. " 100% agreed! "It's still up to the team to decide on the value of addressing missing areas," This implies that you don't have 100% coverage, already. In this situation, 100% agreed. I'm in this situation right now - it's hard to know what tests create the most value, at this point. However, any new code - even code changes to existing code - are done TDD style. So, while we don't yet have 100% coverage, we are increasing coverage all the time by never writing code without a test (that's part of the goal, anyway... no one is perfect, and it takes serious discipline to do this) "with the full knowledge that they are still limited to what the tool measures." Agreed, again. If you don't know the limitations of the tools you are using, you're in trouble. ............................................. The boilerplate argument that I am making is that you should never write code that is not required by a customer / consumer. However, as I've Previously Talked About, there's a high likelihood that you have not identified all of your customers / consumers. Take NHibernate, for example. I have a project that needs about 30 different properties for a specific NHibernate query to be executed. The NHibernate query is the only place they are ever used - no code reads them, only a search screen writes to them, then the NHibernate query loads data from the table that they are mapped to. 
When I first wrote this code, my coverage was at around 30%. This situation is one of many that I've been in recently that have led me to believe that other code must be considered a customer / consumer of your code. If I deleted any of these properties, then the NHibernate query would fail. Therefore, the properties are valuable to NHibernate. Since these properties are valuable to at least one of the consumers of my code, I should prove that they need to exist by writing a test that proves they are needed and why.
I gave a presentation to my team on S.O.L.I.D. software development principles, talking about how these five principles enable us to achieve High Cohesion, Low Coupling, and proper Encapsulation in our software projects. If you're unfamiliar with the SOLID principles, they are:
- SRP: Single Responsibility Principle - One reason to exist, one reason to change
- OCP: Open Closed Principle - Open for extension, closed for modification
- LSP: Liskov Substitution Principle - An object should be semantically replaceable for its base class/interface
- ISP: Interface Segregation Principle - Don't force a client to depend on an interface it doesn't need to know about
- DIP: Dependency Inversion Principle - Depend on abstractions, not concrete detail or implementations
You can find a lot of good info on SOLID via Robert Martin (the man who coined the acronym) and the Los Techies crew. The final slide in my presentation shows the before and after of migrating a small app that reads a file and sends an email, from being 100% coded in the WinForms code, to a SOLID code structure. During the summary / review, I related the five SOLID principles to High Cohesion, Low Coupling, and Encapsulation through these descriptions: Low Coupling: By abstracting many of our implementation needs into various interfaces and introducing the concepts of OCP and DIP, we’ve created a system that has very low coupling. Many of these individual pieces can be taken out of this system with little to no spaghetti mess trailing after them. Separating the various concerns into the various object implementations has also helped us ensure that we can change the system’s behavior as needed, with very little modification to the overall system – just update the one piece that contains the behavior in question. High Cohesion: This really is a direct result of low coupling and SRP – we have a lot of small pieces that can be stacked together like building blocks to create something larger and more complex. Any of these individual pieces may not represent much functionality or behavior, but then, an individual piece isn’t much fun to use without a bunch of other pieces. DIP has also allowed us to tie the various blocks together by depending on an abstraction and allowing that abstraction to be fulfilled with different implementations, creating a system that is much greater than the mere sum of its parts. Encapsulation: True encapsulation is not just making fields private and hiding data from external objects – but hiding implementation details from other objects, depending only on the abstractions and expected behaviors of those abstractions. LSP, DIP, and SRP all work hand in hand to create true encapsulation in the new project structure. 
We’ve encapsulated our behavioral implementations in many individual objects, preventing them from leaking into each other – and we’ve ensured that the dependency on those behaviors is encapsulated behind a known interface. We’ve hidden the implementation details and allowed for any implementation to be put in place for that interface definition through DIP. At the same time, we’ve done the necessary due diligence to ensure that we are not violating any of the individual abstraction’s semantics or purpose (LSP), ensuring that we can properly replace the implementation as needed. The end result of our effort is a system that appears to be complex on the surface – after all, there are now 13 blocks in our diagram compared to the original 2. However, the apparent complexity of this system is diminished by the simplicity of each individual interface and object. Many small, independent, simple pieces have been wired together to create a larger overall system behavior that can be described as complex. Yet any of these implementations can be changed and/or replaced – very easily. At some point, I'm hoping to post the entire slide deck and presentation / code, but for now I thought the summary that I sent to the team was worth sharing with the rest of the world.
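Until the real slide deck and code are posted, here is a rough Python sketch of the shape described above for the "reads a file and sends an email" app. It is not the presentation's code: the names `FileReader`, `MessageSender`, `FileMailer`, `InMemoryReader`, and `RecordingSender` are all hypothetical, chosen only to show one class depending on two small abstractions (DIP) that can each be replaced without touching it (LSP, OCP).

```python
from typing import Dict, List, Protocol, Tuple


class FileReader(Protocol):
    """Abstraction: something that can produce a file's contents."""

    def read(self, path: str) -> str: ...


class MessageSender(Protocol):
    """Abstraction: something that can deliver a message."""

    def send(self, to: str, body: str) -> None: ...


class FileMailer:
    """One responsibility: coordinate reading a file and mailing it.

    It knows nothing about disks, SMTP, or the UI; it depends only on
    the two abstractions handed to its constructor.
    """

    def __init__(self, reader: FileReader, sender: MessageSender) -> None:
        self._reader = reader
        self._sender = sender

    def mail_file(self, path: str, to: str) -> None:
        self._sender.send(to, self._reader.read(path))


class InMemoryReader:
    """Stand-in implementation; a disk-backed reader could replace it."""

    def __init__(self, files: Dict[str, str]) -> None:
        self._files = files

    def read(self, path: str) -> str:
        return self._files[path]


class RecordingSender:
    """Stand-in implementation; an SMTP-backed sender could replace it."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


sender = RecordingSender()
mailer = FileMailer(InMemoryReader({"report.txt": "all good"}), sender)
mailer.mail_file("report.txt", "team@example.com")
```

Swapping `InMemoryReader` for a reader that opens real files, or `RecordingSender` for one that talks to a mail server, changes nothing in `FileMailer`. That is the "update the one piece that contains the behavior in question" property in miniature.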
Dan North (father of BDD) responded to some dialog over on the Google BDD group with a very insightful look at what the "Context" of a specification really is. For me, this was an eye-opening post and I'm already seeing ways to improve my specification tests. His post is worth quoting in its entirety:
Let me describe where the idea of contexts ("givens") came from. We started out by writing each story on the front of an index card, and on the back we'd draw a line down the middle to create two columns. Then we would label the columns: "I do this" and "This happens". (I think it was Ivan Moore who first showed me this.) It's incredibly simple and it worked well for describing the acceptance tests. I do Y, and Z should happen. If it doesn't I'm not done yet. Once it does I can go to the pub!
Then I showed that to my business analyst friend Chris Matts, and he said: that doesn't make sense. I do Y and *anything* could happen! I request cash from an ATM and it could give me cash. Or it could refuse because I'm overdrawn. Or it could retain the card and call the police! You're missing a context. So we evolved it into *Given X*, When Y, Then Z.
So now "the context" is simply another way of saying "which scenario is this?". It's the scenario where my account is in credit, or the one where I'm overdrawn, or the one where the card was reported as stolen. Like on Friends where each episode is called "The one where...". This means the scenario titles are typically just a description of the context, and the givens set up that context.
In other words, you discover the contexts as you describe which scenarios you are interested in (and more importantly which ones you aren't).
Cheers, Dan
Solid gold, Dan! And thanks for the insight!
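Here is how I might sketch Dan's ATM example in Python. This is my own illustration, not anything from Dan's post: the `Account` class, the `request_cash` function, and the outcome strings are hypothetical. The point is that the "When" is identical in all three scenarios, and only the "Given" (the context) decides what "Then" should be.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical account; just enough state to express a context."""

    balance: int
    card_reported_stolen: bool = False


def request_cash(account: Account, amount: int) -> str:
    """The 'When': the same action is performed in every scenario."""
    if account.card_reported_stolen:
        return "retain card"
    if amount > account.balance:
        return "refuse"
    account.balance -= amount
    return "dispense cash"


def the_one_where_the_account_is_in_credit() -> str:
    account = Account(balance=100)              # Given
    return request_cash(account, 20)            # When


def the_one_where_the_account_is_overdrawn() -> str:
    account = Account(balance=-50)              # Given
    return request_cash(account, 20)            # When


def the_one_where_the_card_was_reported_stolen() -> str:
    account = Account(balance=100, card_reported_stolen=True)  # Given
    return request_cash(account, 20)            # When


# Then: each context leads to a different expected outcome.
assert the_one_where_the_account_is_in_credit() == "dispense cash"
assert the_one_where_the_account_is_overdrawn() == "refuse"
assert the_one_where_the_card_was_reported_stolen() == "retain card"
```

Notice that the function names read like the Friends episode titles: the scenario title is just a description of the context, and the first line of each function sets that context up.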
Earlier today, I had a conversation with a coworker concerning dependency containers, dependency injection frameworks, and the underlying dependency inversion principle. My advice in the end was to completely avoid the use of DI tools until the team as a whole understands the costs, benefits, and potential pains of manual dependency injection (pain being relative, and usually a sign of a learning opportunity). Part of the conversation also revolved around what constitutes the complete overuse and abuse of a DI or IoC tool - which I can easily speak about due to extensive personal, self-inflicted, love-affair-of-IoC induced pain over the last year. However, the one thing I could not speak to was the correct use of a DI / IoC tool, because I believe that I have never used one correctly - or at least, my experience in using one correctly is so limited that I can't seem to separate it from the incorrect use. I’ve heard other developers (Jeremy Miller, Jimmy Bogard, and others) say that they want to see as little of a dependency container / injection tool in their code as possible. This gets to the heart of what I was trying to convey earlier, but I’ve never had a good understanding of where you would actually allow a dependency tool to be used under those guidelines. On the way home from work tonight, I had (what I think is) a small epiphany around the idea that the dependency container should be limited to two key areas: the various implementation specifics (UI, database, etc) and the application layer. I think the use in the implementation specifics, for whatever use they're needed, is valid since these implementations are not unit tested to begin with. But I would highly recommend limiting the container’s use in these scenarios, for the same overused, abused reasons that I'm so intimate with already. 
More importantly, the application layer seems to be the appropriate location to resolve dependencies: we use the DI tool to instantiate the object that needs the dependencies, letting the tool resolve them for us automatically, rather than requesting each dependency directly. The best example I can think of, off-hand, is a workflow coordination service in the application layer. Let’s say your workflow moves from FormA to FormB in a Windows app. The workflow class would use the DI tool to instantiate the ProcessAPresenter, which would resolve the registered IProcessAView as a constructor dependency. Then when this form is done, the workflow coordination class would use the DI tool again to instantiate the ProcessBPresenter and resolve the IProcessBView (and ISecurityService and whatever else) constructor dependencies. The key here is that we are allowing the application layer to use the DI tool, and not the other way around – not letting the DI tool instantiate the application layer - and not using the tool as a simple IoC container to resolve dependencies internally from the object that needs the dependency. These are primitive, unverified thoughts at this point, and need to be taken as such. I think this is a good start for a correct use of a DI tool, though. Additional implementation experience within this model would help to expose additional constraints and allowed uses, I would imagine. How are you using your DI / IoC tools? What are your thoughts on the subject? Your pains, your joys, your sorrows? And from a purely selfish perspective - am I on the right track, here?
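In keeping with the "primitive, unverified thoughts" warning, here is an equally primitive Python sketch of the workflow idea. The `Container` class is a toy I made up for illustration and does not represent the API of any real DI tool; `WorkflowCoordinator`, `FormB`'s security check, and the `AllowEveryone` service are likewise assumptions. Only `ProcessAPresenter`, `ProcessBPresenter`, `IProcessAView`, `IProcessBView`, and `ISecurityService` come from the example above.

```python
from typing import Any, Callable, Dict, List, get_type_hints


class Container:
    """Toy container: maps an abstraction to a factory for it."""

    def __init__(self) -> None:
        self._registrations: Dict[type, Callable[[], Any]] = {}

    def register(self, abstraction: type, factory: Callable[[], Any]) -> None:
        self._registrations[abstraction] = factory

    def create(self, cls: type) -> Any:
        # Instantiate cls, filling each annotated constructor
        # parameter from the registered factory for its type.
        hints = get_type_hints(cls.__init__)
        hints.pop("return", None)
        return cls(**{name: self._registrations[t]() for name, t in hints.items()})


class IProcessAView:
    def show(self) -> str:
        raise NotImplementedError


class IProcessBView:
    def show(self) -> str:
        raise NotImplementedError


class ISecurityService:
    def can_open(self, process: str) -> bool:
        raise NotImplementedError


class FormA(IProcessAView):
    def show(self) -> str:
        return "FormA"


class FormB(IProcessBView):
    def show(self) -> str:
        return "FormB"


class AllowEveryone(ISecurityService):
    def can_open(self, process: str) -> bool:
        return True


class ProcessAPresenter:
    # Plain constructor injection: no reference to the container here.
    def __init__(self, view: IProcessAView) -> None:
        self._view = view

    def run(self) -> str:
        return self._view.show()


class ProcessBPresenter:
    def __init__(self, view: IProcessBView, security: ISecurityService) -> None:
        self._view = view
        self._security = security

    def run(self) -> str:
        if not self._security.can_open("ProcessB"):
            return "denied"
        return self._view.show()


class WorkflowCoordinator:
    """Application layer: the only class that touches the container."""

    def __init__(self, container: Container) -> None:
        self._container = container

    def run(self) -> List[str]:
        shown = [self._container.create(ProcessAPresenter).run()]
        shown.append(self._container.create(ProcessBPresenter).run())
        return shown


container = Container()
container.register(IProcessAView, FormA)
container.register(IProcessBView, FormB)
container.register(ISecurityService, AllowEveryone)
shown = WorkflowCoordinator(container).run()
```

The presenters could be built by hand in a unit test with fake views, because they never ask the container for anything. The coordinator asks the container for the presenter, not for the presenter's dependencies, which is the distinction I was trying to draw.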
There's been a lot of recent talk about what is "done" in the Lean / Agile development communities, and the primary focus of these discussions has been on individual features or stories and the need to get them completely "done" (dev, test, acceptance test, documentation, delivered to customer) before they can be considered done. I wholeheartedly agree with this philosophy, to the point where I actively introduce painful elements of software development to my team members, because the work must be done before we can deliver to the customer. All that being said - I think we're missing out on some potential benefit by not applying "done" at other levels of our software development projects. For example, the concept of "done" can be applied to an iteration. What constitutes an iteration being "done"? It's certainly not just the time-box of 2 weeks, 1 month, or whatever our iteration length is defined as. However, that time box is still important. We don't want arbitrary lengths of iterations. So how do we know when an iteration is completely "done" versus just being over? My initial thoughts revolve around a pass/fail checklist, similar to the swim lanes (or kanban board, if you want to call it that) that our stories go through. At this point, I would likely include the following:
- Iteration length passed
- All stories in iteration "done"
- Code reviewed (possibly part of story being "done")
- Software is in a stable, working state
- Software acceptance tested by customer (or customer representative / SME)
- Retrospective held
There are probably some additional items to include here - this is just my initial idea list. Like the user story "done" criteria, an iteration cannot be considered "done" until all of the items in this list have been checked off. We don't give partial credit for a story making it to "in test", when there are 2 more columns to move through - the story is not done until it's completely "done". The same should apply to the iteration. Why bother making an official "done" list for iterations? For the same simple reason we do this for user stories - transparency and visibility into bottlenecks and problems. It's easy for a team to ignore problems like broken software at the end of an iteration - "oh, it's just XYZ... I'll take care of that tomorrow" - when we don't have the same accountability as we do in our stories. By making the iteration a pass/fail set, we expose those problems for the world to see. This exposure is a great motivational tool - who wants to be caught with an un-"done" iteration because of a "simple" bug? Do you and your team have an iteration "done" list? Do you even have a user story "done" list? It's all about exposing weakness and waste, then eliminating it, and creating the official iteration "done" list is just one more step along the path. Don't stop there. What other processes - higher or lower level - can we standardize and improve?
The one and only goal in software development is to provide useful, functional software that creates value for the customer or consumer. Period. End of story. Not open for discussion or opinions. If you want to deliver software that the customer or consumer wants to use, likes to use, and ultimately does use to help solve problems and/or automate business process, etc, then you must know what the customer or consumer values. In other words - you can't identify your software's 'value' if you don't know who/what ALL of your customers are. I'd put money down to say that most software developers - perhaps most software development companies - have no clue how many customers and consumers there really are for any given software project. The one consistent answer that I would expect from every developer or company is that the person, group, or company paying money for the software is the customer. This is 100%, absolutely true. In addition to the paying customer, though, there are many other customers or consumers that need to be identified. As a starter, consider the following list for the customers / consumers of your software:
- The paying customer
- Integrated external systems
- API consuming systems
- Software Testers and Test Lab personnel
- Technical and documentation writers
- Software developers (the ones writing the code!)
I know, I know... how could a software developer possibly be considered a customer or consumer of the software that they are writing? The answer should be obvious - who reads and writes the code? Who maintains the code and needs to understand how the code is structured so it can be maintained? If you don't believe that your software developers are first class consumers of the system that they are writing, I would bet that you have a horribly complex kludge of code that no one wants to work on. If your developers are considered first class consumers of the code they are writing, I would bet that your team is happy and is constantly working toward better code - simple, readable, maintainable systems that are fun to work on. So, who are your customers or consumers? What value do they need in your system and from your system? And, how are you and your team responding to those value needs (if at all)?
Derick’s (brand new, just thought of it, but is now elevated to ‘mantra’ status for me) golden rule of Acceptance Criteria: If it’s not usable by every team member, it’s not Acceptance Criteria. And I do mean every team member - Customers, Testers, Tech Writers, BAs, Developers, UX Designers, and anyone else on your team. You can specify the technical or UI details in the story’s detail, but it’s not acceptance criteria – it’s Technical Criteria, or UI Criteria, or Test Automation Criteria, or … etc. Generally speaking, don't include these alternate criteria in the story detail – let the specific team members determine their specific criteria and record it how they need to (through unit tests for devs, interaction design mockups for UI peeps, etc).
The entire team – BAs, Devs, UX, Testers, Tech Writers, Customers and Management (especially customers and management) – needs to understand that the first few iterations of a team that is trying to convert to an agile methodology will be slow and painful. You won’t get as much done as you think you should. You’ll run into constant problems, questions, points of clarification, and unknowns. Iterative development processes expose every single problem that your team never knew it had, and more – and if you’re honest and working with integrity, you won’t whitewash the problems and try to hide them; rather, you’ll embrace the transparency that iterations create and use it to drive improvement in your team. Converting to Agile is painful and it’s embarrassing at times. There's no question that it’s difficult to convert – especially if the team has any significant habits or experience from previous projects (and who doesn’t, aside from entry level people). Work through the pain, solve the problems one at a time and continue to improve the process. Eventually, the team will find its groove and productivity will increase. Over time, if the team truly is honest with itself and is continuously solving problems, productivity should skyrocket and you’ll wonder how you ever worked without iterative / agile methodologies. The key is - don’t expect it to be perfect… ever… take one step at a time and never stop moving forward.