Part of the last project I did was to investigate ways that a legacy codebase could be rejuvenated. The legacy code had become bogged down in intermingled dependencies, bloated runtime, and a general fear of “if I change something here, what unexpected bugs will appear somewhere else?”.
The project was done in the context of Test Driven Development (TDD) and 100% unit test coverage of Java code.
A tongue-in-cheek summary of 10 Reasons to Avoid Test Driven Development
TDD Defined
TDD is also called “Test First Development”: you write the test first, demonstrate that it fails, and then write the core code until the test passes. The tests become part of the automated build to ensure that the code continues to perform to expectations.
If you write the tests after the core code, you’ve probably already done some interactive testing, and now the testing feels like busy work, a common critique of unit testing in general.
TDD is especially relevant for bug fixing. If you can write a test that fails due to the bug, and then fix the code, you’ve ensured that the same bug can’t come back without a test failing.
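As a minimal sketch of that bug-fix workflow, here is a hypothetical `Pager.pageCount` method (the class, the method, and the bug are invented for illustration, not taken from the project). The checks are plain Java so the file runs without any library; in a JUnit 4 project each check method would be a `@Test` method.

```java
// The test class comes first in the file, matching the order it was written in.
class PagerRegressionTest {
    public static void main(String[] args) {
        exactMultipleHasNoExtraPage();
        partialLastPageStillCounts();
        System.out.println("all regression checks passed");
    }

    // Written while the bug was still present, so it failed first (got 3, wanted 2).
    static void exactMultipleHasNoExtraPage() {
        check(Pager.pageCount(20, 10) == 2, "20 items at 10 per page is 2 pages");
    }

    // Pins down behavior that already worked, so the fix cannot break it.
    static void partialLastPageStillCounts() {
        check(Pager.pageCount(21, 10) == 3, "21 items at 10 per page is 3 pages");
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}

// Hypothetical code under test.
class Pager {
    // The buggy version was: return totalItems / pageSize + 1;
    // which reported an extra, empty page whenever totalItems was an exact multiple.
    static int pageCount(int totalItems, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (totalItems + pageSize - 1) / pageSize; // ceiling division
    }
}
```

The first check stays in the suite for good, which is what turns a one-off fix into a lasting regression guard.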
In practice, there’s some overlap in the sequencing. In Java, your tests won’t even compile until the corresponding methods are defined in the main code. Also, when doing exploratory work, you want to concentrate on getting things to work, not on how to drive a test. This is all fine, but the theme holds: for well-understood code, write your expected results first.
100% coverage defined
100% coverage means that every line of code and every route through a method (conditionals, loops, exceptions) is executed during the course of the test run.
It does not mean that every method of every class has a separate test method. The notion of writing separate methods for every setter and getter of a plain bean is obviously wasteful and clutters the test code with low-value tests. In practice, those methods are going to get called in the context of the logic tests.
However, every branch is tested. This means a conditional like “if ( (A and B) or (B and C) )” will take 4 tests. Exception catching must also be covered. If you’re going to bother catching an exception, you need to test that you’re executing that behavior properly.
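To make the count of four concrete, here is a sketch of that conditional with one check per route. The `Rule` class is hypothetical, and the count assumes Java’s short-circuit `&&` and `||` plus a coverage tool that tracks the true and false outcome of each operand.

```java
class BranchCoverageTest {
    public static void main(String[] args) {
        // Each call drives a different route through the short-circuit evaluation.
        check(Rule.matches(true, true, false));   // A and B true: first clause wins, second is skipped
        check(!Rule.matches(true, false, true));  // B false: both clauses fail at B
        check(Rule.matches(false, true, true));   // A false, then B and C true: second clause wins
        check(!Rule.matches(false, true, false)); // A false, B true, C false: second clause fails at C
        System.out.println("all four routes exercised");
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("unexpected result");
        }
    }
}

// Hypothetical code under test.
class Rule {
    static boolean matches(boolean a, boolean b, boolean c) {
        if ((a && b) || (b && c)) {
            return true;
        }
        return false;
    }
}
```

Three calls are not enough: the first B needs both outcomes with A true, and C needs both outcomes with A false and B true.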
What about something less than 100%, to avoid playing coverage games?
- To riff on John Wanamaker, “Half the time I spend writing unit tests is wasted; the trouble is I don’t know which half.”
- That means if you set a 90% goal, you might be missing 20% of the valuable use cases.
- A flat 100% is an easy gate to enforce on check-in. If you allow obvious code to go untested, then you need to review the uncovered code every time to make sure your metrics are covering the good stuff.
Why it’s worth it
As an investment in the future
A strong set of unit tests can form a safety net for ongoing changes to a codebase. When everything is executed by some form of test, you can be more fluid in code changes, knowing the code still produces the same result when the tests pass.
Unit tests also function as requirements documentation. If tests and comments are written in a way that implies “this is what the system is supposed to do in case X,” then they will remind folks of those requirements if the test ever fails. This is especially important over the long run when we can’t count on the memory or presence of long-tenured developers.
Increasing quality
TDD fundamentally changes how you write code: not just the sequence of putting characters in files, but how the code is structured. Many best practices around code structure, such as short methods and low cyclomatic complexity, are actually victims of cargo cult programming. Those things are results of good TDD, not end goals. For example, if you’re forced to cover all the code, you’re going to write small methods just as a way to avoid writing massive fixtures.
TDD also results in better separation of concerns. A common complaint about TDD is the time wasted setting up the preconditions to the test (the “fixture”), but this is actually an indicator that your method is doing too much. Out of laziness to avoid writing all those fixtures, developers will figure out ways to abstract out the dependencies and focus on the target code. Tighter code results.
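A small sketch of what abstracting out a dependency can look like. Everything here is hypothetical (`InvoiceCalculator`, `TaxRateSource`, the tax rule); the point is only that once the lookup sits behind an interface, the fixture shrinks to a one-line stub.

```java
// Test first in the file, matching the order it would be written in.
class InvoiceCalculatorTest {
    public static void main(String[] args) {
        // The whole fixture is one line: a stub rate source instead of a database.
        TaxRateSource flatTenPercent = region -> 1000; // 1000 basis points = 10%
        InvoiceCalculator calculator = new InvoiceCalculator(flatTenPercent);

        long total = calculator.totalCents(2000, "any-region");
        if (total != 2200) {
            throw new AssertionError("expected 2200 but was " + total);
        }
        System.out.println("total in cents: " + total);
    }
}

// Hypothetical seam: production code would implement this with a real lookup.
interface TaxRateSource {
    int basisPointsFor(String region);
}

// Hypothetical code under test. It receives its dependency through the constructor,
// so a test can hand it a stub without any container or configuration.
class InvoiceCalculator {
    private final TaxRateSource rates;

    InvoiceCalculator(TaxRateSource rates) {
        this.rates = rates;
    }

    long totalCents(long netCents, String region) {
        long tax = netCents * rates.basisPointsFor(region) / 10_000;
        return netCents + tax;
    }
}
```

Had the calculator opened its own database connection, the same test would have needed a schema, seed data, and teardown before reaching a single line of tax logic.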
TDD forces deeper thought into edge conditions, which are the bane of our runtimes. Rather than writing a function and thinking “this is the happy path, I’ll come back someday and flesh that out,” you actually write a test that exercises those rare occurrences, and codify the expected response.
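As an illustration of codifying those rare occurrences, here is a sketch around a hypothetical `Initials.of` helper. The decision to return an empty string for null or blank input is an assumption made for the example; the value of the test is that the decision gets written down at all.

```java
class InitialsEdgeCaseTest {
    public static void main(String[] args) {
        check("JS".equals(Initials.of("john smith")), "happy path");
        check("".equals(Initials.of(null)), "null name yields empty initials");
        check("".equals(Initials.of("   ")), "blank name yields empty initials");
        check("JS".equals(Initials.of("  john   smith ")), "extra whitespace is ignored");
        check("M".equals(Initials.of("madonna")), "single-word name");
        System.out.println("edge cases codified");
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}

// Hypothetical code under test.
class Initials {
    static String of(String fullName) {
        if (fullName == null) {
            return "";
        }
        StringBuilder initials = new StringBuilder();
        for (String word : fullName.trim().split("\\s+")) {
            // A blank input splits into a single empty string, so skip empty words.
            if (!word.isEmpty()) {
                initials.append(Character.toUpperCase(word.charAt(0)));
            }
        }
        return initials.toString();
    }
}
```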
Going to 100% coverage finds obscure bugs proactively. When you’re trying to nail that final percentage point of coverage, you sometimes get hung up and think “this should work, argh, why not?” Then, when you finally figure out why, it’ll be an epiphany: “That was a potentially nasty bug, and how would we ever have tracked that down in the complexity of a production runtime?”
As a productivity gain
No, really, even with all the silly tests to ensure 100% coverage and all the mocking and expectations it takes to write a valid fixture, you still get a net gain in productivity.
- Executing a unit test class takes under 3 seconds, so you get instant feedback on whether it’s working.
- Contrast that with the time it takes to spin up the full runtime (database, webapp container, etc). Many projects take 30-90s to spin up the app container. Imagine having to do that multiple times while chasing simple logic missteps.
- Stay in the zone. Those server startups are just long enough for the developer to switch to another activity to fill the time, and then get distracted by that extra email, Twitter post, etc. Rather than resuming coding as soon as the app is running, they now need extra time to wrap up the distraction.
- The sum of well-tested smaller bits of code means the bigger code comes together quicker. Even if your integration tests are manual, you’ll get them to pass in fewer tries when you’re putting together quality pieces, and not circling back to fix a forgotten null pointer exception.
Every QA issue avoided is a massive savings. If your code change has unexpected side effects elsewhere, the full test suite will point them out, and you can fix the other items in the same mental context. The alternative is to spin up the entire environment 24 hours later just to fix a few lines.
When it comes to bug fixing, once the cause of the bug is identified, it’s so much easier to zoom in on the fix with a unit test than it is to interactively test against the full application.
Implementation notes
A few miscellaneous bits worth noting from the project.
- This was all done in Java with JUnit 4.
- Only the Java code was covered, not any UI or JavaScript.
- Lines of code were almost 1:1 for core vs test.
- Coverage
  - Measured using Emma for Eclipse. This should also be built into any continuous integration setup.
  - Make sure to measure both lines and branches in the coverage UI.
  - Only the main Java code gets to 100%. The unit tests themselves won’t be marked as fully covered due to the way exceptions work.
- Need well-documented conventions for test naming, documentation, and file layout. Otherwise it’s too hard to find the tests that cover a given chunk of logic.
- Refactoring tests will become an equal part of the maintenance. Don’t just keep piling on more tests. They need cleanup just the same to make sure they’re still relevant. This is especially important when renaming classes or methods: make sure to fix the corresponding test names.
- Used H2 as a runtime database when testing DAOs and persisted models. Do NOT take on the dependency of a MySQL or Oracle runtime.
- Heavy use of mocking with EasyMock. The syntax can get verbose, and it can be difficult to get the gist of a test.
- Avoided using Spring for injecting dependencies in the tests. Individual tests run faster, and it avoids the XML hell of managing lists of beans.
- Ruby kicks Java when it comes to TDD
  - Dynamic runtimes mean you write more tests first
  - Far more succinct fixture setup
  - RSpec is such a clean way to declare expectations and get long term documentation from the result
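On the coverage note about unit tests never being marked as fully covered: one plausible reading (my interpretation, not something the notes above spell out) is the common try/fail/catch idiom for expected exceptions. The sketch below uses a hypothetical `Quantity.parse` method and plain Java instead of JUnit so that it runs on its own; the explicit `throw new AssertionError` stands in for JUnit’s `fail()`.

```java
class ExpectedExceptionTest {
    public static void main(String[] args) {
        rejectsNegativeQuantity();
        System.out.println("expected exception was thrown");
    }

    static void rejectsNegativeQuantity() {
        try {
            Quantity.parse("-1");
            // Reached only when the code under test is broken, so a passing run
            // never executes this line and the test class stays below 100%.
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            // The passing route ends here.
        }
    }
}

// Hypothetical code under test.
class Quantity {
    static int parse(String text) {
        int value = Integer.parseInt(text);
        if (value < 0) {
            throw new IllegalArgumentException("quantity must not be negative: " + value);
        }
        return value;
    }
}
```

The main code can still reach 100% here, because both the throwing branch and the normal return of `parse` are reachable from passing tests.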