
- •Practical Unit Testing with JUnit and Mockito
- •Table of Contents
- •About the Author
- •Acknowledgments
- •Preface
- •Preface - JUnit
- •Part I. Developers' Tests
- •Chapter 1. On Tests and Tools
- •1.1. An Object-Oriented System
- •1.2. Types of Developers' Tests
- •1.2.1. Unit Tests
- •1.2.2. Integration Tests
- •1.2.3. End-to-End Tests
- •1.2.4. Examples
- •1.2.5. Conclusions
- •1.3. Verification and Design
- •1.5. Tools Introduction
- •Chapter 2. Unit Tests
- •2.1. What is a Unit Test?
- •2.2. Interactions in Unit Tests
- •2.2.1. State vs. Interaction Testing
- •2.2.2. Why Worry about Indirect Interactions?
- •Part II. Writing Unit Tests
- •3.2. Class To Test
- •3.3. Your First JUnit Test
- •3.3.1. Test Results
- •3.4. JUnit Assertions
- •3.5. Failing Test
- •3.6. Parameterized Tests
- •3.6.1. The Problem
- •3.6.2. The Solution
- •3.6.3. Conclusions
- •3.7. Checking Expected Exceptions
- •3.8. Test Fixture Setting
- •3.8.1. Test Fixture Examples
- •3.8.2. Test Fixture in Every Test Method
- •3.8.3. JUnit Execution Model
- •3.8.4. Annotations for Test Fixture Creation
- •3.9. Phases of a Unit Test
- •3.10. Conclusions
- •3.11. Exercises
- •3.11.1. JUnit Run
- •3.11.2. String Reverse
- •3.11.3. HashMap
- •3.11.4. Fahrenheits to Celsius with Parameterized Tests
- •3.11.5. Master Your IDE
- •Templates
- •Quick Navigation
- •Chapter 4. Test Driven Development
- •4.1. When to Write Tests?
- •4.1.1. Test Last (AKA Code First) Development
- •4.1.2. Test First Development
- •4.1.3. Always after a Bug is Found
- •4.2. TDD Rhythm
- •4.2.1. RED - Write a Test that Fails
- •How To Choose the Next Test To Write
- •Readable Assertion Message
- •4.2.2. GREEN - Write the Simplest Thing that Works
- •4.2.3. REFACTOR - Improve the Code
- •Refactoring the Tests
- •Adding Javadocs
- •4.2.4. Here We Go Again
- •4.3. Benefits
- •4.4. TDD is Not Only about Unit Tests
- •4.5. Test First Example
- •4.5.1. The Problem
- •4.5.2. RED - Write a Failing Test
- •4.5.3. GREEN - Fix the Code
- •4.5.4. REFACTOR - Even If Only a Little Bit
- •4.5.5. First Cycle Finished
- •‘The Simplest Thing that Works’ Revisited
- •4.5.6. More Test Cases
- •But is It Comparable?
- •Comparison Tests
- •4.6. Conclusions and Comments
- •4.7. How to Start Coding TDD
- •4.8. When not To Use Test-First?
- •4.9. Should I Follow It Blindly?
- •4.9.1. Write Good Assertion Messages from the Beginning
- •4.9.2. If the Test Passes "By Default"
- •4.10. Exercises
- •4.10.1. Password Validator
- •4.10.2. Regex
- •4.10.3. Booking System
- •Chapter 5. Mocks, Stubs, Test Spies
- •5.1. Introducing Mockito
- •5.1.1. Creating Test Doubles
- •5.1.2. Expectations
- •5.1.3. Verification
- •5.1.4. Conclusions
- •5.2. Types of Test Double
- •5.2.1. Code To Be Tested with Test Doubles
- •5.2.2. The Dummy Object
- •5.2.3. Test Stub
- •5.2.4. Test Spy
- •5.2.5. Mock
- •5.3. Putting it All Together
- •5.4. Example: TDD with Test Doubles
- •5.4.2. The Second Test: Send a Message to Multiple Subscribers
- •Refactoring
- •5.4.3. The Third Test: Send Messages to Subscribers Only
- •5.4.4. The Fourth Test: Subscribe More Than Once
- •Mockito: How Many Times?
- •5.4.5. The Fifth Test: Remove a Subscriber
- •5.4.6. TDD and Test Doubles - Conclusions
- •More Test Code than Production Code
- •The Interface is What Really Matters
- •Interactions Can Be Tested
- •Some Test Doubles are More Useful than Others
- •5.5. Always Use Test Doubles… or Maybe Not?
- •5.5.1. No Test Doubles
- •5.5.2. Using Test Doubles
- •No Winner So Far
- •5.5.3. A More Complicated Example
- •5.5.4. Use Test Doubles or Not? - Conclusion
- •5.6. Conclusions (with a Warning)
- •5.7. Exercises
- •5.7.1. User Service Tested
- •5.7.2. Race Results Enhanced
- •5.7.3. Booking System Revisited
- •5.7.4. Read, Read, Read!
- •Part III. Hints and Discussions
- •Chapter 6. Things You Should Know
- •6.1. What Values To Check?
- •6.1.1. Expected Values
- •6.1.2. Boundary Values
- •6.1.3. Strange Values
- •6.1.4. Should You Always Care?
- •6.1.5. Not Only Input Parameters
- •6.2. How to Fail a Test?
- •6.3. How to Ignore a Test?
- •6.4. More about Expected Exceptions
- •6.4.1. The Expected Exception Message
- •6.4.2. Catch-Exception Library
- •6.4.3. Testing Exceptions And Interactions
- •6.4.4. Conclusions
- •6.5. Stubbing Void Methods
- •6.6. Matchers
- •6.6.1. JUnit Support for Matcher Libraries
- •6.6.2. Comparing Matcher with "Standard" Assertions
- •6.6.3. Custom Matchers
- •6.6.4. Advantages of Matchers
- •6.7. Mockito Matchers
- •6.7.1. Hamcrest Matchers Integration
- •6.7.2. Matchers Warning
- •6.8. Rules
- •6.8.1. Using Rules
- •6.8.2. Writing Custom Rules
- •6.9. Unit Testing Asynchronous Code
- •6.9.1. Waiting for the Asynchronous Task to Finish
- •6.9.2. Making Asynchronous Synchronous
- •6.9.3. Conclusions
- •6.10. Testing Thread Safe Code
- •6.10.1. ID Generator: Requirements
- •6.10.2. ID Generator: First Implementation
- •6.10.3. ID Generator: Second Implementation
- •6.10.4. Conclusions
- •6.11. Time is not on Your Side
- •6.11.1. Test Every Date (Within Reason)
- •6.11.2. Conclusions
- •6.12. Testing Collections
- •6.12.1. The TDD Approach - Step by Step
- •6.12.2. Using External Assertions
- •Unitils
- •Testing Collections Using Matchers
- •6.12.3. Custom Solution
- •6.12.4. Conclusions
- •6.13. Reading Test Data From Files
- •6.13.1. CSV Files
- •6.13.2. Excel Files
- •6.14. Conclusions
- •6.15. Exercises
- •6.15.1. Design Test Cases: State Testing
- •6.15.2. Design Test Cases: Interactions Testing
- •6.15.3. Test Collections
- •6.15.4. Time Testing
- •6.15.5. Redesign of the TimeProvider class
- •6.15.6. Write a Custom Matcher
- •6.15.7. Preserve System Properties During Tests
- •6.15.8. Enhance the RetryTestRule
- •6.15.9. Make an ID Generator Bulletproof
- •Chapter 7. Points of Controversy
- •7.1. Access Modifiers
- •7.2. Random Values in Tests
- •7.2.1. Random Object Properties
- •7.2.2. Generating Multiple Test Cases
- •7.2.3. Conclusions
- •7.3. Is Set-up the Right Thing for You?
- •7.4. How Many Assertions per Test Method?
- •7.4.1. Code Example
- •7.4.2. Pros and Cons
- •7.4.3. Conclusions
- •7.5. Private Methods Testing
- •7.5.1. Verification vs. Design - Revisited
- •7.5.2. Options We Have
- •7.5.3. Private Methods Testing - Techniques
- •Reflection
- •Access Modifiers
- •7.5.4. Conclusions
- •7.6. New Operator
- •7.6.1. PowerMock to the Rescue
- •7.6.2. Redesign and Inject
- •7.6.3. Refactor and Subclass
- •7.6.4. Partial Mocking
- •7.6.5. Conclusions
- •7.7. Capturing Arguments to Collaborators
- •7.8. Conclusions
- •7.9. Exercises
- •7.9.1. Testing Legacy Code
- •Part IV. Listen and Organize
- •Chapter 8. Getting Feedback
- •8.1. IDE Feedback
- •8.1.1. Eclipse Test Reports
- •8.1.2. IntelliJ IDEA Test Reports
- •8.1.3. Conclusion
- •8.2. JUnit Default Reports
- •8.3. Writing Custom Listeners
- •8.4. Readable Assertion Messages
- •8.4.1. Add a Custom Assertion Message
- •8.4.2. Implement the toString() Method
- •8.4.3. Use the Right Assertion Method
- •8.5. Logging in Tests
- •8.6. Debugging Tests
- •8.7. Notifying The Team
- •8.8. Conclusions
- •8.9. Exercises
- •8.9.1. Study Test Output
- •8.9.2. Enhance the Custom Rule
- •8.9.3. Custom Test Listener
- •8.9.4. Debugging Session
- •Chapter 9. Organization Of Tests
- •9.1. Package for Test Classes
- •9.2. Name Your Tests Consistently
- •9.2.1. Test Class Names
- •Splitting Up Long Test Classes
- •Test Class Per Feature
- •9.2.2. Test Method Names
- •9.2.3. Naming of Test-Double Variables
- •9.3. Comments in Tests
- •9.4. BDD: ‘Given’, ‘When’, ‘Then’
- •9.4.1. Testing BDD-Style
- •9.4.2. Mockito BDD-Style
- •9.5. Reducing Boilerplate Code
- •9.5.1. One-Liner Stubs
- •9.5.2. Mockito Annotations
- •9.6. Creating Complex Objects
- •9.6.1. Mummy Knows Best
- •9.6.2. Test Data Builder
- •9.6.3. Conclusions
- •9.7. Conclusions
- •9.8. Exercises
- •9.8.1. Test Fixture Setting
- •9.8.2. Test Data Builder
- •Part V. Make Them Better
- •Chapter 10. Maintainable Tests
- •10.1. Test Behaviour, not Methods
- •10.2. Complexity Leads to Bugs
- •10.3. Follow the Rules or Suffer
- •10.3.1. Real Life is Object-Oriented
- •10.3.2. The Non-Object-Oriented Approach
- •Do We Need Mocks?
- •10.3.3. The Object-Oriented Approach
- •10.3.4. How To Deal with Procedural Code?
- •10.3.5. Conclusions
- •10.4. Rewriting Tests when the Code Changes
- •10.4.1. Avoid Overspecified Tests
- •10.4.2. Are You Really Coding Test-First?
- •10.4.3. Conclusions
- •10.5. Things Too Simple To Break
- •10.6. Conclusions
- •10.7. Exercises
- •10.7.1. A Car is a Sports Car if …
- •10.7.2. Stack Test
- •Chapter 11. Test Quality
- •11.1. An Overview
- •11.2. Static Analysis Tools
- •11.3. Code Coverage
- •11.3.1. Line and Branch Coverage
- •11.3.2. Code Coverage Reports
- •11.3.3. The Devil is in the Details
- •11.3.4. How Much Code Coverage is Good Enough?
- •11.3.5. Conclusion
- •11.4. Mutation Testing
- •11.4.1. How does it Work?
- •11.4.2. Working with PIT
- •11.4.3. Conclusions
- •11.5. Code Reviews
- •11.5.1. A Three-Minute Test Code Review
- •Size Heuristics
- •But do They Run?
- •Check Code Coverage
- •Conclusions
- •11.5.2. Things to Look For
- •Easy to Understand
- •Documented
- •Are All the Important Scenarios Verified?
- •Run Them
- •Date Testing
- •11.5.3. Conclusions
- •11.6. Refactor Your Tests
- •11.6.1. Use Meaningful Names - Everywhere
- •11.6.2. Make It Understandable at a Glance
- •11.6.3. Make Irrelevant Data Clearly Visible
- •11.6.4. Do not Test Many Things at Once
- •11.6.5. Change Order of Methods
- •11.7. Conclusions
- •11.8. Exercises
- •11.8.1. Clean this Mess
- •Appendix A. Automated Tests
- •A.1. Wasting Your Time by not Writing Tests
- •A.1.1. And what about Human Testers?
- •A.1.2. One More Benefit: A Documentation that is Always Up-To-Date
- •A.2. When and Where Should Tests Run?
- •Appendix B. Running Unit Tests
- •B.1. Running Tests with Eclipse
- •B.1.1. Debugging Tests with Eclipse
- •B.2. Running Tests with IntelliJ IDEA
- •B.2.1. Debugging Tests with IntelliJ IDEA
- •B.3. Running Tests with Gradle
- •B.3.1. Using JUnit Listeners with Gradle
- •B.3.2. Adding JARs to Gradle’s Tests Classpath
- •B.4. Running Tests with Maven
- •B.4.1. Using JUnit Listeners and Reporters with Maven
- •B.4.2. Adding JARs to Maven’s Tests Classpath
- •Appendix C. Test Spy vs. Mock
- •C.1. Different Flow - and Who Asserts?
- •C.2. Stop with the First Error
- •C.3. Stubbing
- •C.4. Forgiveness
- •C.5. Different Threads or Containers
- •C.6. Conclusions
- •Appendix D. Where Should I Go Now?
- •Bibliography
- •Glossary
- •Index
- •Thank You!

Chapter 11. Test Quality
Check Code Coverage
As we noted in Section 11.3, code coverage can inform us about areas of code that are being undertested. Three minutes should be enough time to run the build script and have a look at the code coverage report. What you are looking for, in such a short period of time, are white areas of untested code. There is no time to ponder over each line, but it is definitely possible to see that some package has 20% code coverage, while the rest is close to 80%[20].
And what if there is no build script which would allow you to generate a code coverage report? Well, then you have one more issue to report to your colleagues.
Conclusions
Three minutes will not allow you to perform a real test code review, but it is enough to uncover some major issues with the test code. If this is all you can have at the moment, then fair enough - it is still much better than nothing.
11.5.2. Things to Look For
Now let us assume that we are under significantly less time pressure, and so have time to really look into the test code. Here is the list of things we should be looking for.
Basically, you should pay attention to the same code features as when code reviewing production code. Are the methods short and focused? Has the code been written at the right level of abstraction? Are there any global variables and magic numbers? And so on… In the subsections below, I will be trying to focus on test-related checks.
Some of the hints below are written from the point of view of a technical team leader, responsible for ensuring the quality of all code. For example, analyzing trends in code coverage reports is probably not something you will be doing on a daily basis. Use common sense to create your own checklist of the issues to look for when code reviewing test code.
Easy to Understand
A good unit test is easy to understand. But its readability can be spoilt by many small issues, which you should look out for.
A good test method has a content-revealing name, which gives information on the particular scenario implemented within it (see Section 9.2). Similarly, variables used in tests should be easy to understand: for example, can you tell which variable is an SUT and which are collaborators? Also, variables used within test code should inform you precisely about what their role is: are they here only to satisfy the API of the method being tested, or are they crucial to triggering some behaviour of the SUT or its collaborators? (see Section 11.6.3).
Are the test methods short and focused? They should test a particular feature of the SUT (see Section 10.1), and nothing more. Look for anything that goes beyond just the simplest actions (arrange/act/assert).
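The arrange/act/assert structure mentioned above can be shown in a short sketch. The `Money` class and the test scenario below are invented purely for illustration; in a real JUnit test the check would be an `assertEquals` inside an `@Test` method rather than a boolean return:

```java
// A minimal SUT, invented for this example.
class Money {
    private final int amount;
    Money(int amount) { this.amount = amount; }
    Money add(Money other) { return new Money(this.amount + other.amount); }
    int getAmount() { return amount; }
}

public class MoneyTestSketch {
    // A focused test keeps the three phases visually separate and
    // verifies exactly one feature of the SUT.
    static boolean shouldAddTwoAmounts() {
        // arrange
        Money five = new Money(5);
        Money seven = new Money(7);
        // act
        Money sum = five.add(seven);
        // assert (in JUnit: assertEquals(12, sum.getAmount()))
        return sum.getAmount() == 12;
    }

    public static void main(String[] args) {
        System.out.println(shouldAddTwoAmounts());
    }
}
```

Anything more elaborate than these three steps - loops, branching, multiple unrelated actions - is worth a closer look during review.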
[20] Please, do not take these values literally! They are only meant to be illustrative. See Section 11.3 for more information on desired code coverage levels.

Can you find any for loops in the test methods, or instances of reflection being used to set up a test fixture? Both have some legitimate use, but usually cover up deficiencies in production code. Any violation of KISS should attract your attention.
Look for hidden dependencies between tests - they make test code much harder to understand. Depending on the order of execution (which is not guaranteed by JUnit), or relying on data created by another test method, should both arouse your suspicions. Global variables reused between many test methods should also be treated as a code smell.
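The order-dependency smell is easy to spot once you see it in code. In this invented sketch, the second "test" only passes if the first one ran before it, because both share static mutable state:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrates the smell: testRemoveUser only passes if testAddUser ran first.
public class HiddenDependencySketch {
    // Shared mutable state - the root of the hidden order dependency.
    static List<String> users = new ArrayList<>();

    static void testAddUser() {
        users.add("john");
        // in JUnit: assertTrue(users.contains("john"));
    }

    static boolean testRemoveUser() {
        // Silently relies on testAddUser having run before:
        // remove() returns false if "john" was never added.
        return users.remove("john");
    }

    public static void main(String[] args) {
        boolean passesAlone = testRemoveUser();   // fails: list never populated
        testAddUser();
        boolean passesAfter = testRemoveUser();   // passes: depends on the other test
        System.out.println(passesAlone + " " + passesAfter);
    }
}
```

The cure is for each test method to build its own fixture (or rely on a set-up method), so that it passes regardless of which other tests ran and in what order.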
Look for calls to external APIs. Readability issues often lurk there, especially if the called methods take many parameters. If you cannot understand such code at a glance, then there is probably some room for improvement (see Section 11.6.2).
Are test classes inheriting from some parent class? How many levels of inheritance are there? Inheritance kills readability.
A common issue is the mixing up of different styles within the codebase. Some developers value the arrange/act/assert pattern, some are more in favor of the BDD approach, some like to instantiate test doubles within test methods, while others prefer to rely on set-up methods for this, and so on. A good unit testing suite will be consistent in this respect, and while code reviewing you should also take a look at this. However, this is not something to be fixed easily, as it takes time for the team members to converge and agree on a common style, shared by all developers.
Another thing which impacts negatively on readability is making use of custom solutions, instead of relying on what your own testing framework offers. If you find any non-standard approaches to setting up test fixtures (something different from the use of @BeforeXYZ annotations) or running test methods with different parameters (e.g. running tests in for loops instead of using parameterized tests - see Section 3.6), then by all means kick up a fuss about it.
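The "tests in for loops" smell is worth a concrete look. In this invented sketch, one method iterates over all cases itself - so a single failing case stops (or hides) the rest, and the test report shows only one anonymous result instead of one named result per case, which is exactly what a parameterized test (see Section 3.6) would give you:

```java
// Invented SUT for illustration.
class TemperatureConverter {
    static double toCelsius(double fahrenheit) {
        return (fahrenheit - 32) * 5 / 9;
    }
}

public class LoopedTestSketch {
    // The smell: one test method looping over all cases. In JUnit this
    // would be a single @Test whose first failing assertion aborts the
    // rest; a parameterized test reports each case separately.
    static int countFailures() {
        double[][] cases = { {32, 0}, {212, 100}, {-40, -40} };
        int failures = 0;
        for (double[] c : cases) {
            if (Math.abs(TemperatureConverter.toCelsius(c[0]) - c[1]) > 1e-9) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(countFailures());
    }
}
```

With JUnit's `Parameterized` runner the same three cases become three independently reported test executions, so one failure no longer masks the others.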
Similarly, you should take a look at the structure of the test classes. Does each particular part of the test class (e.g. data providers, private utility methods, set-up method, etc.) always appear in the same order? If not, this might be something to fix.
The existence of duplicated code fragments might indicate that the "refactor" phase of the TDD cycle (see Section 4.2) is not being treated seriously. On the other hand, if the duplicated code helps to make the tests more readable, I would leave it alone. This is somewhat different from the situation with production code, where repeated code is almost always a code smell (see Section 11.6.6).
Look at assertions. If a test fails, will you know exactly why? Are the right assertions being used? Are assertion messages clear? (see Section 8.4)
If there is any logic involved (e.g. iteration over the collection returned by the SUT to find out if it contains certain values), then shouldn’t it perhaps be encapsulated within a custom matcher class? (See Section 6.6.)
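What "encapsulating the iteration in a matcher" buys you can be sketched without any library. The class below is a simplified, stdlib-only stand-in (a real Hamcrest matcher would extend `TypeSafeMatcher` and plug into `assertThat`); the point is that the loop logic and the failure description live in one reusable, well-named place instead of inside the test method:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// A stdlib-only sketch of a custom matcher: it hides the iteration logic
// and produces a readable description when the match fails.
public class ContainsAllMatcherSketch {
    private final List<String> expected;

    ContainsAllMatcherSketch(String... expected) {
        this.expected = Arrays.asList(expected);
    }

    boolean matches(Collection<String> actual) {
        return actual.containsAll(expected);
    }

    String describeMismatch(Collection<String> actual) {
        return "expected all of " + expected + " but got " + actual;
    }

    public static void main(String[] args) {
        ContainsAllMatcherSketch matcher =
                new ContainsAllMatcherSketch("alice", "bob");
        System.out.println(matcher.matches(Arrays.asList("alice", "bob", "carol")));
        System.out.println(matcher.describeMismatch(Arrays.asList("alice")));
    }
}
```

In the test itself, a single readable line (`assertThat(subscribers, containsAll("alice", "bob"))` in Hamcrest style) then replaces the hand-written loop.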
Test doubles are a common source of readability problems. Are the right ones being used? Are the expectations concerning executed methods clear? Is it clear what is verified and what is only stubbed? It is also common for maintainability issues to arise here. Look for overspecified tests (see Section 10.4.1). Are matchers used properly (see Section 6.7)?
The creation of objects (see Section 9.6) can also be a weak point of tests. It is common to see a lot of copy&paste code in the test fixture setup parts of test classes, or to find many obscure private methods calling one another to set domain objects in certain states. Definitely something to have a look at.
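The Test Data Builder pattern (see Section 9.6) is the usual remedy for such copy&paste fixture code. This sketch uses an invented `Client` class: the builder supplies sensible defaults, so each test overrides only the properties it actually cares about:

```java
// Invented domain class for illustration.
class Client {
    final String name;
    final String country;
    final boolean premium;
    Client(String name, String country, boolean premium) {
        this.name = name; this.country = country; this.premium = premium;
    }
}

// A Test Data Builder: defaults for everything, overridden only where the
// test cares, so each test states just the data relevant to its scenario.
public class ClientBuilder {
    private String name = "any name";
    private String country = "any country";
    private boolean premium = false;

    ClientBuilder withName(String name) { this.name = name; return this; }
    ClientBuilder withCountry(String country) { this.country = country; return this; }
    ClientBuilder premium() { this.premium = true; return this; }
    Client build() { return new Client(name, country, premium); }

    public static void main(String[] args) {
        Client client = new ClientBuilder().withCountry("PL").premium().build();
        System.out.println(client.country + " " + client.premium);
    }
}
```

A test that only cares about premium clients then reads `new ClientBuilder().premium().build()`, and the irrelevant name and country no longer clutter it.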

Documented
Well-written unit tests usually do not need documentation (see Section 9.3). However, sometimes you come across things which you wish had been documented, but are not. For example, the selection of test cases might not be obvious (e.g. "why is this method validated against Dubai and Sydney timezones?"). Probably there is some business explanation for this, which should be added as a comment (sometimes a link to a bug tracker issue is all that is required). If it is not there, then you cannot determine whether the test cases cover all the important scenarios.
Are All the Important Scenarios Verified?
The most important question to be answered is whether all the important test cases are covered by tests. Often you will find only single executions of tested methods, which is definitely not enough to verify their correctness (see Section 6.1). This is a sign of "happy path" testing, and definitely something to fight against.
Another source of undertested code is concurrency. If your code is intended to be accessed concurrently, then this is exactly how it should be tested (see Section 6.10). If it is not, then even 100% code coverage is nothing to be proud of.
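A concurrent test typically hammers the SUT from many threads and asserts an invariant that a race would break. The sketch below uses an invented thread-safe ID generator (names and numbers are arbitrary): all generated IDs must be unique, a property a single-threaded test would never really exercise.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Invented SUT: a thread-safe ID generator.
class IdGenerator {
    private final AtomicLong next = new AtomicLong();
    long nextId() { return next.incrementAndGet(); }
}

public class ConcurrentIdTestSketch {
    // Hammer the generator from many threads and check that no ID repeats.
    // An unsynchronized generator (e.g. a plain long++) could pass a
    // single-threaded test yet fail this one.
    static boolean allIdsUnique(int threads, int idsPerThread) {
        IdGenerator generator = new IdGenerator();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    seen.add(generator.nextId());
                }
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            return false;
        } finally {
            pool.shutdown();
        }
        // Every ID unique means the set is exactly as large as the
        // number of IDs requested.
        return seen.size() == threads * idsPerThread;
    }

    public static void main(String[] args) {
        System.out.println(allIdsUnique(8, 1000));
    }
}
```

Note that such a test can prove the presence of a race (when it fails) but not its absence; it is still far better than leaving concurrency untested.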
We have already discussed one area of code coverage usefulness for test code reviews (that is, its ability to uncover areas not touched by tests at all). However, more can be learned from studying the coverage report, even if the whole idea of measuring test code quality by looking at code coverage is flawed (see the discussion in Section 11.3).
Study the coverage reports to find answers to the following questions (all of them can uncover potential issues):
•Can you see a pattern of untested code across multiple classes? I often find that there are no tests for exceptions thrown. Usually this happens when real collaborators are used instead of test doubles, which makes it hard to simulate some possible paths of code execution.
•If you happen to have historical data on executed tests (your continuous integration server should provide such reports), then see how the number of tests and code coverage measurements change. Ideally the number of tests should grow, and code coverage should at least fluctuate around the same values. If you see some different behaviour, you need to inspect it further.
Additionally, you could apply mutation testing verification (see Section 11.4) to find weak spots in your tests.
Run Them
Some of the issues relating to running tests during code reviews have already been discussed. However, it is now time to have a closer look at them.
Remember, always execute the tests which are under code review.
Take a look at the build script. See if there are any conditions under which tests are not run (e.g. Maven profiles). Make sure the tests are not skipped.
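For Maven builds, the flags that silently disable tests are worth knowing by heart. These are standard Surefire/Maven options (the profile name below is invented for illustration):

```shell
# Tests compiled but not run - a red flag if baked into the build:
mvn package -DskipTests

# Tests neither compiled nor run - even worse:
mvn package -Dmaven.test.skip=true

# A profile (here a hypothetical "fast" profile) may set skipTests in the
# POM; check the effective configuration rather than the command line alone:
mvn help:effective-pom -Pfast | grep -i skip
```

If any of these appear in the default build path, the tests under review may never have been executed at all.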