State-based and interaction-based unit tests differ greatly in style, although they share the same end goal: verifying a unit of code. The essential difference between the two is which attributes of the unit are examined to decide whether it is correct. It’s important for a software developer to know both styles and to understand the differences between them.
State-based testing
A state-based unit test is written in a style that many software developers will be familiar with. In fact, it wouldn’t be far off to call this “traditional” unit testing. In a state-based test, the first step is to initialize the unit under test. This initialization may include creating test data and graphs of supporting objects necessary to exercise the unit under test. The test then exercises the unit by calling methods on it. When the test has finished exercising the unit, assuming no errors have been raised, it proceeds to verify the expected state of the unit. In JUnit parlance, this verification is usually done using the assert*() methods. No matter what testing harness is used, this verification takes the form of testing state and raising errors if the actual state differs from the expected state.
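As a minimal sketch of the three steps just described (initialize, exercise, verify state), here is a state-based test written in Python with plain assert statements standing in for JUnit's assert*() methods. The Account class and its methods are hypothetical, invented only for this illustration.

```python
class Account:
    """Hypothetical unit under test: a bank account with a running balance."""

    def __init__(self, opening_balance=0):
        self.balance = opening_balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def test_balance_after_deposit_and_withdrawal():
    # 1. Initialize the unit under test.
    account = Account(opening_balance=100)

    # 2. Exercise the unit by calling methods on it.
    account.deposit(50)
    account.withdraw(30)

    # 3. Verify the end state; how the unit got there is not checked.
    assert account.balance == 120


test_balance_after_deposit_and_withdrawal()
```

Note that the test would pass just as well if Account reached a balance of 120 by some entirely different internal route.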
A little more about those supporting objects alluded to in the previous paragraph. One of the challenges of unit testing is to sufficiently decouple the multiple units of code that make up an application so that each unit can be tested individually. Otherwise, you end up with a “unit” test that’s really more like an integration test. Often this decoupling can be hard. A variety of techniques have been developed to help in this decoupling, but the most important technique is to program to interfaces instead of concrete classes. There should be an interface at all of the coupling boundaries of your unit of code. If this single condition is met, writing effective unit tests will become much easier. Often the supporting objects used by a test harness to properly exercise a unit are stubs. A stub is a trivial implementation of an interface that exists for the purpose of collecting state during a unit test.
Stubs are most often written by hand, although they don’t have to be. Usually you will end up creating a graph of stubs and trivial real objects that parallels the graph of real objects (both complex and trivial) present when the application is running. As the unit under test uses the stubs, the stubs collect state that can be verified at the end of the test. We’ll come back to stubs later on when we talk about mock objects. For now, keep in mind that stubs are often created by hand and exist to collect state for verification purposes.
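A hand-written stub might look something like the following sketch. Everything here is an assumption made for illustration: StubMailer, Signup, and the send() method are hypothetical names, and because Python has no interface keyword, the "interface" is simply the agreement that a mailer has a send(to, subject) method.

```python
class StubMailer:
    """Hand-written stub: a trivial mailer that only records what it was asked to send."""

    def __init__(self):
        self.sent = []  # state collected during the test

    def send(self, to, subject):
        self.sent.append((to, subject))


class Signup:
    """Hypothetical unit under test; it depends on a mailer only through send()."""

    def __init__(self, mailer):
        self.mailer = mailer
        self.members = []

    def register(self, email):
        self.members.append(email)
        self.mailer.send(email, "Welcome")


def test_register_sends_welcome_mail():
    mailer = StubMailer()
    signup = Signup(mailer)

    signup.register("ann@example.com")

    # Verification reads state: the unit's own, and the state the stub collected.
    assert signup.members == ["ann@example.com"]
    assert mailer.sent == [("ann@example.com", "Welcome")]


test_register_sends_welcome_mail()
```

The stub has no opinion about how it should be called. It only remembers what happened so that the test can inspect it afterwards.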
How a state-based unit test works shouldn’t be a surprise to anyone who’s ever written a unit test. One of the advantages of this style of unit testing is its simplicity. It doesn’t take long to teach someone how to write tests in this style, and the overall behavior of the test is intuitive. It kind of feels like the way we might test something outside of the realm of software development.
The important thing to note is what criteria the state-based unit test is considering about the unit under test. It is, not surprisingly, all about state. If the unit under test reaches a certain state, it is considered correct. The state-based unit test doesn’t really care how the unit got to that state, but only that it is there at the end. The phrase “the end justifies the means” is apt here. The means (behavior of the unit) doesn’t matter nearly as much as the end (state). In fact, the only thing a state-based unit test verifies about the means is that the behavior of the unit didn’t include raising any errors during the test.
Are state-based unit tests effective? Absolutely. The best evidence is that most unit tests today are written in this style. The end state of a unit is in many cases very significant from a verification standpoint. The end user of an application certainly cares more about state than behavior (here, behavior being that of internal objects and not the external behavior of the application itself). When you’re using an online banking application, the bottom line is that you want your balance to be the correct one.
Interaction-based testing
An interaction-based unit test is different. The easiest way to explain it is to say that an interaction-based test verifies the behavior of a unit instead of verifying the unit’s end state. From the point of view of an interaction-based test, the correctness of a unit is based on how it interacts with its neighbors, and not on the internal state of the unit.
An interaction-based unit test first initializes the unit under test. This is done by creating “fake” stand-ins for all of the unit’s immediate neighbors. A neighbor is any object that the unit under test passes messages to (calls methods on). These stand-ins are called mock objects, and are usually created by a test framework library. When the initialization is complete, the only “real” thing in the graph of objects is the unit under test. Everything that the unit is hooked up to is a mock object - capable of receiving the same messages as the real object.
There is then a second step to the initialization of an interaction-based unit test. After all of the mock objects have been initialized, expectations are set on the mocks. This is exactly what it sounds like - the test code programs the mocks and tells them what messages to expect from the unit under test. This can include many things, such as the order in which to expect messages, what the parameters to method calls should look like, and often how the mock should respond to these messages.
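To make the idea of programming a mock concrete, here is a deliberately tiny, hand-rolled sketch. ExpectingMock is not the API of JMock or any other library; it is a hypothetical class written only for this illustration, and real mock libraries are far richer. Converter, lookup(), and close() are likewise invented names. The sketch shows expectations that cover message order, parameters, and the mock's canned reply.

```python
class ExpectingMock:
    """Minimal hand-rolled mock: holds an ordered queue of expected messages."""

    def __init__(self, name):
        self._name = name
        self._expected = []  # entries are (method, args, return_value)

    def expect(self, method, *args, returns=None):
        self._expected.append((method, args, returns))
        return self

    def __getattr__(self, method):
        # Reached only for names not defined above, i.e. for mocked methods.
        def receive(*args):
            assert self._expected, f"{self._name}: unexpected {method}{args}"
            want_method, want_args, returns = self._expected.pop(0)
            assert (method, args) == (want_method, want_args), (
                f"{self._name}: expected {want_method}{want_args}, got {method}{args}"
            )
            return returns
        return receive

    def verify(self):
        assert not self._expected, f"{self._name}: never received {self._expected}"


class Converter:
    """Hypothetical unit under test; its only neighbor is a rate source."""

    def __init__(self, rates):
        self.rates = rates

    def convert(self, amount, src, dst):
        rate = self.rates.lookup(src, dst)
        self.rates.close()
        return amount * rate


# Program the mock before the unit is exercised: order, parameters, replies.
rates = ExpectingMock("rates")
rates.expect("lookup", "EUR", "USD", returns=2)  # first message, with a canned reply
rates.expect("close")                            # second message, no reply needed

result = Converter(rates).convert(10, "EUR", "USD")
rates.verify()  # fails if an expected message never arrived
assert result == 20
```

The mock carries no meaningful state of its own beyond the queue of messages it is still waiting for.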
Contrast mocks and stubs. I know that in many places these two terms are used synonymously, but they are really very different from each other. A mock is often generated by a test framework library, while a stub is often created by hand. The internal state of a mock doesn’t matter at all - only the expectations it has about the messages it receives. A stub exists to collect state. Stubs are often used in conjunction with “real” objects. For instance, if an existing real object is very simple and lightly coupled to the rest of the code base, it is often brought in by testing code as a supporting object to the unit under test. A stub is normally only created when the real object can’t be used in a test harness for various reasons (coupling, external dependencies, etc.). Mocks are used exclusively in an interaction-based unit test - all of the unit’s immediate neighbors are mocks, even when the real object is trivial.
The final stage in an interaction-based unit test exercises the unit. During this phase, the test code is invoking methods on the unit under test, which itself is interacting with the mock objects. A test error will be raised by the mocks if expectations are not met. An expectation can fail to be met for a variety of reasons, including if methods are called in the wrong order, if parameters have unexpected values, if the wrong methods are called, or if the right methods are not called. At the end of this phase the interaction-based unit test completes. There is no state-checking of the unit under test. If all expectations of the mock objects have been met, then the unit has been verified from the point of view of the test.
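The following sketch shows a complete interaction-based test using Python's standard unittest.mock module, including a deliberately broken unit that calls the right methods in the wrong order. One caveat: unittest.mock records calls and checks them afterwards, so the expectations appear after the exercise step rather than before it, unlike the expect-first flow described above. Transfer, BackwardsTransfer, and check_transfer are hypothetical names invented for the example.

```python
from unittest.mock import Mock, call


class Transfer:
    """Hypothetical unit under test: moves money between two neighbor accounts."""

    def __init__(self, source, target):
        self.source = source
        self.target = target

    def run(self, amount):
        self.source.withdraw(amount)
        self.target.deposit(amount)


class BackwardsTransfer(Transfer):
    """Deliberately broken variant: right messages, wrong order."""

    def run(self, amount):
        self.target.deposit(amount)
        self.source.withdraw(amount)


def check_transfer(unit_class):
    # Both neighbors are children of one parent mock, so the order of calls
    # across them is recorded in a single list.
    ledger = Mock()
    unit_class(ledger.source, ledger.target).run(25)

    # The whole verification: which messages, which parameters, which order.
    assert ledger.mock_calls == [call.source.withdraw(25), call.target.deposit(25)]


check_transfer(Transfer)  # passes; no state of Transfer was inspected

try:
    check_transfer(BackwardsTransfer)
except AssertionError:
    print("BackwardsTransfer failed its interaction test")
```

A state-based test using two stub accounts would likely accept both variants, since the end balances are the same either way.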
Are interaction-based unit tests effective? To really answer that question you have to look at the motivation for this style of testing. Object-oriented development might have been called behavior-driven development if it had been invented in this decade. (I know that there is an existing methodology termed behavior-driven development, but it is really based on OO being done right if you look closely.) What is a more useful measure of the correctness of a unit - its behavior or its state? An OO purist would hopefully answer that behavior is key while state should be internal and encapsulated. An interaction-based unit test verifies behavior, while a state-based unit test verifies state.
Interaction-based unit tests are great for completely isolating the unit under test, and are therefore very true unit tests. A properly done interaction-based test cannot be testing anything other than the unit, while it is not hard for a state-based test to rely on side effects and be a little sloppy about boundaries between the unit under test and other units in the application. Interaction-based testing is most useful when applied uniformly throughout a code base. The objects that you create mocks for in an interaction-based test will themselves need interaction-based tests before you can feel confident about the code base as a whole.
Deciding which style to use
One of the premises of this article is that it’s important for today’s software developer to understand the large differences between state-based and interaction-based unit testing. However, for any of this to be of practical use, you will eventually have to make decisions about which style to use when writing a test.
I find that interaction-based testing feels a little unnatural. You can certainly make the argument that an interaction-based unit test is more tightly coupled to the implementation of a unit than a state-based unit test is (the counter-argument is that this is a good thing, not a bad thing). I’ve also noticed that interaction-based unit tests tend to have a lot more plumbing code than state-based tests. Coming back to a state-based unit test after a few months is usually easy - coming back to an interaction-based test often involves lots of inspection to get back up to speed. In other words, I posit that interaction-based unit tests have a higher maintenance cost than state-based tests. I think a lot of this is a reflection of the current state of the languages and libraries in which interaction-based tests are written. This is still a fairly new testing style, and I expect the libraries and frameworks that support it to improve considerably over time.
Finally, I’d like to suggest that different kinds of code benefit differently from each style. Some code is very stateful, and is best tested with a state-based test. Other code is more stateless and can be fully tested only with an interaction-based style. Perhaps the right question isn’t about knowing which style is better in an absolute sense, but more about being able to recognize which style will be most effective for a particular piece of code.
Further reading
Martin Fowler: Mocks Aren’t Stubs
If interaction-based unit testing is new to you, start by reading this article. You’re likely also going to have to spend some time writing interaction-based tests against a code base you know well before the differences between the two styles start to sink in. This is one of those cases where “armchair programming” isn’t going to help - you have to dive in and try it to fully appreciate the differences between the two styles.
Behavior-Driven Development (BDD)
Many of the motivations behind interaction-based testing come from an increasing focus on behavior and less on state. From the linked site: “It must be stressed that BDD is a rephrasing of existing good practice, it is not a radically new departure.”
Mock Roles, not Objects (PDF)
A paper written by the authors of JMock, a popular Java mock object library. The paper is short and very readable. I highly recommend it to anyone trying to understand the purpose of mock objects and how they differ from stubs and state-based testing.
State vs. Interaction Based Testing Example
A blog entry by Nat Pryce, one of the developers on the JMock project, that gives an example of a state-based and an interaction-based unit test for the same piece of code.