TDD Pattern: Do not cross boundaries

(Note: This was originally drafted on 11/18/2003! I am finishing up old drafts for the new year)

TDD is fundamentally about writing tests (or 'checked examples' as Brian Marick calls them) before you write the code. One of the benefits of this is that the resulting code is decoupled from the rest of the system. This becomes particularly important when the code is near a boundary (e.g. accesses a database, a queue, another system, or even an ordinary class if that class is "outside" the area you're trying to work with or are responsible for).

However, this benefit doesn't just happen; you have to deliberately structure the code so that it can be tested without resorting to resources beyond the test.

I get asked about how to do this a lot when working with people learning TDD, so I thought I'd write a bit about how you can meaningfully test at these boundaries.

Code written test-last (or test-not-at-all) tends to have this kind of dependency pattern:

a.methodOne creates a reference to class b and calls b.methodTwo, which in turn creates and calls c.methodThree -- which finally accesses some external system: a.1 -> b.2 -> c.3 -> external system.

However, TDD'd code tends to look like this:

a.methodOne <- is passed a real object b <- which was constructed with a fake object standing in for c
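
Rendered as plain Java instead of UPJ (every class and method name here is made up purely for illustration), the two shapes look roughly like this:

// Test-last shape: each class news up the next, so any test of A
// drags the external system in at the end of the chain.
class HardWiredC { void methodThree() { /* hits the external system */ } }
class HardWiredB { void methodTwo()   { new HardWiredC().methodThree(); } }
class HardWiredA { void methodOne()   { new HardWiredB().methodTwo(); } }

// TDD'd shape: the boundary sits behind an interface and is passed in,
// so a test can build a real B around a fake C and hand that B to A.
interface CBoundary { void methodThree(); }

class InjectedB {
    private final CBoundary c;
    InjectedB(CBoundary c) { this.c = c; }
    void methodTwo() { c.methodThree(); }
}

class InjectedA {
    void methodOne(InjectedB b) { b.methodTwo(); }
}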

It's often difficult for people new to TDD to find ways to avoid having to use the actual external system (e.g. they want to use a test database). The difficulty usually manifests as:

"I can't see any way to write this code without creating the reference to the accessing object -- which won't work without the other system there."

or:

"I can see that I could mock or otherwise stub the interface, but then the test becomes meaningless."

In either case, the code they envision (and sometimes have already written) has this form (in this case a database call written in Universal Pseudo Java (UPJ)):

void UpdateCustomer(data) {
    db = new Database("SystemName")
    cmd = db.CreateCommand("ModifyCustomer")
    cmd.SetParam("name", data.Name)
    cmd.SetParam("address", data.Address)
    ...
    cmd.Execute()
}

And so they try to write a test like this:

void TestUpdateCustomer() {
    Customer data = new Customer()
    data.Name = "John Smith"
    data.Address = "123 Main St."  //new address

    obj.UpdateCustomer(data)

    db = new Database("SystemName")
    cmd = db.CreateCommand("SELECT * FROM Customer WHERE name = 'John Smith'")
    results = cmd.Execute()
    AssertEquals("123 Main St.", results["address"])
}

And the first thing they say is: "How can I do that without the other system?" And they proceed to write setup code that populates a copy of the database, and generally create a test maintenance nightmare.

Now I know what you're thinking: "Yeah, Bill, this is old hat (even back when I wrote this in 2003). You're going to say that they should stub out the database call, and populate a 'mock database' with some test data."

The problem with this approach is that you end up spending a lot of time creating and maintaining a simulated database. Give it the ability to set data, query data, and respond to different queries, and pretty soon you are writing your own in-memory database.

Using a real in-memory database is an option that some teams take, but I generally prefer a different solution -- and thus a different design (remember, TDD is about design; that's why the tests go first).

The real goal of our code is to call a stored procedure on a database -- not to make sure that the database gets updated (that presumably is the job of the stored procedure). So let's start with a test that is concerned with what we really want to test: that our code correctly configures and calls the SP:

void UpdateCustomerCallsAppropriateStoredProcedureWithProperParameters() {
    Customer data = new Customer()
    data.Name = "John Smith"
    data.Address = "123 Main St."  //new address

    FakeDatabase db = new FakeDatabase() //FakeDatabase implements IDatabase
    obj.UpdateCustomer(db, data)

    Assert(db.LastCommand != null)
    AssertEquals(StoredProcedure, db.LastCommand.Type)
    AssertEquals("ModifyCustomer", db.LastCommand.Text)
    AssertEquals("John Smith", db.LastCommand.Params["name"])
    AssertEquals("123 Main St.", db.LastCommand.Params["address"])
}

(Note: The long, ugly name serves as documentation. In fact, run a tool like TestDox on your tests and those names become readable documentation.)

Now our Update method takes a reference to its data accessor instead of creating it (this is the familiar IOC/Dependency Injection pattern).
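
For concreteness, here is a minimal sketch of what the production method might look like after this change (plain Java this time; the IDatabase/Command shapes are assumptions mirrored from the pseudocode above, not taken from any real library):

// Assumed boundary types, mirroring the UPJ example above.
interface Command {
    void setParam(String name, String value);
}

interface IDatabase {
    Command createCommand(String storedProcName);
    void execute(Command cmd);
}

class Customer {
    String name;
    String address;
}

class CustomerGateway {
    // The database is passed in rather than constructed here, so a test
    // can hand in a FakeDatabase while production code passes the real one.
    void updateCustomer(IDatabase db, Customer data) {
        Command cmd = db.createCommand("ModifyCustomer");
        cmd.setParam("name", data.name);
        cmd.setParam("address", data.address);
        db.execute(cmd);
    }
}

Note that Execute has moved onto the database in this sketch (as the FakeDatabase below also assumes), which is what lets the fake observe the command at all.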

This is a small change in our design, but a huge change in our test. Now that the test can control which database the call is made on, we can dispense with looking at the results of our database call and focus our attention on what our code is actually doing.

The benefit of this small change is that our fake database doesn't need much logic at all now; it just needs to record the command passed to 'Execute' and expose it through a LastCommand property (which stays null if Execute is never called):

class FakeDatabase : IDatabase {
    Command _cmd
    Command LastCommand { return _cmd }
    void Execute(cmd) {
        _cmd = cmd
    }
}

This is much easier to work with than a "Mock" that manages data. (A similar approach can also do away with the need for an Object Mother that creates a whole structure of domain objects to test against; see the sketch below.)
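
(To put a concrete, invented example behind that parenthetical: when the code under test only reads one or two things, a tiny fake behind a narrow interface can replace the whole object graph an Object Mother would otherwise have to assemble. Every name below is hypothetical.)

// Instead of something like ObjectMother.createCustomerWithOrdersAndAddresses(),
// give the code under test a narrow interface and a fake that answers only
// the question the test actually asks.
interface CustomerSource {
    String nameOf(String customerId);
}

class FakeCustomerSource implements CustomerSource {
    public String nameOf(String customerId) { return "John Smith"; }
}

class GreetingService {
    private final CustomerSource customers;
    GreetingService(CustomerSource customers) { this.customers = customers; }
    String greetingFor(String customerId) { return "Hello, " + customers.nameOf(customerId); }
}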

Incidentally, I've been playing a bit fast and loose with "fake" vs. "mock" here. I tend to favor the view that mocks are fancy fakes that make internal assertions. Others make further distinctions between stubs, fakes, and mocks. For the purposes of this example, we have a fake object that allows us to query state.

Other mocking approaches allow us to query behavior. Our test could also look something like this:

void UpdateCustomerCallsAppropriateStoredProcedureWithProperParameters() {
    Customer data = new Customer()
    data.Name = "John Smith"
    data.Address = "123 Main St."  //new address

    FakeDatabase db = new FakeDatabase() //FakeDatabase implements IDatabase

    db.Expect("SetCommand", data.Name)
    db.Expect("SetCommand", data.Address)
    db.Expect("Execute")

    obj.UpdateCustomer(db, data)

    db.Verify()
}

Now we've described our expectations in terms of what UpdateCustomer will do rather than describing the resulting state of those actions.

There are libraries for a number of languages that will generate mocks that let you query behavior like this (JMock and NMock, for example).
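
If it helps to demystify the Expect/Verify calls above, here is one way a hand-rolled verifying fake could keep its books. This is only a sketch of the idea, not how JMock or NMock are actually implemented:

import java.util.ArrayList;
import java.util.List;

// The test declares the calls it expects (e.g. expect("Execute")), the fake
// records the calls it actually receives, and verify() compares the two lists.
class BehaviorFakeDatabase {
    private final List<String> expected = new ArrayList<>();
    private final List<String> received = new ArrayList<>();

    // Used by the test to set up expectations.
    void expect(String call) { expected.add(call); }

    // Stand-ins for the real database operations; each one just records itself.
    void setParam(String name, String value) { received.add("SetParam:" + name); }
    void execute()                           { received.add("Execute"); }

    // Called by the test after exercising the code under test.
    void verify() {
        if (!expected.equals(received)) {
            throw new AssertionError("expected " + expected + " but got " + received);
        }
    }
}

A real mocking library adds argument matchers, ordering rules, and friendlier failure messages, but the essential bookkeeping is about this small.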

Regardless of which mocking approach/philosophy/religion you settle on (in practice I use both), either one lets you test your code without having to cross a boundary to sense the results.

Database access code is just one kind of boundary coding problem, but the above pattern generally holds -- you can either mock the external-facing interface (as above) or wrap it if it doesn't provide a convenient interface (Michael's book has some great examples of how to do this).
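
As one sketch of the "wrap it" option, suppose the real boundary were JDBC: a thin wrapper can expose just the one narrow operation our code needs and keep the awkward API (and its SQLException handling) on the far side. The interface and class names here are invented for the example:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// The narrow interface the rest of the code (and its fakes) are written against.
interface CustomerStore {
    void modifyCustomer(String name, String address);
}

// A thin wrapper over the real boundary; it contains no logic worth unit
// testing beyond "forward the call", which is exactly the point.
class JdbcCustomerStore implements CustomerStore {
    private final Connection connection;

    JdbcCustomerStore(Connection connection) {
        this.connection = connection;
    }

    public void modifyCustomer(String name, String address) {
        try (CallableStatement call =
                 connection.prepareCall("{call ModifyCustomer(?, ?)}")) {
            call.setString(1, name);
            call.setString(2, address);
            call.execute();
        } catch (SQLException e) {
            throw new RuntimeException("ModifyCustomer failed", e);
        }
    }
}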

A remaining objection you might have: "OK, I see how that might work -- but I don't want my caller to have to pass in the wrapper reference; I don't want my callers to have the ability to make that decision." It's not an uncommon requirement, and you have several options.

For example, you can wrap the 'real' method with a public method that creates the database reference and passes it along. Your clients then call the wrapping method:

void UpdateCustomer(data) {
    UpdateCustomerImpl(data, new RealDatabase("SomeSystem"))
}

You then test the Impl method just as we tested the injected UpdateCustomer above. Check out the earlier Dependency Injection link for other ideas and patterns (and Michael's book -- I am not getting a kickback; it's just that I am reading it right now, and it really is relevant, which is why I chose this entry to de-draft first).
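
One more variation worth sketching (reusing the IDatabase/Command/Customer shapes assumed earlier, and assuming RealDatabase is the production implementation of IDatabase): give the class a production constructor that builds the real database itself and a second constructor that lets a test substitute a fake, so callers never see the dependency at all.

class CustomerUpdater {
    private final IDatabase db;

    // What production callers use; they never see the dependency.
    CustomerUpdater() {
        this(new RealDatabase("SomeSystem"));
    }

    // What tests use to slip in a FakeDatabase.
    CustomerUpdater(IDatabase db) {
        this.db = db;
    }

    void updateCustomer(Customer data) {
        Command cmd = db.createCommand("ModifyCustomer");
        cmd.setParam("name", data.name);
        cmd.setParam("address", data.address);
        db.execute(cmd);
    }
}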

In short, being sensitive to boundaries (at whatever level of abstraction) can help us identify the times when we might want to start looking for ways to stub/fake/mock/simulate. Changing our thinking from testing the result of the call to checking whether the call is correctly formatted and executed makes testing in isolation easier. The result is a piece of code tested without crossing the boundary -- and, not coincidentally, code that's highly decoupled, cohesive, and programmed to an interface. Another example of how the goals of TDD encourage us to write well-designed code.