
I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world.

Asynchronous and negative testing

03.06.2012

Test-Driven Development is a technique that can be applied at any level of detail: not only to classes and small groups of objects, but also at the system scale. In that case, you write end-to-end tests covering the user's input and the end result, whether that is the response the user gets or some side effect.

Sometimes, these tests are asynchronous as the system has some internal workings that we want to encapsulate. For example we may want to test that a video is added to a playlist, and thumbnails are extracted.

Since this is a CPU-intensive operation, it is commonly moved to the background once the video has been received, and the POST request is answered immediately. If we want to perform an assertion, then, we cannot check right after being answered by the system under test, since the operation has not finished yet.

By checking, I mean performing a new GET request to the playlist page, or in any case querying the system again about its state. We have to wait a non-deterministic amount of time before asserting anything.

So how much should we wait?

Fowler's two methods for asynchronous testing

An anti-pattern for solving these issues is called bare sleep: it consists of waiting for X (milli)seconds with a sleep() call before making an assertion.

This approach makes the test fail intermittently, depending on the machine's load and other uncontrollable conditions; the waiting time also has to be tuned for each machine. Whether such a test passes is a matter of luck, unless we choose a very long waiting time, which slows the suite down until running all the tests takes hours.
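A minimal sketch of the anti-pattern in Java (the class, flag, and timings are illustrative, not taken from a real system):

```java
public class BareSleepExample {
    static volatile boolean thumbnailsReady = false;

    public static void main(String[] args) throws InterruptedException {
        // Simulated background job that finishes at an unpredictable time.
        new Thread(() -> {
            try { Thread.sleep(500); } catch (InterruptedException ignored) {}
            thumbnailsReady = true;
        }).start();

        // Anti-pattern: a bare, fixed sleep before the assertion.
        // The 2000 ms figure is pure guesswork: too short and the test
        // is flaky, too long and the whole suite slows down.
        Thread.sleep(2000);
        if (!thumbnailsReady) {
            throw new AssertionError("thumbnails were not extracted in time");
        }
    }
}
```

Here the sleep happens to be long enough, but nothing guarantees that on a loaded machine.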

Martin Fowler describes two patterns for eliminating these flaky tests.

The first is the polling loop: checks are performed repeatedly for the presence of an answer, and the test fails only when the result is not correct, or does not arrive within a (long) timeout:

int limit = 30;
int elapsed = 0;
bool result = false; // declared outside the loop, so it is still in scope afterwards
while (elapsed < limit) {
    result = makeAssertion();
    if (result) {
        break;
    }
    elapsed++;
    sleep(1);
}
if (!result) {
    fail("After 30 seconds the expected result has not been produced.");
}

In a green scenario, the test passes as soon as possible, given a high enough polling frequency. In a red scenario, it fails only after the time limit is reached. This should be a rare occurrence.
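The loop above is usually extracted into a reusable helper; a possible sketch in Java (the `waitFor` name and signature are my own, not from the article's codebase):

```java
import java.util.function.BooleanSupplier;

public class PollingLoop {
    // Polls `condition` every `intervalMillis` until it holds
    // or `timeoutMillis` elapses.
    static boolean waitFor(BooleanSupplier condition,
                           long timeoutMillis,
                           long intervalMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true; // green scenario: return as soon as the result appears
            }
            Thread.sleep(intervalMillis);
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~300 ms, well inside the 5 s budget.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 300, 5000, 50);
        if (!ok) {
            throw new AssertionError("After 5 seconds the expected result has not been produced.");
        }
    }
}
```

A test then asserts on the boolean (or on the system state) instead of duplicating the loop everywhere.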

The second pattern is the callback: the system provides a synchronization mechanism, either for integration purposes or even for the end user, to know when an asynchronous action has ended. For example, a log entry could be written, or an email sent to the user, when the video has gained thumbnails and has finally been added to the chosen playlist. Here's an example in Java, assuming the mailing system has been mocked:

// setup
long timeoutMillis = 30000; // Object.wait() takes milliseconds
final AssertionToken synchronizationObject = new AssertionToken(); // an empty class
// inside the mailer mock, called when the operation has finished
public void send(String address, ...) {
    synchronized (synchronizationObject) {
        synchronizationObject.notify();
    }
}
// in the test
synchronized (synchronizationObject) {
    synchronizationObject.wait(timeoutMillis); // InterruptedException must also be caught
}
assertEquals(...);

Nat Pryce's method for negative testing

One of the authors of Growing Object-Oriented Software describes, in one of his talks, further examples of synchronization mechanisms for making assertions, which can even work across threads (for instance, synchronizing on Swing's event loop).

One of the techniques explained involves negative testing: how to test, asynchronously, that something does *not* happen.

The basic logic of this technique is:

  1. do(X).
  2. We cannot assert that effect(X) does not happen within any finite amount of time, so: do(Y).
  3. Assert that effect(Y) happened.
  4. Assert that effect(X) did not happen: because of ordering, it could only have happened before effect(Y), so it is now safe to perform the assertion.
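The ordering argument behind these steps can be sketched with a FIFO queue of observed effects (a toy model of my own, not the article's Erlang system: the queue stands in for the ordered message processing the technique requires):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class NegativeTestSketch {
    // Models a system that processes requests in order and publishes effects.
    static final BlockingQueue<String> effects = new LinkedBlockingQueue<>();

    static void doRequest(String id) {
        // In the buggy scenario this would also enqueue a spurious
        // duplicate effect, which step 4 below would catch.
        effects.add("effect(" + id + ")");
    }

    public static void main(String[] args) throws InterruptedException {
        doRequest("X"); // step 1: do(X)
        doRequest("Y"); // step 2: we cannot wait for "no extra effect(X)", so do(Y)

        // step 3: consume effects in order and assert effect(Y) happened
        String first = effects.poll(5, TimeUnit.SECONDS);
        String second = effects.poll(5, TimeUnit.SECONDS);
        if (!"effect(X)".equals(first) || !"effect(Y)".equals(second)) {
            throw new AssertionError("unexpected effects: " + first + ", " + second);
        }
        // step 4: since processing is ordered, a duplicate effect(X) would have
        // arrived before effect(Y); having seen effect(Y), the queue must be empty.
        if (effects.peek() != null) {
            throw new AssertionError("spurious extra effect: " + effects.peek());
        }
    }
}
```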

To make this concrete, let's look at a real-life example in Erlang. The test involves two nodes, which are separate processes possibly running on different machines, called Milan and Genoa.
The test should check that when we ask for a record starting with "a", we get only two responses: one from Milan and one from Genoa. Genoa has been passed the request: each node floods a read request to the other nodes it knows before attempting to answer it.

The request should stop once it has reached both nodes, and not be flooded again to Milan, back to Genoa, and so on indefinitely.

flood_without_loops(Milan) ->
    Genoa = tuplenode:init(),
    tuplenode:input(Milan, {a, 1}),
    tuplenode:input(Genoa, {a, 2}),
    tuplenode:addFloodTarget(Milan, Genoa),
    tuplenode:addFloodTarget(Genoa, Milan),
    tuplenode:readNonBlocking(Milan, startingWith(a), 1001),
    receive
        {Promise1, _} -> ?assertEqual(1001, Promise1)
    end,
    receive
        {Promise2, _} -> ?assertEqual(1001, Promise2)
    end,
    % what to do now?

We cannot deterministically test that no other messages are received. Even if we wait, we cannot be sure that no more messages will arrive: they could still be in transit between Genoa and Milan. From the production code (not shown) we only know that flooding takes place before request processing: so if the request has been flooded back to Milan, it was sent before the end of the test, because of ordering, since Genoa has already answered. But we cannot wait for a message *not* to arrive.

In general, we cannot make isolated, negative, asynchronous assertions, because the event we want to rule out can happen just after the assertion has been made (or arbitrarily far in the future).

Thus the basic problem is: how do we wait long enough to be sure the erroneous message has not been sent, but not so long that the test becomes slow and flaky?
Pryce's assumption is that requests are processed in order. If the system can guarantee that, we can add this code to the test:

    tuplenode:readNonBlocking(Genoa, startingWith(a), 1002),
    receive
        {Promise3, _} -> ?assertEqual(1002, Promise3)
    end,
    receive
        {Promise4, _} -> ?assertEqual(1002, Promise4)
    end.

We issue a new request with id 1002 to Genoa. We receive all messages that arrive (without any filtering), and the next two must have id 1002, corresponding to the new request.

Since:

  • nodes run on a single event loop (à la Node.js);
  • messages are sent in order;

the flooded 1002 read message will be processed by Milan only after the earlier 1001 one, since Genoa flooded 1001 before sending {Promise2, _}. So if we receive {Promise3, _} and {Promise4, _}, one of them must come from Milan; at that point, 1001 is no longer in flight, or it would have delayed the two responses for 1002.

Published at DZone with permission of Giorgio Sironi, author and DZone MVB.
