Agile Zone is brought to you in partnership with:

I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 636 posts at DZone. You can read more from them at their website. View Full User Profile

Practical PHP Refactoring: Substitute Algorithm

07.12.2011
| 5678 views |
  • submit to reddit

Small-scale refactoring is usually composed of small changes, which keep the system working all the time; you extract some lines into a new method, or add a few variables to substitute comments with their descriptive names.

As we move towards a larger scale, we find that sometimes little jumps are needed, such as changing an entire algorithm instead of only moving code around.

Why should I change my algorithm if the tests already pass?

  • Because you have discovered a better one with respect to non-functional requirements: it is faster, occupies less space, or more reliable in case of stochastic algorithms (e.g. your collection has increased from an average of 10 elements to a sizeable number: Quicksort may help you).
  • Or because now your library, or your ORM, or your framework implements the same functionality for you; so you want to bind to another debugged implementation to avoid maintain this piece of code (e.g. your database can sort your User object by name for you).
  • Or because the requirement has been superceeded, and a less strict version of the algorithm is required (e.g. you just need to sort according to one field, and not two).
  • Or because the algorithm should change for a new requirement, and the old one cannot be modified accordingly (e.g. you need a stable sorting algorithm).

Constraints

To avoid desperate cases of shotgun surgery, impose some constraints on this refactoring; it's always easy to start messing around.

Keep the change as localized as possible: you can take possible cautionary steps to isolate the algorithm in a method: Extract Method, and Inline Method to keep all the algorithm in one place can help you. Ideally, even in the complex cases this refactoring should not become more difficult than changing a Strategy object.

In fact, the Open/Closed Principle is a guideline to follow: if you're changing the algorithm because of a new (or mutated) requirement, the next change should only require you to add classes and reconfigure the construction of the object graph.

Thus a preliminary refactoring of the original algorithm may help, especially if the old and the new are composed of similar steps.

To check if the new algorithm will cover the already existing cases, locate the right level of tests to run:

  • unit tests of the class if you're working on private or self-contained public methods;
  • functional tests involving multiple objects, if the algorithm is divided into multiple classes;
  • even end-to-end tests if the algorithm involves wildly different pieces of the system (e.g. the sorting of a client-side view list after an Ajax insertion).

Steps

Step 0 is to isolate the algorithm as we defined in Constraints.

  1. Prepare the new algorithm in a design that is compatible with the old one at its terminals (that's electrical engineering terminology): input parameters, output value, changes to the object state. You can refactor the protocol later, but for now it's easier to stick to the old one, for example continuing to require a parameter which would now be unused.
  2. Switch the client code calls from the old method to the new one.
  3. Run the identified tests.

In case the tests now fail, you're under version control and can go back to the original one; or you can identify which test cases are causing the failure and debug the new algorithm.

Example

In the initial state, the algorithm is part of the display method that should produce a bit of HTML. It is a reuse of the classical sort() PHP method that acts on keys instead of values.

<?php
class PhonebookTest extends PHPUnit_Framework_TestCase
{
    public function testPrintsNameAndNumberOrderedByName()
    {
        $phonebook = new Phonebook();
        $phonebook->add('Giorgio', '031...');
        $phonebook->add('Adam', '021...');
        $phonebook->add('Zend', '045...');
        $html = $phonebook->toHtml();
        $this->assertEquals(
            "<ul>\n"
          . "<li>Adam: 021...</li>\n"
          . "<li>Giorgio: 031...</li>\n"
          . "<li>Zend: 045...</li>\n"
          . "</ul>\n",
          $html
        );
    }
}

class Phonebook
{
    private $numbers = array();

    public function add($name, $number)
    {
        $this->numbers[$name] = $number;
    }

    /**
     * I thought about writing a sorting algorithm, but...
     * I opted to use a different, bundled sorting algorithm :)
     */
    public function toHtml()
    {
        $names = array_keys($this->numbers);
        sort($names);
        foreach ($names as $name) {
            $numbers[$name] = $this->numbers[$name];
        }
        $this->numbers = $numbers;

        $html = "<ul>\n";
        foreach ($this->numbers as $name => $number) {
            $html .= "<li>$name: $number</li>\n";
        }
        $html .= "</ul>\n";
        return $html;
    }
}

After a while, we discover PHP already handles this functionality with a native function, and we want to substitute the long and error-prone algorithm with the native one (just like if we introduced a new library).

The first step is to isolate the algorithm in its own method:

class Phonebook
{
    private $numbers = array();

    public function add($name, $number)
    {
        $this->numbers[$name] = $number;
    }

    public function toHtml()
    {
        $this->sortNumbersByName();

        $html = "<ul>\n";
        foreach ($this->numbers as $name => $number) {
            $html .= "<li>$name: $number</li>\n";
        }
        $html .= "</ul>\n";
        return $html;
    }
    
    /**
     * Let's start by isolating the algorithm.
     */
    private function sortNumbersByName()
    {
        $names = array_keys($this->numbers);
        sort($names);
        foreach ($names as $name) {
            $numbers[$name] = $this->numbers[$name];
        }
        $this->numbers = $numbers;
    }
}

Now we can substitute the old code with the new algorithm, incidentally just a single call. If there is no more bookkeeping code to add in the extracted method, we can even inline it again; however the temporary extraction has forced us to define the input (only $this) and output (state of $this) relationship of this little algorithm.

    private function sortNumbersByName()
    {
        ksort($this->numbers);
    }
Published at DZone with permission of Giorgio Sironi, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)