I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 637 posts at DZone.

Bullets for legacy code

03.27.2012
The most common legacy code definition is that of a project not covered by automated tests: picture in your mind a big ball of mud, difficult to change or extend with new features. Working on legacy code is not the same as dealing with a green field project: several specialized strategies have been developed over the years due to the peculiarity of this code.

The circular dependency

When you see a big ball of mud, refactoring is the keyword that comes to mind. But any refactoring technique requires tests to be in place, to ensure that functionality isn't broken by the refactorings you apply. Extracting a class incorrectly can bring down the whole application, especially in dynamic languages: we need a safety net of tests for modifying the code.

At the same time, ease of testing requires refactoring to isolate classes or packages: try to test a class that creates a DatabaseConnection object and reads 3 configuration files in its constructor.
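To make the problem concrete, here is a minimal sketch of such a class. Everything in it is hypothetical: the `ReportGenerator` name, the three file names and the stand-in `DatabaseConnection` are assumptions made for illustration, not code from a real project.

```python
import json


class DatabaseConnection:
    """Hypothetical stand-in for a real driver; connecting needs a live server."""

    def __init__(self, dsn):
        raise ConnectionError(f"cannot reach database at {dsn}")


class ReportGenerator:
    """Hypothetical legacy class: the constructor does all the wiring itself."""

    def __init__(self):
        # Three configuration files are read straight from disk...
        self.settings = {}
        for name in ("database.json", "mail.json", "paths.json"):
            with open(name) as handle:
                self.settings.update(json.load(handle))
        # ...and a real connection is opened, so no test double can be passed in.
        self.connection = DatabaseConnection(self.settings["dsn"])

    def title(self, month):
        # Trivial logic that a unit test would like to reach in isolation.
        return f"Report for {month}"
```

Even to check `title()`, a test would first have to provide the three files and a reachable database, which is exactly the kind of setup that pushes people away from writing the test at all.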

Thus there is a circular dependency between refactoring and unit testing, which we try to break a bit at a time when dealing with legacy code.

Special techniques

I'm learning many techniques for breaking this circular dependency from Working Effectively With Legacy Code by Michael Feathers, whose main idea is that there are different paths towards a better design, some taking bigger jumps and some less difficult and less prone to break the code.

Extract Class is an example of a powerful but invasive technique. Adding intermediate steps or even taking a detour is a suboptimal choice with respect to the best design (the final goal). But it lets you insert the changes needed now, privileging the present over the future a bit; as @jacoporomei would say, you have to pay the interest on your technical debt.

For example, Extract Method and Override is a testing technique far less invasive than extracting a collaborator, even if it produces a less useful abstraction (a method signature) and doesn't simplify the class by breaking it down into pieces.

I would rarely perform Extract Method and Override on new code, as I feel composition of objects has a higher ROI than inheritance. But in the context of legacy code, I have to care more about not breaking functionality and about spending my time (read: money) on both refactoring and new behavior.
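A minimal sketch of Extract Method and Override may help here. The `InvoiceMailer` class and its `_deliver` method are hypothetical names invented for this illustration: the side effect is first extracted into its own method, and a test-only subclass then overrides it.

```python
class InvoiceMailer:
    """Hypothetical legacy class with a hard-wired side effect."""

    def __init__(self):
        self.sent_count = 0

    def notify(self, customer, amount):
        body = f"Dear {customer}, you owe {amount:.2f} EUR."
        # Extracted method: imagine this call used to be inline delivery code.
        self._deliver(customer, body)
        self.sent_count += 1
        return body

    def _deliver(self, recipient, body):
        # Stand-in for the real side effect (network, mail server, ...).
        raise RuntimeError("no mail server available in tests")


class TestableInvoiceMailer(InvoiceMailer):
    """Test-only subclass: overrides the extracted method and records calls."""

    def __init__(self):
        super().__init__()
        self.delivered = []

    def _deliver(self, recipient, body):
        self.delivered.append((recipient, body))


mailer = TestableInvoiceMailer()
message = mailer.notify("Ada", 12.5)
```

The production class changes very little (one method extracted), which is the appeal on legacy code; the price is that the only new abstraction is a method signature and the class is no smaller than before.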

Safe refactorings

Here's a bonus for statically typed languages: some refactorings are available in a safe way and can be executed automatically by an IDE. For example, Eclipse's Extract Method on a Java class generates the new method, passing as parameters the local variables and fields used, and replaces the original code with a call to that method whose result is bound to the output variable.

In this case automated refactorings are not only a time saver but also an error saver, as at least the code still compiles; I don't think there is a way to guarantee their correctness without zombie Turing intervening, but I've never encountered a breakage caused by this operation (if there are multiple outputs from a piece of code, the IDE will usually give up and tell you it's not possible to extract it automatically).
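The shape of the transformation can be sketched by hand. The example below is written in Python for brevity and is not the output of any IDE; the function names are hypothetical. The point is only to show inputs becoming parameters and the single output being rebound at the call site.

```python
def total_before(prices, tax_rate):
    # Original: the subtotal loop is inline.
    subtotal = 0
    for price in prices:
        subtotal += price
    return subtotal * (1 + tax_rate)


def sum_prices(prices):
    # Extracted method: the variable the fragment reads became a parameter...
    subtotal = 0
    for price in prices:
        subtotal += price
    return subtotal


def total_after(prices, tax_rate):
    # ...and its single output is rebound where the fragment used to be.
    subtotal = sum_prices(prices)
    return subtotal * (1 + tax_rate)
```

If the fragment had assigned two locals that are both used later, there would be no single return value to rebind, which is the multiple-outputs case where an automatic extraction is refused.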

The Mikado Method

Unlike Feathers's book, the Mikado method assumes that you have a suite of tests, at least at the end-to-end scope, and that you also rely on compile-time checks as a nice-to-have.

The method works by creating a graph of operations to perform: most of them are refactorings (like breaking a dependency with an interface, or extracting a common class). In steps, you create a graph that starts from the goal (new behavior) and generate new nodes to solve the errors you encounter: they are dependencies of the attempted task. For example, moving a method into a collaborator object may require you to move some fields first, which in turn may require visibility to be changed.

With the Mikado method, you don't chase each of the new nodes directly, but revert the changes that cause a breakage and repeat the process on the new nodes. Eventually you will reach leaves, which are the simplest moves you can make to shorten the distance to your final goal without getting a red bar.

The method lets you jump from a green state to a green state, instead of moving into red territory, where you never know if the next move will take you back to a working test suite.
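The bookkeeping behind the method can be sketched as a small dependency graph. This is only an illustration under assumptions: the task names are taken from the example in the text, while the dictionary layout and the `leaves` and `execution_order` helpers are hypothetical and not part of any Mikado tooling.

```python
# Hypothetical Mikado graph: each task maps to the prerequisites that were
# discovered when the task was attempted and then reverted.
graph = {
    "add new behavior": ["move method to collaborator"],
    "move method to collaborator": ["move fields first"],
    "move fields first": ["change field visibility"],
    "change field visibility": [],
}


def leaves(graph, done):
    """Tasks not yet done whose prerequisites are all done: safe to try now."""
    return [
        task
        for task, prerequisites in graph.items()
        if task not in done and all(p in done for p in prerequisites)
    ]


def execution_order(graph):
    """Repeatedly take the current leaves until the goal itself is done."""
    done = []
    while len(done) < len(graph):
        ready = leaves(graph, done)
        if not ready:
            raise ValueError("no leaf available: cycle or unknown prerequisite")
        done.extend(ready)
    return done


order = execution_order(graph)
```

The goal ends up last in `order`: each step before it is a leaf at the moment it is taken, so the test suite can stay green between steps.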

The name of the method derives from the Mikado game, where the goal is to pick a particular stick out of a tangled pile:

[Image: a pile of Mikado sticks]

The winning strategy is to pick the leaves first, like the stick on top, because trying to pick the stick you want right away will destroy everything else around it.

And finally, Kent Beck's 4 strategies for design

In his Responsive Design talk (screen sizes aren't the topic here), Kent Beck makes a brain dump of the strategies he uses for working with existing code (and adding new code). It's a good thing to formalize and be aware of the different roads you can take to go from A to B and add a feature to an existing mess. We often have a bias towards one or two of these techniques:

  • Leap: you simply jump from A to B, adding code until the feature works. This is mostly done for simple goals, and it may be dangerous for a legacy application you're unfamiliar with, as you may break unrelated things and not know which of your 100 lines of code is the problem.
  • Parallel: a road less travelled (unfortunately) is to leave multiple designs in place for a feature, phasing in the new while phasing out the old. This is common practice with libraries (deprecating a method after creating a new one with a different signature) but can also be done inside the project. You can define multiple interfaces on an object, so that it is asked to support the old set of methods plus the new one. Overusing this strategy will for sure lead to an explosion of code.
  • Stepping Stone: like in a pond or a garden, you create intermediate goals that will take you, when taken together, to the final goal. Extracting a method is simple: then you can move it to a collaborator; then you can substitute the collaborator with another object to modify the behavior.
  • Simplification: the best part of legacy code refactorings, but you have to put in a lot of effort before arriving at a point where you can do it. You have to extract, move and reconfigure code before you reach the point where a field or a method can be eliminated, or a class can be inlined because it does not serve a purpose anymore. I remembered this technique incorrectly: it consists of making many assumptions (much like we do with user stories and TDD) and solving a simpler problem as a preliminary step, before jumping into the real one.
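As an illustration of the Parallel strategy from the list above, here is a minimal sketch in which an old method and its replacement coexist. The `PriceList` class and both method names are hypothetical; the only real API used is Python's standard `warnings` module.

```python
import warnings


class PriceList:
    """Hypothetical class carrying two designs at once."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, product, currency="EUR"):
        # New design: the currency is explicit in the signature and the result.
        return (self._prices[product], currency)

    def get_price(self, product):
        # Old design, kept alive for existing callers while they migrate.
        warnings.warn(
            "get_price() is deprecated; use price_of()",
            DeprecationWarning,
            stacklevel=2,
        )
        amount, _currency = self.price_of(product)
        return amount
```

The old method delegates to the new one, so there is a single implementation to maintain; once no caller triggers the warning any more, the old method can be deleted.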
Published at DZone with permission of Giorgio Sironi, author and DZone MVB.


Comments

Matt Avery replied on Tue, 2012/03/27 - 9:31am

I am so glad to see this discussion come back to the main stream.  Thank you for posting this!

Code without unit tests still appears to be the norm.  I'm a big fan of the "Extract Method" refactoring in Eclipse, so much so that I have the hot-key combo memorized (Alt+Shift+M).

I also want to add a few bullets of my own.

1. Create a test suite for each package you touch and a top level test suite that runs all of the test suites at once.  In Eclipse it's easy to add new unit tests to the package level test suites using "recreate test suite" from the package explorer context pop-up.

2. Use a code coverage tool to see what is being executed by your test suite.  One of the tangible benefits of code coverage is that it helps identify dead code.  Sometimes you will discover that some obtuse code that you have been trying to decipher is not even being executed.  I use the EclEmma plugin for Eclipse.

3. Write integration tests and make recordings of their state or behavior.  Full disclosure here -- I am plugging my OS project, the test object recorder, ThOR.  This tool enables the developer to convert an integration test into a unit test by mimicking external systems, such as a database or web service.

Keep up the good work.  I have not heard of the Mikado method, so I'm off to read up on it and see if there is anything I can add to my toolbox.
