I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 638 posts at DZone.

Software versions, the necessary evil

04.10.2012

From the dawn of time, versioning of releases has been an arbitrary way to indicate the advancement of software. Version numbers ranging from 1.0 to 2.6.42p23 were specified with a per-project policy, and developers tried to pin their dependencies to the most precise versions to avoid regressions.

The goal of version numbers is to describe the evolution of the API of a library or an application, listing which new features are available but also which backward compatibility breaks should be dealt with before a possible upgrade. But the JAR Hell and DLL Hell scenarios tell us that having too many versions of a library around is not desirable.

Semantic Versioning

Semantic Versioning is a numbering scheme that prescribes rules that releases have to respect:

  • 1.x -> 2.x is a new major release, which may contain API breakage.
  • 1.1.x -> 1.2.x is a new minor release, which may contain new features or API methods. At the same time, minor releases cannot change the API in an incompatible way.
  • 1.1.1 -> 1.1.2 is a new patch release, which contains only bug fixes. Patch releases cannot add new functionality or API changes.
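The three rules above can be sketched as a small classifier. This is only an illustration in Python, assuming plain MAJOR.MINOR.PATCH version strings with no pre-release suffixes; the function names parse_version and classify_bump are hypothetical and do not belong to any existing library.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    parts = version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def classify_bump(old: str, new: str) -> str:
    """Name the kind of release that goes from `old` to `new`."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError("new version must be greater than old version")
    if new_v[0] != old_v[0]:
        return "major"  # may contain API breakage
    if new_v[1] != old_v[1]:
        return "minor"  # new features, no incompatible changes
    return "patch"      # bug fixes only


print(classify_bump("1.1.1", "1.1.2"))  # patch
print(classify_bump("1.1.2", "1.2.0"))  # minor
print(classify_bump("1.2.0", "2.0.0"))  # major
```

Comparing the parsed tuples rather than the raw strings matters: as strings, "1.10.0" would sort before "1.9.0".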

Semantic versioning is indeed an improvement over wild numbering schemes, but I think it encourages old-style releases, with multiple branches to maintain and bug fixes to backport. For example, it's common in this model to have a 1.1 product line branched from a current 1.2, and to backport every available bug fix (but not features) from the latter to the former. I guess semantic versioning assumes a project of a large size that can bear the costs of multiple branches easily.

Another problem with semantic versioning is that with many frequent releases the minor number grows quickly, and the scheme would easily get to versions like 1.80. Bug fixes and new features may not be mixed in a maintenance branch, so the Feature Toggle model is not applicable there.
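For readers unfamiliar with the Feature Toggle model, here is a minimal sketch in Python: an unfinished feature ships in the same branch as the bug fixes but stays disabled behind a flag. The flag name, the configuration dictionary and both functions are hypothetical and chosen only for illustration.

```python
# Hypothetical toggle configuration; in practice it might come from a config file.
FEATURE_TOGGLES = {"new_report_format": False}


def is_enabled(feature: str) -> bool:
    """Unknown features are treated as disabled."""
    return FEATURE_TOGGLES.get(feature, False)


def render_report(total: int) -> str:
    if is_enabled("new_report_format"):
        return f"Total: {total:,} items"  # new feature, shipped but switched off
    return f"total={total}"               # existing behaviour


print(render_report(1200))  # total=1200
```

With the toggle off, the release behaves like the previous one even though the code for the new feature is already in the branch.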

So, are we sure multiple release branches are always the right solution?

The infinite version

At the other extreme of the spectrum we find the infinite versioning scheme proposed by Jeff Atwood. For example, Chrome updates silently, making use of large compressed binary diffs, so that the process is seamless for the user. It doesn't make sense to talk about versions when they change every hour.

WordPress can also update mostly automatically (but not silently): given the right permissions on its folder, it can detect a new version via the browser itself, download it and replace the old files. When you have a fast and active update channel, it's easier to bump up version numbers with frequent releases.

The special requirement that has led both these projects to this updating mechanism is the availability of security fixes: it's more important to close a bug that exposes the user's files than to allow them to pin to a fine-grained version number.

Of course, silent updates are not an option for libraries like my project PHPUnit_Selenium: apart from the technical difficulties in PEAR or other packaging schemes, the principle of repeatable builds tells us that if a version works we should stick with it until we are ready to upgrade and check for possible deprecated, removed and added features. The last thing we want is a red build that has nothing to do with our production code, caused by a library updating silently.
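One way to picture the repeatable build principle is a check that fails fast when an installed library drifts from its pinned version. The sketch below is in Python and is not tied to PEAR or any real packaging tool; the function name find_pin_violations, the library names and the version numbers are all hypothetical.

```python
from typing import Dict, List


def find_pin_violations(pinned: Dict[str, str], installed: Dict[str, str]) -> List[str]:
    """Return one message per library whose installed version differs from its pin."""
    violations = []
    for name, wanted in sorted(pinned.items()):
        actual = installed.get(name)
        if actual is None:
            violations.append(f"{name}: pinned to {wanted} but not installed")
        elif actual != wanted:
            violations.append(f"{name}: pinned to {wanted} but found {actual}")
    return violations


# Hypothetical library names and version numbers, for illustration only.
pinned = {"example_lib": "1.2.6", "other_lib": "0.9.1"}
installed = {"example_lib": "1.3.0", "other_lib": "0.9.1"}

for message in find_pin_violations(pinned, installed):
    print(message)  # example_lib: pinned to 1.2.6 but found 1.3.0
```

Running such a check before the test suite makes a silent upgrade show up as an explicit pin violation instead of a mysterious red build.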

Seamless updates

But in the single branch approach, bug fixes and additions to the API are mixed together, and released very frequently. This approach may add some maintenance costs due to the frequent updates, but we all know integration is simpler when done in small batches. Just like we push our changes to the CI server every day, why don't we upgrade libraries at least every month?

Imagine how much simpler the world would be if updates were more frequent: PHP 4 would have been dead for a long time, and PHP 5.2 would have almost vanished now that 5.3 and 5.4 are maintained. The cost of maintenance would be shifted from the open source team (maintaining multiple branches and backporting code) to the development team doing continuous upgrades.

Published at DZone with permission of Giorgio Sironi, author and DZone MVB.
