Agile Zone is brought to you in partnership with:

John Sonmez is a Pluralsight author of over 25 courses spanning a wide range of topics from mobile development to IoC containers. He is a frequent guest on podcasts such as DotNetRocks and Hanselminutes. John has created applications for iOS, Android and Windows Phone 7 using native tools, HTML5, and just about every cross platform solution available today. He has a passion for Agile development and is engaged in a personal crusade to make the complex simple. John is a DZone MVB and is not an employee of DZone and has posted 84 posts at DZone. You can read more from them at their website. View Full User Profile

We Can’t Measure Anything in Software Development

02.12.2013
| 14031 views |
  • submit to reddit

Baccarat is an interesting card game that you’ll find at many casinos.  The objective of the game is to correctly predict whether the bank or player will win a hand.

cards

In Baccarat the scoring for a hand is very simple, add up all the cards at their face value with face cards being worth 10 and only count the total in the ones column.

6 + 7 + J = 23 = 3

A + 4 = 5

The highest possible hand is 9 and whoever has the highest hand wins.  If the player and banker have the same hand, it is a tie.

I won’t go into the details of how the number of cards are drawn is determined, but if you are interested you can find that information on Wikipedia.  Basically, you end up having pretty close to a 50 / 50 chance of either the player or banker winning a hand.  (Of course the house edge still is about 1.06% in the best case.)

The interesting thing about Baccarat though, is that despite the odds, despite common sense, despite the understanding that the game is completely random, people will still sit there and record every single hand and score trying to use it to look for patterns to predict future results.

These poor deluded souls actually think they are measuring something on these score cards, as if what happened in the last hand will in any way affect what will happen in the next hand.

After many years of trying to find the secret formula for measuring software development activities, I’ve come to the conclusion that trying to measure just about any aspect of software development is like trying to measure the odds of a future Baccarat hands based previous Baccarat hands.

Why we want to measure software development

It’s understandable why we want to measure software development—we want to improve.  We want to find out what is wrong and fix it and we want to know when things go wrong.

After all, who hasn’t heard the famous quote:

“What gets measured gets improved.”

Don’t we all want to improve?

Somehow we get stuck with this awful feeling that the opposite is true—that what doesn’t get measured doesn’t get improved.

guilty

And of course we feel guilty about it, because we are not doing a good job of measuring our software development practices.

Just like the avid Baccarat gambler, we want to believe there is some quantifiable thing we can track, which will give us information that can give us the edge.

Sometimes the reason for wanting to measure is more sinister practical, we want to evaluate the individuals on our team to see who is the best and who is the worst.

If we could figure out how to measure different aspects of software development, a whole world of opportunities open for us:

  • We can accurately give customers estimates
  • We can choose the best programming language and technology
  • We can figure out exactly what kind of person to hire
  • We can determine what kind of coffee produces the best code

How we try

I’ve been asked by many managers to come up with good metrics to evaluate a software development team.

I’ve tried just about everything you can think of:

  • Lines of code written
  • Bugs per developer
  • Bugs per line of code
  • Defect turn around time
  • Average velocity
  • Unit test code coverage percentage
  • Static analysis warnings introduced
  • Build break frequency

I’ve built systems and devised all kinds of clever ways to measure all of these things.

I’ve spent countless hours breaking down backlogs to the smallest level of detail so that I could accurately estimate how long it would take to develop.

I’m sure you’ve probably tried to measure certain aspects of software development, or even tried to figure out what is the best thing to measure.

It’s just too hard

No matter what I measure or how I try to measure it, I find that the actual data is just about as meaningless as notebook full of Baccarat hands.

One of the biggest issues with measuring something is that as soon as you start measuring it, it does start improving. 

What I mean by this is that if I tell you that I am going to start looking at some metric, you are going to try and improve that metric.  You won’t necessarily improve your overall productivity or quality, but you’ll probably find some way—intentional or not—to “game the system.”

Some managers try to get around this issue by just not telling the team what they are being measured on.  But, in my opinion, this is not a good idea.  Holding someone accountable to some realistically arbitrary standard without telling them what, is just not very nice at all, to put it mildly.

But really the biggest reason why it is too hard to measure aspects of software development, is that there are just way too many variables.

  • Each software development project is different
  • Each feature in a project is different
  • Software developers and other team members are different
  • From day to day even the same software developer is different.  Did Jack’s wife just tell him she was cheating on him?  Did Joe just become obsessed with an online game?  Is Mary just sick of writing code this week?
  • As you add more unit tests the build time increases
  • Different team members go on PTO
  • Bob and Jim become better friends and chat more instead of work

The point is everything is changing every day.  Just about every aspect of software development is fluid and changing.

There is not one metric or even a set of metrics you can pick out that will accurately tell you anything useful about a software development project.  (At least I have never seen one at any software development shop I’ve ever been at on consulted at.)

factory

If you were building widgets in a factory, you could measure many qualities about that widget making process, because much of it would be the same from day to day, but with software development, you are always exploring new territory and a 1000 different variables concerning how you are developing the software changing at the same time.

Measuring without measuring

So am I basically saying that metrics in software development are completely worthless and we shouldn’t bother to track anything?

No, not exactly.

What I am saying is that trying to use metrics int the same way that we measure the average rainfall in a city, or running pace improvement by looking at its average over time, doesn’t really work in software development.

We can track the numbers, but we can’t draw any good conclusions from them. 

For example, say you track defects per lines of code and that number goes up one week, what does it mean?  Any number of things could have caused that to happen or it could just be a totally random fluke.  You can’t really know because there isn’t a knob you can turn and say “ah, I see we turned up the coffee bitterness factor to 3 and it resulted in more bugs.”  Instead there are 500 knobs and they all changed in random directions.

knob

So, I am saying don’t look at how the numbers of any particular metric are moving from day to day or week to week and expect that it means anything at all, instead look for huge deviations, especially if they are sustained.

If all of a sudden your average team velocity dropped down to almost nothing from some very high number, you won’t know what caused it, but you’ll know that it is much more likely that there was one single knob that got cranked in some direction and you’ll at least have some idea what to look for.

You really have to treat the software development process more like a relationship than like a factory.

I don’t have a series of metrics I use to evaluate my relationship with my wife or my friends.  I don’t secretly count how many times my wife sighs at me in a day and track it on a calendar to determine our relationship quality factor.

Instead what I do is talk to her and ask her how things are going, or I get a more general idea of the health of the relationship by being involved in it more.

Team retrospectives are a great way to gauge the temperature of the team. Ask the team members how things are going.  They will have a pretty good idea if things are improving or slowing down and what the effectiveness level is.

Measure not, but continuously improve, yes

So kick back, don’t worry so much.  I promise I won’t tell Six Sigma that you aren’t using metrics.

Instead focus on continuously improving by learning and applying what you learn.  If you can’t notice enough of a difference without metrics, metrics wouldn’t have helped you anyway, because the difference would just be lost in variance anyway.

 

Published at DZone with permission of John Sonmez, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Hendy Irawan replied on Tue, 2013/02/12 - 8:13am

 Some can be relatively successful in doing this.

For example, Ubuntu releases a new major version every exactly 6 months. Maybe they're not perfect. But it's not random either. Asking how people are doing helps, but a formal process also helps in a significant way.

Dapeng Liu replied on Wed, 2013/02/13 - 12:14pm

great analogy between software development and relationship 

John Sextro replied on Wed, 2013/02/13 - 8:37pm

Wow.  Does ThoughtWorks Studio know they are sponsoring this article?  I doubt that Martin Fowler would approve of this sentiment.

Cindy Joanning replied on Wed, 2013/02/20 - 11:10am

 I so agree. If you have a good team lead who knows how to "read" the team and make adjustments, then things go much more smoothly. Such as reassigning critical work when someone gets ill or realizing when someone gets overwhelmed and helps out. This sort of thing is hard to measure.

Hubert Rabago replied on Wed, 2013/02/20 - 12:50pm

It can be hard to explain to someone who doesn't write code for a living and doesn't interact with several other coders of different levels how metrics can be misleading at best.  Sometimes they try to apply metrics that work in other industries, such as sales or customer support, and question why it doesn't work for software development. 

The closest metaphor I could come up with is Art. To an outsider like me, it would be hard to put metrics that explain how one artwork is better or worse than another.  There are so many qualities that can be used, but they are all hard to quantify.  Do the colors blend well, is the shadow just right, is the texture too much?  The correct answer for each is "it depends".  Is the artist going for a darker emotion, or a sunnier one?  Is the target demographic kids in a daycare or an art gallery downtown?

It's similar with software.  Which patterns were applied?  How well tested is it?  How maintainable is the code?  And the follow up to each of these questions is - is that the right answer?  The pattern may be incorrect, there may be too many tests for minor features but not enough for the major features, or it may even be overengineered for something that will only be used once or twice a year in some background process.

There's a reason they say a good programmer can be 28 times more productive than an average programmer. I don't know if the number there is correct, but I believe in the sentiment.  There are so many ways that the work can be done better, but I can't say I know how to quantify any of them.  Not even "finish on time".

Florin Jurcovici replied on Thu, 2013/02/21 - 1:53am

IMO that's a stupid idea. Measuring how many tests fail, looking at cyclomatic or halstead complexity of the codebase, and taking action, all of these do help and make sense.

You might not be able to measure everything, but you definitely can't improve what you can't measure.

Maybe you need to learn more about measuring - measuring isn't just putting a fixed number on something.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.