
Israel Gat ("agile_exec") is recognized as the architect of the agile transformation at BMC Software. Under his leadership, BMC software development grew from zero to 1,000 Scrum users in four years. Dr. Gat currently focuses on technical debt, large-scale implementations of lean software methods, and devops.

What 108M Lines of Code Tell Us


Results of the first annual report on application quality have just been released by CAST. The company analyzed 108M lines of code in 288 applications from 75 companies across various industries. In addition to the ‘usual suspects’ – COBOL, C/C++, Java, .NET – CAST included Oracle 4GL and ABAP in the report.

The CAST report is quite important in shedding light on the code itself. As explained in various posts on this blog, this transition from the process to its output is of paramount importance. Proficiency in the software process is a bit elusive; the ‘proof of the pudding’ is in the output of that process. The ability to measure code quality enables effective governance of the software process. Moreover, Statistical Process Control methods can be applied to samples of technical debt readings. Such application is most helpful in striking a good balance in ‘stopping the line’ – neither too frequently nor too rarely.
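To make the Statistical Process Control idea concrete, here is a minimal sketch (mine, not CAST's) of Shewhart-style control limits applied to periodic technical-debt readings. The sample values and the 3-sigma threshold are illustrative assumptions, not figures from the study.

```python
# Hypothetical sketch: Statistical Process Control over periodic
# technical-debt readings (e.g. $ of debt per line, sampled each sprint).
# All numbers below are invented for illustration.

def control_limits(samples, sigmas=3.0):
    """Return (mean, lower, upper) Shewhart-style control limits."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    std = variance ** 0.5
    return mean, mean - sigmas * std, mean + sigmas * std

# Debt-per-line readings from the last eight iterations (made-up numbers).
readings = [2.70, 2.85, 2.78, 2.90, 2.81, 2.76, 2.88, 2.83]
mean, lo, hi = control_limits(readings)

latest = 3.40  # newest reading
if latest > hi:
    print("Out of control: stop the line and pay down debt")
else:
    print("Within limits: keep shipping")
```

A reading outside the limits signals a real shift in debt accumulation rather than normal variation, which is exactly the "neither too frequently nor too rarely" balance described above.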

According to CAST’s report, the average technical debt per line of code across all applications is $2.82. This figure, depressing as it might be, is reasonably consistent with a quick eyeballing of Nemo. It is somewhat lower than the average technical debt figure reported recently by Cutter for a sample of the Cassandra code. (The difference is probably attributable to the difference in sample sizes between the two studies.) What the data means is that the average business application in the CAST study is saddled with over $1M in technical debt!
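A quick back-of-the-envelope check of that last claim, using only the figures quoted above (the arithmetic is mine, not the report's):

```python
# Figures as quoted from the CAST study; the derivation is a sanity check.
total_loc = 108_000_000      # lines of code analyzed
applications = 288           # applications in the sample
debt_per_line = 2.82         # average technical debt, $ per line

avg_app_loc = total_loc / applications        # 375,000 lines per application
avg_app_debt = avg_app_loc * debt_per_line    # about $1,057,500 of debt

print(f"Average application: {avg_app_loc:,.0f} LOC, ${avg_app_debt:,.0f} of debt")
```

Dividing 108M lines by 288 applications gives 375,000 lines per application, which at $2.82 per line is roughly $1.06M, consistent with the "over $1M" figure.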

An intriguing finding in the CAST report is the impact of size on the quality of COBOL applications.  This finding is demonstrated in Figure 1. It has been quite a while since I last saw such a dramatic demonstration of the correlation between size and quality (again, for COBOL applications in the CAST study).

Source: First Annual CAST Worldwide Application Software Quality Study – 2010

Another intriguing finding in the CAST study is that “application in government sector show poor changeability.” CAST hypothesizes that the poor changeability might be due to a higher level of outsourcing in the government sector compared to the private sector. As Amy Thorne pointed out in a recent comment posted in The Agile Executive, it might also be attributable to the incentive system:

… since external developers often don’t maintain the code they write, they don’t have incentives to write code that is low in technical debt…


Congratulations to Vincent Delaroche, Dr. Bill Curtis, Lev Lesokhin and the rest of the CAST team. We as an industry need more studies like this!


Published at DZone with permission of Israel Gat, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)



Anthony Mak replied on Tue, 2010/09/28 - 9:07pm

 Hi Israel,

 Great article.

 >… since external developers often don’t maintain the code they write, they don’t have incentives to write code that is low in technical debt…

I find your explanation of why government code seems to have poorer quality very enlightening.

 So as size increases, the Total Quality Index decreases. Is there any study or paper on how best to tackle or reverse this problem?

 In your study, are there any special circumstances where an increase in size does not lead to a decrease in the Total Quality Index?

 Anthony Mak

Alessandro Santini replied on Wed, 2010/09/29 - 6:46am

Dear Dr. Gat,

thanks for this interesting reading - I am relatively new to the concept of Technical Debt, and statistics is not my strongest area of expertise, but I nevertheless tried to draw some conclusions from this article.

  1. Surprisingly, the highest TQI is not achieved with the smallest number of code lines; it is achieved by an application that is roughly 10x larger. The same can be said for the lowest TQI, which occurs at 1,000 kLOC instead of 10,000 kLOC.
    This makes me wonder whether the samples are uniform: are all these applications written for the same industry? Is their logical complexity comparable? Have these applications been developed by teams with comparable skill sets, experience and education?
  2. Looking at the graph, I can see that the highest TQI is 3.7 and the lowest is somewhere close to 2.75; per se, this does not seem a dramatic drop considering that the increase in LOC is 1000x (or 100x if you consider the highest and lowest TQI). Now, bearing in mind the $2.82 of average technical debt per line, is there a function to correlate the TQI with a technical debt amount?
  3. Is there a correlation between other software metrics like cyclomatic complexity and technical debt?
  4. Just a note: the use of a linear scale for the X axis would have emphasized that the TQI does not decrease as linearly as it seems at first sight.



Stephane Vaucher replied on Thu, 2010/09/30 - 1:01pm

Honestly, looking at the abstract, I find that it does not say much. My main concern is that the authors do not provide enough information to support their claims. Specifically:
  1. It is unclear what they consider a violation. Their tools might support 10x more potential violations for Java code than for COBOL code. That alone might explain findings 1 and 2.
  2. Using static analysis to assess performance is a stretch. Their tools might not even be able to tell whether they are analysing dead code.
  3. Size is a factor that impacts quality? Funny, I would have said the opposite: bad-quality software might be bloated because a change requires more code. Old COBOL code might be less understandable (to new coders), thus changes might be bigger and then break encapsulation... Correlations go both ways. Also, size is generally correlated with the age of a system, and age is correlated with the number of developers who have touched the system. Thus, age and number of devs are confounding causes of bad quality. Everything I said is as valid as their explanations.
  4. Finding 6 about top code violations is a bit weird. Gotos are not bad; it is the unstructured programming that gotos allow that is bad. In the report, the authors do not distinguish good from bad gotos. The authors also seem to find it funny that refactoring is not performed on legacy COBOL code. COBOL refactoring is not as easy and safe as Java/.NET refactoring. If we consider that the main cost of development iterations on legacy COBOL code comes from the execution of regression tests, obviously there are many gotos in the code. You could say these gotos produce technical debt, but only if they are used in a dangerous manner and it is not the standard way software is developed in the organisation.
  5. Finally, I'm not sure what the justification is for the statement that high fan-out is bad. High fan-out is sometimes correlated with faults, but rarely when size is controlled for. Additionally, high fan-out can be a sign of software reuse.
  6. Their conclusion that good practices are slow to be adopted is odd. These are large systems. Do the authors really expect a company to hire a team of maintainers to clean up its code if there is no business reason? Let's remember Y2K: how much did it cost the industry just to check the number of digits used in a system, and how much would it cost to correct these violations? We can talk about technical debt, but, as they state, it is a loan that the software owners are willing to carry.

Baruch Atta replied on Mon, 2010/10/25 - 3:32pm

If the "...Results of the first annual report on application quality..." could be believed, then I would be impressed. However, the measures and graphs all boil down to "...CAST hypothesizes that the poor changeability might be due to higher level of outsourcing in the government sector...".

In other words, it is all still a hypothesis, black magic, waving of wands, wearing of funny masks.

We have known for decades what good code looks like. We know what bad code looks like. We know if a program can be modified easily. We know that good systems have good documentation.

So, I view with much skepticism any new attempt to measure "application quality".

As for "average technical debt per line of code" measures, I have no doubt that bad, spaghetti, undocumented code is more costly to maintain, and the figure of $2.82 per line of code may be accurate. Or not. What is that figure compared to? What is the average technical debt per line of code for well written, well documented code?

The difference between the cost to maintain bad code and the cost to maintain good code - that is what is important. Is the cost of extra maintenance of bad code worth the re-engineering effort to transform it into good code? That is the "to be or not to be" question. Is it worth it to re-engineer? I say - in every case - YES!

Is it better to migrate to a "more modern" platform, from COBOL to Java?

I say, in very many cases, no. No, because the risks of migration are high, and the costs are even higher. The return is marginally small.

It is probably cheaper and less risky to migrate from a badly coded COBOL system to a well coded COBOL system, and then, if desired, to a new platform, than it is to migrate directly to a new platform. I know, I have done it a few times.

Two final comments.

Performing a GOTO to a paragraph-end is considered good programming practice in COBOL. It is an easy way to bypass code without deeply nested IF statements. And it doesn't add to complexity - just the opposite.

And I would disagree with the report, and say that the reason government-owned systems are harder to maintain is not the outsourcing. The real reason is strict cost controls and the lack of constant code improvement efforts. That is, government code is left to run longer without modifications than commercial systems code. It is the cost containment in government systems that costs more in the long run. Penny smart but pound foolish. That's my two cents.

Baruch Atta
