Ted Neward is the Principal at Neward & Associates, a developer services company. He consults, mentors, writes and speaks worldwide on a variety of subjects, including Java, .NET, XML services, programming languages, and virtual machine/execution engine environments. He resides in the Pacific Northwest. Ted is a DZone MVB and is not an employee of DZone and has posted 50 posts at DZone.

Of communities, companies, and bugs (Or, “Dr Dobbs Journal is a slut!”)

08.06.2011

Andrew Binstock (Editor-in-Chief at DDJ) has taken a shot at Oracle’s Java7 release, and I found myself feeling a need to respond.

In his article, Andrew notes that

… what really turned up the heat was Oracle's decision to ship the compiler aware that the known defects would cause one of two types of errors: hang the program or silently generate incorrect results. Given that Java 7 took five years to see light, it seems to me and many others that Oracle could have waited a bit longer to fix the bug before releasing the software. To a large extent, there is a feeling in the Java community that Oracle does not understand Java (despite the company's earlier acquisition of BEA). That may or may not be, but I would have expected it to understand enterprise software enough not to ship a compiler with defects that hang a valid program.


There are so many things in this paragraph alone that I want to respond to that I feel it necessary to deconstruct it and respond to each individually:

  • “Oracle’s decision to ship the compiler aware that the known defects…” According to the post that went out to the Apache Solr mailing list (seen quoted in a blog post), “These problems were detected only 5 days before the official Java 7 release, so Oracle had no time to fix those bugs… .” I’m sorry, folks, but five days before the release is not a “known defect”. It’s a late-breaking bug. This is yellow journalism, if you ask me.
  • “Given that Java 7 took five years to see light…” Much of that time being the open-sourcing of the JDK itself (1.5 years) and the Oracle acquisition (1.5 years), plus the community’s wrangling over closures that Sun couldn’t find a way to bring consensus around. Remember when they stood on the stage at Devoxx one year and promised “no closures” only to turn around the following year at the same conference and say, “Yes closures”? Sun had a history of flip-flopping on commitments worse than a room full of politicians. Slapping Oracle with the implicit “you had all this time and you wasted it” argument is just unfair.
  • “… it seems to me and many others that Oracle could have waited a bit longer to fix the bug before releasing the software.” First of all, what “many others”? Remember when Sun proposed the “Java7 now with less features vs Java7 later with more features” question? Overwhelmingly, everybody voted for now, citing “It’s been so long already, just ship *something*” as a reason. If Oracle slipped the date, the howls would still be echoing across the hills and valleys, and Andrew would be writing, “If Oracle commits to a date, they really should stick with this date…” But secondly, remember, the bug was noticed five days before the release. Those of you who’ve never seen a bug show up during a production deployment roll out, please cover your eyes. The rest of you know good and well that sometimes trying to abort a rollout like that mid-stream causes far more damage than just leaving the bug in place. Particularly if there’s a workaround. (Which there is, by the way.)
  • “To a large extent, there is a feeling in the Java community that Oracle does not understand Java.” Hmm. Not surprising, really, when pundits continually hammer away how Oracle doesn’t get Java and doesn’t understand that everything should be given away for free and when people bitch and complain you should immediately buy them all ponies and promise that they’ll never do anything wrong again…. Seriously? Oracle doesn’t understand Java? Or is it that Oracle refuses to play the same bullshit game that Sun played? Let’s see, what is Sun’s stock price these days? Oh, right.
  • “I would have expected it to understand enterprise software enough…” And frankly, I would have expected an editor to understand journalism enough to at least attempt a fair and unbiased story. It’s disappointing, really. Andrew has struck me as a pretty nice and intelligent guy (we’ve chatted over email), but this piece clearly falls way short on a number of levels.
  • “… not to ship a compiler with defects that hang a valid program.” Let’s get to the next paragraph to get into this one.

Andrew’s next paragraph reveals some disturbing analysis:

The problem, from what is known so far, derives from a command-line optimization switch on the Java compiler. This switch incorrectly optimized loops, resulting in the various reported errors. In Java 7, this switch is on by default, while it was off by default in previous releases. Regardless of the state of the switch, the resulting optimizations were not tested sufficiently.

This is a curious problem, because compilers are one of the most demonstrably easy products to test. Text file, easily parsed binary file out. Or earlier in the compilation process: text file in, AST out. The easy generation of input and the simple validation of output make it possible to create literally tens of thousands of regression tests that can explore every detail of the generated code in an automated fashion. These tests are known to be especially important in the case of optimizations because defects in optimized code are far more difficult for developers to locate and identify. The implicit contract by the compiler is that going from debug code during development to optimized code for release does not change functionality. Consequently, optimizations must be tested extra carefully.


Actually, no. The problem, according once again to the Solr mailing list entry, is with the HotSpot JIT compiler, not with javac, the source compiler. Andrew demonstrates a shocking lack of comprehension with this explanation: JIT compilation is nothing like traditional compilation (unless you hyperfocus on the optimization phases of the traditional compiler toolchain), and often has nothing to do with ASTs and so forth. In short, Andrew saw “compiler” and basically leapt to conclusions. It’s a sin of which I’m guilty as well, but damn, somebody should have caught this somewhere along the way, including Andrew himself, like maybe by contacting Oracle and asking them to explain the problem?

Nah, it’s much better (and gets DDJ a lot more hits) if we leave it the way it’s written. Sensationalism sells. Hence my title.

And, it turns out, if they’re optimizations in the JITter, they can be disabled:

At least disable loop optimizations using the -XX:-UseLoopPredicate JVM option to not risk index corruptions.

Please note: Also Java 6 users are affected, if they use one of those JVM options, which are not enabled by default: -XX:+OptimizeStringConcat or -XX:+AggressiveOpts


Oh, did we mention? It turns out these optimizations have been there in Java 6 as well, so apparently not only is Oracle an idiot for not finding these bugs before now, but so is the entire Java ecosystem. (It seems these bugs only appear now because the optimizations are turned on by default now, instead of turned off.)
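For anyone who wants to confirm which of the switches named in the Solr advisory a running JVM was actually started with, here is a minimal sketch. The class name `LoopFlagCheck` and its helper methods are hypothetical, made up for illustration; the only real API used is `RuntimeMXBean.getInputArguments()`. It is a plain string match on the arguments the JVM reports, so it will not tell you what the defaults are, and it does not resolve the case where a flag is given twice with opposite signs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class LoopFlagCheck {
    // Flags copied verbatim from the Solr advisory quoted above.
    static final String DISABLE_LOOP_PREDICATE = "-XX:-UseLoopPredicate";
    static final String OPTIMIZE_STRING_CONCAT = "-XX:+OptimizeStringConcat";
    static final String AGGRESSIVE_OPTS = "-XX:+AggressiveOpts";

    // True if the argument list explicitly carries the suggested workaround flag.
    static boolean loopPredicateDisabled(List<String> jvmArgs) {
        return jvmArgs.contains(DISABLE_LOOP_PREDICATE);
    }

    // True if the argument list opts in to either optimization the advisory
    // names as affecting Java 6 users.
    static boolean optInFlagsPresent(List<String> jvmArgs) {
        return jvmArgs.contains(OPTIMIZE_STRING_CONCAT)
            || jvmArgs.contains(AGGRESSIVE_OPTS);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not the program's own args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Workaround flag present: " + loopPredicateDisabled(jvmArgs));
        System.out.println("Opt-in optimization flags present: " + optInFlagsPresent(jvmArgs));
    }
}
```

Running it as `java -XX:-UseLoopPredicate LoopFlagCheck` should report the workaround flag as present; running it with no extra options should report both checks as false.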

Andrew continues:

But even if Oracle's in-house testing was not complete, I have to wonder why they were not testing the code on some of the large open-source codebases currently available. One program that reported the fatal bug was Apache Solr, which most developers would agree is a high profile, open source project. Projects such as Solr provide almost ideal test beds: a large code base that is widely used. Certainly, Oracle might not cotton to writing UATs and other tests to validate what the compiler did with the Solr code. But, in fact, it didn’t have to write a test at all. It simply needed to run the package and the SIGSEGV segmentation fault would occur.


Oh, right. With the acquisition of Sun, Oracle also inherited a responsibility to test their software against every open-source software package known to man. Those people working on those projects have no responsibility to test it themselves, it’s all Oracle’s fault if it all doesn’t work right out of the box. Particularly with fast-moving source bases like those seen in open-source projects. Hmm.

I have to hope that this event will be a sharp lesson to Oracle to begin using the large codebases at its disposal as a fruitful proving ground for its tools. While the sloppiness I've discussed is disturbing, it's made worse by the fact that the same defects can be found in Java 6. The reason they suddenly show up now is that the optimization switch is off by default on Java 6, while on in Java 7. This suggests that Sun's testing was no better than Oracle's. (And given that much of the JDK team at Oracle is the same team that was at Sun, this is no surprise.) The crucial difference is that Oracle knew about the bugs prior to release and went ahead with the release anyway, while there is no evidence Sun was aware of the problems.


I have to hope that this event won’t be a sharp lesson to Oracle that the community is basically made up of a bunch of whiny bitches who complain when a workaroundable bug shows up in their products. Frankly, I would.

Did we mention that all of this was done on an open-source project? At any point anyone can grab the source, build it, and test it for themselves. So, Andrew, are you volunteering to run every build against every open-source project out there? After all, if this is a “community”, then you should be willing to donate all of your time for the community’s benefit, right? Where are the hordes of developers willing to volunteer and donate their time to working on the JDK itself? You’re all quite ready to throw rocks at Oracle (and before that, Sun), but how many of you are willing to put down the rock, pick up a hammer, and start working to build it better?

Yeah, I kind of thought so.

Oracle's decision was political, not technical. And here Oracle needs to really reassess its commitment to its users. Is Java a sufficiently important enterprise technology that shipping showstopper bugs will no longer be permitted? The long-term future of Java, the language, hangs in the balance.


Unless you were in the room when they made the decision, Andrew, you’re basically blowing hot air out your ass, and it smells about as good as when anyone else does. This is a blatantly stupid thing to say, and quite frankly, if Oracle refuses to talk to you ever again, I‘d say they were back to making good decisions. You can’t responsibly declare what the rationale for a decision was unless you were in the room when it was made, and sometimes not even then.

Worse than that, the Solr mailing list entry even points out that Oracle acknowledged the fix, and discussed with the community (the Solr maintainers, in this case, it seems) when and how the fix could come out:

In response to our questions, they proposed to include the fixes into service release u2 (eventually into service release u1, see [6]).


Wow. Oracle actually responded to the bug and discussed when the fix would come out. Clearly they are unengaged with the community and don’t “get” Java.

Maybe I should rename this blog’s title to “Sloppy Work at Dr Dobb’s Journal”.

Nah. Sensationalism sells better. Even when it turns out to be completely unfounded.

References
Published at DZone with permission of Ted Neward, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Guido Amabili replied on Sat, 2011/08/06 - 12:52pm

I didn't read the article fully, but aren't five days enough to make the decision, and stand by it, on whether or not to delay a software delivery?

I am rather tired of Oracle because we have found a bug in the system we are working on, caused by a bug in their stored procedure parser and compiler (Oracle 11g), which is not able to compile and execute nested loops properly...

Mike James replied on Sat, 2011/08/06 - 1:19pm in response to: Guido Amabili

I agree and would add that the entire article is bad tempered and nit picking.

It borders on libel.

David Whatever replied on Sat, 2011/08/06 - 2:49pm

Five days is not enough time to fix the issue. It is more than enough time to make a decision about whether a release candidate should become final. At very least, it is time to document as known issues with workarounds in the release notes.

Oracle has also given lower priority to fixing the issues from what I understand; they may not be fixed in the first update either.

These valid concerns about code breaking under default optimizations, as well as the lack of any ETA, make this an extremely anticlimactic and lackluster release. I expect this to lead people away from evaluating Java 7 support in their products, as now people will not be sure when the "released" version of Java will be truly ready.

And stop giving Oracle a pass because developers weren't testing apps with internal (and marked as unsupported) optimization flags or pre-release-candidate versions of OpenJDK. There was no expectation established that was needed of developers. The projects which reported this issue were testing once the JRE was in release-candidate status, which is the intended purpose of release candidates.

Reza Rahman replied on Sat, 2011/08/06 - 9:31pm

Ted,

Thanks so much for having the courage to inject some sanity into the needless histrionics! I agree with you a 100%...

Cheers,

Reza

Fab Mars replied on Sun, 2011/08/07 - 2:35am

I have to hope that this even won’t be a sharp lesson to Oracle that the community is basically made up of a bunch of whiny bitches who complain when a workaroundable bug shows up in their products. Frankly, I would.

 Totally my point.

All of this is blahblahblahblah.

 

Igor Laera replied on Sun, 2011/08/07 - 9:05am

I would accept the load on the software creators only if Java were like Linux.

But it isn't. Oracle owns it, Oracle controls it, Oracle sues people with it. Then it's their job. Microsoft does that. They have whole datacenters filled with software to test before they release a larger Service Pack. That's the reason that company has made so much dough for decades: they see it as part of their JOB.

If Oracle thinks it's too much pain to do that up front, they should offer the servers to the teams to set up their own testing environments. They have something like 100 datacenters around the world. They have an 'unbreakable' Ripoff Linux distribution. It's not like they don't know how. They simply don't want to.

It's funny that there are people who think that Oracle can have all the good stuff around Java and simply relay/outsource/drop the "challenging" stuff on the people who use it.

What's next? Asking the community to write the API docs, because it's too much work?

Reza Rahman replied on Sun, 2011/08/07 - 10:30am in response to: Igor Laera

Last I checked, the JCP defines Java, OpenJDK is an open source project, the JDK is 100% free and a lot of other people besides Oracle make money from Java :-).

Anthony Ve replied on Sun, 2011/08/07 - 12:10pm in response to: Reza Rahman

I admire the way you keep trying to talk some sense into the followers of the "Oracle is evil" cult :-)

Igor Laera replied on Sun, 2011/08/07 - 2:59pm in response to: Anthony Ve

You mean "those people" who performed the Jenkins and LibreOffice forks, because they didn't want to have all the work - but no control, no say in important things? I love this sort of simplicity and obedience. Fortunately others don't.

Reza Rahman replied on Sun, 2011/08/07 - 10:08pm in response to: Anthony Ve

Thanks for the very kind words.

It's nothing specific to Oracle (I probably know better than many others the areas that Oracle really does need to do better).

I try to speak my mind openly on what I think is right (provided I know enough about the topic at hand).

I believe a lot of the anti-Oracle sentiment is rooted in unfamiliarity, ignorance, over-generalized stereotyping and an unfounded distrust/fear (perhaps added with some agenda pushing and good old fashioned jealousy). The difference I guess is that I do regularly interact with good people working inside Oracle that do genuinely mean well.

In case of LibreOffice/OpenOffice and Hudson/Jenkins - my take on it has been that both of those situations are very murky where separating the "good guys" from the "bad guys" isn't really that cut and dry -- as is the Java/Android situation...

Cay Horstmann replied on Sun, 2011/08/07 - 11:49pm

I am not sure that the "everyone is beating on poor Oracle" meme captures the nuances of this issue.

When you look at the release notes at http://www.oracle.com/technetwork/java/javase/jdk7-relnotes-418459.html#knownissues, one issue does stand out over the usual "grief with Java plugin" or "grief with CJK input" or "grief with weird X11 issues", namely that an optimization bug can sometimes silently deliver the wrong result. 

People don't like to get the wrong result, even if the probability is less than them getting struck by lightning. We all know that because it's happened before--remember the Pentium bug?

So, if Oracle had said "Whoa, we just found out about this issue, and we'll fix it immediately, but it'll take ten days to re-run all the acceptance tests", nobody would have batted an eye. Or if they had said "Look, we've got this problem, but you've got to ship sometime, and here is how you work around this vexing issue", people would have been ok. Their problem was to say nothing at all. If you say nothing at all, you open the floodgates to "Don't use Java 7 if you use loops", "Java 7 unsafe at any speed", "Sloppy work at Oracle", and all the other hyperbole. That's a lesson well learned for anyone who needs to make a similar decision.

Andrew McVeigh replied on Mon, 2011/08/08 - 11:36am in response to: Cay Horstmann

People don't like to get the wrong result, even if the probability is less than them getting struck by lightning

I think you've summed it up very neatly and concisely. Putting the name-calling aside, getting the wrong result silently needs to be handled differently from other bugs which can be clearly/easily detected. I felt the same way about getting my Pentium replaced at the time, regardless of the probability of occurrence (I was doing numerical work at the time).

I also think that it is sad that the tone of the wider debate seems to have degenerated into 2 camps: (a) oracle is bad or (b) oracle is good. Surely this is just a process improvement that needs to happen.

Reza Rahman replied on Mon, 2011/08/08 - 12:58pm in response to: Andrew McVeigh

You are correct in that Oracle is neither "good" nor "evil" and that this is an issue of process improvement, including better and more frequent communication.

Andrew Binstock replied on Mon, 2011/08/08 - 7:19pm

Cay, AndrewM, and Reza: All correct. I am not part of the "Oracle is bad" crowd. Oracle stubbed its toe on this, and I took them to task for their decision to ship the product. There was and is no sensationalism in my editorial and I took no particular satisfaction in writing it. Ted has his own personal agenda and his own hyperbolic way of speaking ("slut"? Please!). He's a good lecturer and often has interesting technical things to say. His political views of technology are entirely up to him, but I am surprised to see him defend the release of software with known segfaults, and then bashing people who complain about this. Dr. Dobb's, however, will continue to discuss poor practices in the release of enterprise software, whether from Oracle, Microsoft, IBM or any of the many other vendors in that space.

Andrew McVeigh replied on Tue, 2011/08/09 - 4:38am in response to: Andrew Binstock

Ted has his own personal agenda and his own hyperbolic way of speaking ("slut"? Please!)

Agree that usage of the word slut is completely ridiculous (and offensive) in this context.

Steve Mcjones replied on Tue, 2011/08/09 - 10:32am

What a douche. The problem is not that there are bugs, the problem is that Oracle knew about them and decided to ignore them.

They could have:

  • Put some information into the release notes,
  • Delayed the release and fixed the bugs,
  • Turned off the compiler switch, reverting to the settings well-tested during Java6.

They have done nothing, because marketing and politics were just more important.

And now they get the deserved beating. Nothing wrong with it!

Reza Rahman replied on Tue, 2011/08/09 - 12:36pm in response to: Steve Mcjones

Take a look here: http://weblogs.java.net/blog/fabriziogiudici/archive/2011/08/02/worried-about-java-7-go-hudson-or-jeskins. One of the bugs was filed as low-priority and hence presumably worth the risk since it happens very infrequently. The other two were discovered just a few days before the release. All are assigned high priority now and are getting fixed. The only real question here is whether the most sound judgement call was made under the circumstances.

I do hope the OpenJDK team realizes things have gone too far for them not to present their side of the story themselves soon...
