I am an experienced software development manager, project manager and CTO focused on hard problems in software development and maintenance, software quality and security. For the last 15 years I have been managing teams building electronic trading platforms for stock exchanges and investment banks around the world. My special interest is how small teams can be most effective at building real software: high-quality, secure systems at the extreme limits of reliability, performance, and adaptability. Jim is a DZone MVB and is not an employee of DZone and has posted 99 posts at DZone.
Still getting my head around Continuous Deployment
In a webinar on CD, Kent
Beck explored a fundamental mismatch between rapid cycling in design
and construction, and then getting stuck when we are ready to deploy. He
argues that queuing theory and experience show that there is more
value in a system when all of the pipes are the same size, and follow
the same cycle times. Ideally, there should be a smooth flow from ideas
to design and development and to deployment, and then information from
real use fed back as soon as possible to ideas. Instead we have a choke
point at deployment.
Then there is the ROI argument: we get a faster return on the money
we spend if we deploy each piece of work as soon as it is ready.
Kent Beck also explained that based
on his experience at one company the constraints of deploying
immediately make people more careful and thoughtful: that the practice
becomes self-reinforcing, that developers stop taking risks because they
don’t have time to. Essentially problems become simpler because they
have to be.
Timothy Fitz presented a Deployment Equation:
If Information Value + Direct Value > Deployment Risk then Deploy
The idea is that Continuous Deployment increases information value by
giving us information earlier. He talked about ways to reduce risk:
- Rolling out larger changes slowly to customers, through dark launching
(hiding the changes from the front-end until ready: not exactly a new
idea) and enabling features for different sets of users.
- Extensive automated testing, supplemented with manual exploratory testing before exposing dark-launched features.
- Ensuring that you can detect problems quickly and correct them through
production monitoring, looking for leading indicators of problems, and
instant production roll back.
- An architecture that supports stability through isolation. Follow the
patterns in Release It! to minimize the chance of “stupid take the cluster out” errors.
- Locking down core infrastructure, preventing changes to certain parts of the system without additional checks.
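Going back to the Deployment Equation, here is a minimal sketch of it as a decision rule. The `Change` class, its field names and the numbers are my own illustration, not anything Timothy Fitz presented; the only thing taken from the talk is the comparison itself. It shows why the risk-reduction techniques matter: the same change flips from "hold" to "deploy" once its risk drops.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """A candidate change, scored in whatever unit the team agrees on (assumed)."""
    information_value: float  # what we expect to learn from real use
    direct_value: float       # value delivered to customers or the business
    deployment_risk: float    # expected cost of the deploy going wrong


def should_deploy(change: Change) -> bool:
    """Deploy when information value plus direct value outweighs the risk."""
    return change.information_value + change.direct_value > change.deployment_risk


# Hypothetical numbers: the same change shipped to everyone at once,
# versus dark launched to a small set of users (lower risk).
big_bang = Change(information_value=2.0, direct_value=3.0, deployment_risk=8.0)
dark_launched = Change(information_value=2.0, direct_value=3.0, deployment_risk=1.5)

print(should_deploy(big_bang))       # False
print(should_deploy(dark_launched))  # True
```

The hard part in practice is putting believable numbers on each term; the sketch only makes the trade-off explicit.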
Jez Humble at ThoughtWorks presented on Continuous Delivery:
building on top of Continuous Integration to automate and optimize
further downstream packaging and deployment activities. Continuous
Deployment is effectively an extension of Continuous Delivery. It was
mostly a re-hash of another presentation that I had already seen from
ThoughtWorks, and of course there will be a book coming out soon on all
Some questions on Continuous Delivery and Continuous Deployment
Continuous Delivery is based on the assumption that you can get
immediate feedback: from automated tests, from post-deployment checks,
from customers. How do you account for problems that don't show up
immediately, by which time you have deployed 50 or 100 or more changes?
Answer from Timothy Fitz: The first time, you revert and re-push. Then you
post-mortem and figure out how to catch it faster by looking for a leading
indicator. Performance issues can be caught by dark launching, in which
case turning off or reverting the functionality will have 0 visible
effect. Frontend issues are usually caught by A/B tests, where you can
mitigate risk by not running them at 100% of all traffic (have 80%
control, 20% hypothesis, etc.).
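The 80% control / 20% hypothesis split can be sketched as a deterministic percentage rollout. This is one common way to do it, not necessarily how Fitz's team did: hash the user ID so that each user always lands in the same bucket, and expose the feature only to buckets below a threshold. The function names, the experiment name and the user IDs are all hypothetical.

```python
import hashlib


def bucket(user_id: str, experiment: str) -> int:
    """Map a user to a stable bucket in [0, 100) for one experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_hypothesis(user_id: str, experiment: str, hypothesis_percent: int) -> bool:
    """True if the user sees the new behaviour; everyone else is control."""
    return bucket(user_id, experiment) < hypothesis_percent


# Hypothetical experiment name and user IDs.
users = [f"user-{n}" for n in range(10_000)]
exposed = sum(in_hypothesis(u, "new-checkout", 20) for u in users)
print(f"{exposed / len(users):.1%} of users see the new feature")

# Setting the percentage to 0 acts as the kill switch: nobody is exposed.
assert not any(in_hypothesis(u, "new-checkout", 0) for u in users)
```

Because the bucket depends only on the experiment name and the user ID, turning the percentage down and back up shows the feature to the same users again, which keeps the A/B comparison clean.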
Me: Follow-up on my question about
handling problems that show after 50 or 100 changes. The answer was to
revert and re-push - but revert what? A problem may not show itself
immediately. How do you know which change or changes to roll back?
Answer from Timothy Fitz: If it took 50-100 changes, then you'll be finding
the change manually. It turns out to be fairly easy even if it's been
48-96 hours: you're only looking through a few hundred very small
commits, most of which are in isolated areas unrelated to your problem.
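Fitz described a manual search. As a contrast, here is a sketch of the bisection approach (the same idea as `git bisect`), which is my substitution and not something he proposed. It assumes the commits are ordered oldest to newest, that the newest one is known to be bad, and that the problem stays present once it has been introduced. The commit names and the `shows_problem` check are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last commit is bad,
    and once the problem appears it stays in every later commit.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # problem already present: look earlier
        else:
            lo = mid + 1    # still good: the culprit is later
    return commits[lo]


# Hypothetical history of 300 small commits; the problem arrived in c0187.
history = [f"c{n:04d}" for n in range(300)]
checks: List[str] = []


def shows_problem(commit: str) -> bool:
    """Stand-in for deploying a commit somewhere safe and checking it."""
    checks.append(commit)
    return commit >= "c0187"


print(first_bad_commit(history, shows_problem))  # c0187
print(f"commits checked: {len(checks)}")
```

For a few hundred commits this needs fewer than ten checks, but only if the problem can be reproduced on demand. A problem that takes days to show itself in production does not fit that assumption, which is the point of my question.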
Me: How do you handle changes to data (contents and/or schema) on a continuous basis?
Not answered. Jez Humble talked about writing code that could work with
multiple different database versions (which would make design and
testing nasty of course), and how to automate some database migration
tasks with tools like DBDeploy, but
admitted that “databases were not optimized for Continuous Delivery”.
There were no good answers on how to handle expensive data conversions.
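To make "code that could work with multiple different database versions" concrete, here is a small sketch using Python's built-in `sqlite3` module. The `customers` table and its columns are invented for illustration; nothing here comes from Humble's talk or from DBDeploy. The reader inspects the schema and handles both the old layout (one `name` column) and the new one (`first_name` and `last_name`).

```python
import sqlite3
from typing import List


def customer_names(conn: sqlite3.Connection) -> List[str]:
    """Read customer names from either the old or the new schema."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(customers)")}
    if "first_name" in columns:
        # New schema: the name is split into two columns.
        rows = conn.execute(
            "SELECT first_name || ' ' || last_name FROM customers ORDER BY id"
        )
    else:
        # Old schema: a single name column.
        rows = conn.execute("SELECT name FROM customers ORDER BY id")
    return [row[0] for row in rows]


# Hypothetical old-version database.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
old_db.execute("INSERT INTO customers (name) VALUES ('Ada Lovelace')")

# Hypothetical new-version database.
new_db = sqlite3.connect(":memory:")
new_db.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
new_db.execute("INSERT INTO customers (first_name, last_name) VALUES ('Ada', 'Lovelace')")

print(customer_names(old_db))  # ['Ada Lovelace']
print(customer_names(new_db))  # ['Ada Lovelace']
```

Even in this toy case every query has two branches and both need tests, which is the design and testing cost I mean. It also does nothing for the expensive part: converting the existing data.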
Me: My team has obligations to ensure that the software we deliver is
secure, so we follow secure SDLC checks and controls before we release.
In Continuous Delivery I can see how this can be done along the
pipeline. But secure Continuous Delivery?
Answer from Jez Humble:
Ideally you'd want to run those checks against every version. If you
can't do that, do it as often as you can. [I didn’t expect a meaningful answer on this one, and I didn’t get one]
Somebody else’s question: Do you find users struggling to keep up and adapt to the constant changes?
Answer from Kent Beck: In practice it doesn't seem to be a problem usually
because each change is small--a new widget, a new menu item, a new
property page that's similar to existing pages. A wholesale change to
the UI would be a different story. I would try to use social processes
to support such a change--have a few leaders try the new UI first, then
Somebody else’s question: Without solid continuous
testing in place, CD is [a] fast track to continuous complaints from end
Answer from Timothy Fitz: Not always, but usually. For the
cases where it makes sense (small startup, or isolated segment that
opts-in to alpha) you can find user segments who value features 100%
over stability, and will gladly sign up for Continuous Deployment.
So what do I really think about Continuous Deployment?
OK, I can see how Continuous Deployment can work,
If: your architecture supports isolation, that it is horizontal and shallow, offering features that are clearly independent;
If: you don’t follow the all-or-none approach – that you recognize that
some kinds of changes can be deployed continuously and some parts of the
system are too important and require additional checks, tests, reviews,
and more time;
If: you build up enough trust across the company;
If: your customers are willing to put up with more mistakes in return for
faster delivery, if at least some of them are willing to help you do
your testing for you;
If: you invest enough in tools and
technology for automated layered testing and deployment and
post-deployment checking and roll-back capabilities.
Continuous Deployment is still an immature approach and there are too many holes in
it. And as Kent Beck has pointed out, there aren’t enough tools yet to
support a lot of the ideas and requirements: you have to roll your own,
which comes with its own costs and risks.
And finally, I have to
question the fundamental importance of immediate feedback to a company. I
can see that waiting a year, or even a month, for feedback can be too
long. I fully understand and agree that sometimes changes need to be
made quickly, that sometimes the windows of opportunity are small and we
need to be ready immediately. And there’s first mover advantage, of
course. But I have a hard time believing that any kind of change needs
to be made continuously, 50 times per day: that any change
that can be made that quickly will make any real difference to
customers or to the business. And I will go further and say that such
rapid changes are not in the interests of customers, that they don’t
need or even want this much change this fast. And that I don’t believe
that it’s really about reducing waste, or maximizing velocity or
increasing information value.
No, I suspect it is more about a
need for immediate satisfaction – for programmers, and the people who
drive them. Their desire to see what they’ve done get into production,
and to see it right away, to get that little rush. The simple inability
to delay gratification. And that’s not a good reason to adopt a model
Published at DZone with permission of Jim Bird, author and DZone MVB. (source)