The philosophy of Continuous Deployment

Over the past few years my team have been on a journey toward Continuous Deployment. This has involved changing our architecture, our attitude towards testing, and our tolerance of slow build times, but most fundamentally it has required a change in philosophy: a new attitude toward quality, responsibility and ownership.

In our previous model our mantra was “you release what you test” and with the right phase gates and environments in place it allowed new releases to flow through the hands of QA and into the responsibility of Production Operations. Despite having a huge body of automated tests, our attempts to refine this process to become faster always foundered:

  1. we could patch additions into the release branch but this delayed the release while QA assured themselves we were ‘ready’

  2. any late patches or bug fixes to the release branch had to be merged back into the Master branch and were sometimes forgotten, causing regressions on the next release

  3. the longer the delay to the branch release the more the product team tried to shoehorn more stuff into the release branch instead of Master and the vicious cycle continued

  4. meanwhile QA were always absorbed getting the branch release out and work on Master was delayed or quality suffered

A recent pivot allowed us to become far more aggressive toward enforcing a strict “no junk in the trunk (master)” policy and we found that when a team embraces Continuous Deployment, they start to embrace a philosophy of personal responsibility and develop the following behaviours:

  • ownership of services or code bases

  • collaboration with the operations team (devops)

  • making sure their code is well-tested and production ready

  • releasing code in small releasable chunks

  • thinking about backwards compatibility

  • collaborating and communicating with your team and the business users that your code changes will affect

  • accepting that you cannot live in isolation on your long-running branch without paying the price in conflicts

  • realising that the QA team are not your bug-finding monkeys

  • checking your production logs for errors

  • writing the correct level of automated tests to limit regressions without negatively impacting productivity

  • monitoring your systems’ vital statistics (in all environments)

  • deciding when to back-out and when to fix-forward (and having a plan for this)

  • having a nuanced understanding of risk, quality assurance, testing and release strategies, and recognising that some features and bugs can be released immediately while others need in-depth testing

  • understanding that you no longer need a “definition of done” because your work isn’t done until it’s successfully released into a stable production environment.

All this comes under the umbrella of personal responsibility, which comes from the knowledge that when a developer chooses to release a piece of code, that code will go to production and they will accept responsibility for it. Along with that responsibility come the pride and reward of knowing that one’s code is released – it’s live, in the wild, and changing users’ lives (hopefully for the better).

Recently we experimented with adding a traditional formal sign-off from business owners outside the product team to our release process, and the results have been interesting. We had to choose between applying a moratorium to Master while sign-off was pending, or creating a release branch and letting Master flow. Either way we were faced with similar issues:

  1. Developers are no longer able to release their own code so Master starts to differ significantly from what is in production.

  2. A “one size fits all” approach is used for release sign-off – whether it’s fixing a typo or refactoring your payment system you still need sign-off as developers are no longer trusted to weigh the risk themselves.

  3. The QA team and release manager become responsible for the release, not the individual who wrote the code. The theme of personal responsibility is broken: the developer has moved on to a new task.

  4. The release manager starts to manage what goes into the release: features and fixes are forced to languish on branches until the release manager is ready to merge them.

  5. As the amount of code ready to deploy grows, release managers get nervous and need more time to test on the release branch.

  6. More time is spent managing ‘the release’ than writing and delivering features.

  7. As stakeholders realise that releases take longer, they start to hold up the release for “one last fix” rather than wait another week. A full regression is then required by a nervous release manager and the vicious cycle continues.

  8. When you push out your release it has batches of changes in it. If production is affected, it is far more difficult to work out which change caused the problem.

The end result: your release cycle is slower, your team is less productive and less engaged, your QA team is overloaded, and your releases are potentially buggier and harder to fix.

Continuous Deployment is not just about releasing code fast – it’s about having a team that takes pride in their work and feels responsibility for the quality, stability and effectiveness of the live product.

Our focus now is on improving visibility, accountability and automation so we can better provide rich, descriptive release notifications to stakeholders and regain the capacity to release from within the team rather than have to ask permission from an approval body from outside the team.

How Agile would have saved project ORCA

Working in the web development industry, it’s easy to get the impression that everyone involved in software knows about Agile and is at least paying it lip service, even if their processes are not really Agile in practice. Clearly this is not the case. The Unmitigated Disaster Known As Project ORCA describes how Mitt Romney’s team attempted to win the race to get voters out to the polls using new technology but fell at every hurdle. It presents a textbook example of everything that can go wrong in software delivery, with (depending on how significant its impact was) potentially election-losing consequences for Mitt Romney and the GOP.

The following are some of the glaring errors made that would have been avoided using Agile/Lean techniques:

  • Putting a mission critical piece of software to its first real use on a day when failure was not an option.
  • Giving themselves no opportunity for feedback or pressure testing, and gaining no validated learnings as to whether what they had built would even work – conceptually, functionally or practically.
  • Using a top-down approach that ignored the huge amount of skill, knowledge and experience that could have come from the greater team (those on the ground who would have to use the product).
  • Convincing themselves they had a product so great that it was better to keep it under wraps and maintain the element of surprise than to allow the product to be pressure-tested by the people who were going to use it.

Agile software development recognises that you will not get it right in version one, that a delivery team needs to include its end users, that software improves iteratively and that not only your software but your business model itself must be able to pivot and adapt to the realities on the ground and the feedback you receive.

This is what Jason Fried of 37signals did when he gave his company a month to work on projects of their own choosing.

It sounds radical at first, but if you have dedicated, passionate people it makes a lot of sense. The staff involved in the day-to-day running and building of your business are the ones most likely to know what your business needs. Fried has given them an opportunity to demonstrate what’s needed and the solutions to those needs.

How can you afford to do this? How can you afford not to? As Fried argues: “We would never have had such a burst of creative energy had we stuck to business as usual.”

Jason Fried – Why I gave my company a month off

Wonderful short video on how to be a Product Owner – it should be required viewing for all POs and is similarly essential for anyone who’s trying to understand the principles of Agile. It captures the essence of Agile product development. Watch it once and, if you don’t understand it all, watch it again. Most importantly, if you are a Product Owner, note the parts about working closely with the team.

Validated Learnings: How Heroku Pivot and Adapt

My team at Westfield.com are always trying to educate the business sponsors to implement the simplest possible thing – to release the minimum viable product and generate validated learnings rather than implement gold-plated visions of what marketing folk think that people want.

Francis Is shows how successful Heroku were at doing this in Heroku’s Early History.

Heroku’s first offering bore little resemblance to what they finally became, but it allowed them to pivot and adapt: to drop the offerings that no one cared about (online code editors, no deployments) and concentrate on the things users did care about (GitHub integration, scalability). Early releases discovered what users really wanted, and mining that seam of ‘want’ resulted in exponential growth and a $212 million cash buy-out within 3 years.

User stories and asking “Five Whys”

We are currently coaching a new team in the ways of Agile, and one of the problems we’ve encountered is getting the devs to write cards or stories using the standard Agile story formats. To Agile newbies the syntactic sugar that surrounds a story’s details often seems like a waste of time. For example, a developer knows exactly what he means when he writes:

“Add currency code to Data Warehouse views”

and feels like he is being made to jump through hoops to turn that into:

“In order to differentiate between international sales, we need to update the data warehouse transactional views to show the currency code, so that they can report on this data.”

The reason we favour this (or the “As a [user]…” format) on agile projects is that the story describes both what needs to be done and why. Any member of the team can then understand exactly why a story has been added to the backlog without asking the person who wrote it or drilling down into the acceptance criteria.

The other benefit of this format is that it forces the person writing the story to find out exactly why the requester wants that story. Indeed, in drilling down to the real requirement, the analyst may discover that the real business requirement is not what the requester is asking for at all.
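As a rough illustration of that forcing function, here is a minimal Python sketch of a story card that cannot be created without its "why". The `Story` class, its field names and the "reporting users" role are all assumptions made up for this example; they are not part of Cucumber or of any tool we use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    """A backlog card in the "In order to ..." format (hypothetical helper)."""

    why: str   # the business value behind the request
    who: str   # the role that needs the change
    what: str  # the change itself

    def __post_init__(self) -> None:
        # Reject cards with a blank part, so the author has to go and
        # find out the reason before the card reaches the backlog.
        for name in ("why", "who", "what"):
            if not getattr(self, name).strip():
                raise ValueError(f"story is missing its '{name}'")

    def render(self) -> str:
        return f"In order to {self.why}, {self.who} need {self.what}."


# "reporting users" is an assumed role; the original card only says "they".
story = Story(
    why="differentiate between international sales",
    who="reporting users",
    what="the data warehouse transactional views to show the currency code",
)
print(story.render())
```

A card written as just “Add currency code to Data Warehouse views” has nothing to put in `why`, which is exactly the conversation the format is meant to trigger.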

Aslak Hellesoy (the creator of Cucumber) illustrates all this perfectly in the Cucumber documentation, where he describes the process of asking the Five Whys to discover the underlying requirements – the why in the story.

(Shamelessly copied directly from the Cucumber wiki for your convenience.)

[5:08pm] Luis_Byclosure: I’m having problems applying the “5 Why” rule, to the feature
“login” (imagine an application like youtube)
[5:08pm] Luis_Byclosure: how do you explain the business value of the feature “login”?
[5:09pm] Luis_Byclosure: In order to be recognized among other people, I want to login
in the application (?)
[5:09pm] Luis_Byclosure: why do I want to be recognized among other people?
[5:11pm] aslakhellesoy: Why do people have to log in?
[5:12pm] Luis_Byclosure: I dunno… why?
[5:12pm] aslakhellesoy: I’m asking you
[5:13pm] aslakhellesoy: Why have you decided login is needed?
[5:13pm] Luis_Byclosure: identify users
[5:14pm] aslakhellesoy: Why do you have to identify users?
[5:14pm] Luis_Byclosure: maybe because people like to know who is
publishing what
[5:15pm] aslakhellesoy: Why would anyone want to know who’s publishing what?
[5:17pm] Luis_Byclosure: because if people feel that that content belongs
to someone, then the content is trustworthy
[5:17pm] aslakhellesoy: Why does content have to appear trustworthy?
[5:20pm] Luis_Byclosure: Trustworthy makes people interested in the content and
consequently in the website
[5:20pm] Luis_Byclosure: Why do I want to get people interested in the website?
[5:20pm] aslakhellesoy: 🙂
[5:21pm] aslakhellesoy: Are you selling something there? Or is it just for fun?
[5:21pm] Luis_Byclosure: Because more traffic means more money in ads
[5:21pm] aslakhellesoy: There you go!
[5:22pm] Luis_Byclosure: Why do I want to get more money in ads? Because I want to increase
de revenues.
[5:22pm] Luis_Byclosure: And this is the end, right?
[5:23pm] aslakhellesoy: In order to drive more people to the website and earn more admoney,
authors should have to login,
so that the content can be displayed with the author and appear
more trustworthy.
[5:23pm] aslakhellesoy: Does that make any sense?
[5:25pm] Luis_Byclosure: Yes, I think so
[5:26pm] aslakhellesoy: It’s easier when you have someone clueless (like me) to ask the
stupid why questions
[5:26pm] aslakhellesoy: Now I know why you want login
[5:26pm] Luis_Byclosure: but it is difficult to find the reason for everything
[5:26pm] aslakhellesoy: And if I was the customer I am in better shape to prioritise this
feature among others
[5:29pm] Luis_Byclosure: true!

https://github.com/cucumber/cucumber/wiki/

Valve Handbook

The employee handbook describing Valve’s completely flat management structure. Not exactly a methodology that you could apply to a large shareholder-owned corporation, but a great example of how successful you can be if you hire good people and allow them to get on with their jobs.