Continuous release process

Fabian Wesner, CTO Spryker
29 June 2016

Over the last few years I have seen several different approaches to managing Git branches and deployments. Most people tried Gitflow and found that it did not fit their needs.

In this article I want to describe a continuous release process that worked great for us in the past. The idea is not new or very innovative, but it is simple to operate and very powerful. Conceptually, the model is close to the continuous delivery process, and its main goal is to enable a high-output development process.

Branching model

Every feature gets its own feature branch, e.g. feature/101-cms-improvements. You can use any naming convention for the branches, but I highly recommend being consistent. It often helps to have the number of the related ticket in the branch name. All feature branches are created from the master branch. Now you get something like this:

master
feature/100-new-discount-engine
feature/101-cms-improvements
feature/102-checkout-step-engine
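
A consistent naming convention is easy to check automatically, for example in a CI job or a Git hook. The following is only a minimal sketch: the exact pattern (ticket number followed by a lowercase, hyphen-separated description) is an assumption derived from the example names in this post, and the function name is made up for illustration.

```python
import re

# Assumed convention (not prescribed by the article):
#   feature/<ticket number>-<lowercase-words-separated-by-hyphens>
#   hotfix/<lowercase-words-separated-by-hyphens>
BRANCH_PATTERN = re.compile(
    r"(?:feature/\d+-[a-z0-9]+(?:-[a-z0-9]+)*|hotfix/[a-z0-9]+(?:-[a-z0-9]+)*)"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True for "master" or for names that follow the assumed convention."""
    return name == "master" or BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for branch in ["master", "feature/101-cms-improvements", "cms-improvements"]:
        print(branch, is_valid_branch_name(branch))
```

Adjust the pattern to whatever your team agrees on; the point is only that the convention is enforced by a tool and not by memory.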

Sync often, sync early

To avoid a merge mess, all developers should (or rather: must) regularly merge or, better, rebase all changes from the master into their branch. Sometimes I see developers who write their code in total isolation. They commit to a branch but do not synchronize it with the master. This often results in a painful time when tons of merge conflicts need to be resolved (we call this The Napalm Merge). I highly recommend defining team conventions that require everybody to keep their branches in sync with the master.
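
A team convention like this can be backed by a small check that reports how far a branch has fallen behind the master. The sketch below relies on git rev-list --count, which counts the commits reachable from one ref but not from another. The function names and the max_behind threshold are assumptions for illustration; the git callable is injectable so that the logic can be tried without a repository.

```python
import subprocess
from typing import Callable, Sequence


def run_git(args: Sequence[str]) -> str:
    """Run a git command in the current repository and return its stdout."""
    return subprocess.check_output(["git", *args], text=True)


def commits_behind(
    branch: str,
    base: str = "master",
    git: Callable[[Sequence[str]], str] = run_git,
) -> int:
    """Count the commits that are on `base` but not yet in `branch`."""
    output = git(["rev-list", "--count", f"{branch}..{base}"])
    return int(output.strip())


def needs_sync(
    branch: str,
    base: str = "master",
    max_behind: int = 0,
    git: Callable[[Sequence[str]], str] = run_git,
) -> bool:
    """Return True if `branch` is more than `max_behind` commits behind `base`."""
    return commits_behind(branch, base, git) > max_behind
```

When needs_sync reports True, the developer merges or rebases the master into the branch before continuing.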

QA and integration tests

The feature branches are used for development, code reviews and testing. You can deploy a feature branch to a QA environment and let the testers do their job. When the feature is finally done, the developer opens a pull request and lets others review the change set. You can do this before the QA, in parallel or afterwards.

When the PR has got some thumbs up from other developers, the feature is ready for release and you can start the final integration tests. This is a critical phase. You must make sure that the feature branch contains the complete feature and the current state of the master branch. During the integration test, the master must not be changed anymore!

The feature is already tested and reviewed, so in a perfect world all you need to do is wait for your continuous integration system to run all automated tests before you start the deployment. This should not take more than a few moments. The purpose of this final test is to make sure that there are no side effects between the features that are developed in parallel. In reality, a lot of projects also need to deploy the feature to a special staging environment to perform compatibility checks with other systems that do not exist in the other testing environments (e.g. ERP or PIM).

Release it!

As soon as this has succeeded, you perform the following steps:

  •   Close the PR, merge the feature branch to the master and delete it. 
      I highly recommend using GitHub’s workflow rather than messing around with manual merges or rebases.
  •   Create a release tag and deploy it to production.

In the beginning you can do this manually, but for bigger projects it is a good idea to invest in automation.
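
As a starting point for such automation, the tagging step could look roughly like the sketch below. It assumes that the merge has already happened through the pull request, that the remote is called origin, and that the tag name is supplied by the caller; the function names are made up for illustration. The actual deployment command depends on your infrastructure and is left out on purpose.

```python
import subprocess
from typing import List


def release_commands(tag: str, remote: str = "origin", branch: str = "master") -> List[List[str]]:
    """Return the git commands that tag the current remote master as a release.

    The merge itself is assumed to have been done through the pull request.
    """
    return [
        ["git", "fetch", remote],
        ["git", "tag", "-a", "-m", f"Release {tag}", tag, f"{remote}/{branch}"],
        ["git", "push", remote, tag],
    ]


def run_release(tag: str, dry_run: bool = True) -> None:
    """Print the commands (dry run, the default) or execute them one by one."""
    for command in release_commands(tag):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the release at the first failing command.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_release("tags/2016-0001")
```

The dry run is the default so that nobody tags a release by accident while trying out the script.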

Hotfixes

Hotfixes are fixes for critical bugs that appear in the production environment. They are always urgent and must be solved right away. In the continuous release branching model, a hotfix branch is no different from a feature branch. You create a branch like “hotfix/no-prices-are-shown”, fix the problem, merge it to master and deploy it. The rest of the workflow will of course be different: usually there is no code review or intensive QA session.

Clean master

It is very important to make sure that nobody commits directly into the master branch. Only fully tested branches are merged into the master. It is also recommended to protect the master.

Tags and rollbacks

Each tag represents a single feature or a hotfix that was deployed to production. When a feature fails in production, you can safely roll back to the previous tag. You just need to be very careful with irreversible changes, for instance when a deployment triggers a huge change of the database schema. In this special case a rollback is not possible, so you had better create a backup beforehand.

Btw: It does not really matter how you name your tags. You can just count them up like this: tags/2016-0001. The only important thing is that you can quickly find the previous deployment in case of a rollback.
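
A counting scheme like this is simple to automate. The sketch below assumes tag names of the form tags/YYYY-NNNN with a plain hyphen and a four-digit, zero-padded counter; both helper names are made up for illustration. Because the names have a fixed width, plain string sorting yields the deployment order.

```python
import re
from typing import Iterable, List, Optional

# Assumed format: tags/<4-digit year>-<4-digit counter>, e.g. tags/2016-0001
TAG_PATTERN = re.compile(r"tags/(\d{4})-(\d{4})")


def release_tags(existing: Iterable[str]) -> List[str]:
    """Keep only tags in the assumed format, sorted in deployment order."""
    return sorted(tag for tag in existing if TAG_PATTERN.fullmatch(tag))


def next_tag(existing: Iterable[str], year: int) -> str:
    """Return the next free tag name for the given year."""
    counters = [0]
    for tag in release_tags(existing):
        match = TAG_PATTERN.fullmatch(tag)
        if int(match.group(1)) == year:
            counters.append(int(match.group(2)))
    return f"tags/{year}-{max(counters) + 1:04d}"


def previous_tag(existing: Iterable[str], current: str) -> Optional[str]:
    """Return the release deployed right before `current`, or None if there is none."""
    older = [tag for tag in release_tags(existing) if tag < current]
    return older[-1] if older else None
```

In case of a rollback, previous_tag gives you the tag to deploy again.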

Summary

The following picture summarizes the process for a single feature branch. Each circle represents a commit. The dashed circles represent commits from developers while the others are merges.

Blogpost_continuous-release-process.png 

Now let's see what happens when there are several branches involved. The changes in the master are actually the result of other released features. Every commit into the master branch represents a release. These commits need to be synchronized into the other active branches.

blogpost_continuous-release-process2.png

Why is the continuous release process a good idea?

When you read about the branching model, you may ask yourself why this should be a good idea. The most important aspect is what you do not do! You do not put several features into a develop branch as Gitflow proposes. The concept of the develop branch is derived from the traditional Scrum process. In Scrum you implement deployable increments in iterations. Your team develops one or more features per sprint and provides them in one big release. In this scenario it makes a lot of sense to collect several features in the develop branch, test everything in a dedicated release branch and release it together.

From my experience this process works well for some teams, but it has flaws that reduce productivity. Let’s imagine the team implemented three features: A, B and C. Feature A is done after a few days and merged into the develop branch. B and C are also done during the sprint and merged into the develop branch. Now someone creates a release branch and asks the QA team to test it. Let’s say features A and B are fine, but C has errors. As a result, A and B have to wait until C is fixed. Even worse, no other feature can pass by, because the develop branch is dirty now. Instead of pushing features to production early and earning money, you keep them in branches…

The branching model proposed in this post does not have this problem. Each feature gets its own phase of development, reviewing, testing and final integration tests. There is no annoying waiting time between the steps. Features do not block each other and are deployed as soon as they are done.

The total risk of a single deployment is much lower now, because the number of changes is smaller and every bug can be assigned to a specific release.
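
The difference can be put into numbers with a tiny model. The sketch below is a deliberate simplification: the day on which each feature is done is a made-up input, a batch release ships everything on the day the last feature is done, and a continuous release ships each feature on its own day. The function names are hypothetical.

```python
from typing import Dict


def batch_release_days(done_day: Dict[str, int]) -> Dict[str, int]:
    """All features ship together, on the day the last one is done."""
    release_day = max(done_day.values())
    return {feature: release_day for feature in done_day}


def continuous_release_days(done_day: Dict[str, int]) -> Dict[str, int]:
    """Each feature ships on the day it is done."""
    return dict(done_day)


def total_waiting_days(done_day: Dict[str, int], release_day: Dict[str, int]) -> int:
    """Sum of the days that finished features spend waiting for their release."""
    return sum(release_day[feature] - done_day[feature] for feature in done_day)


if __name__ == "__main__":
    # Hypothetical sprint: A is done on day 3, B on day 8, C (after fixes) on day 14.
    done = {"A": 3, "B": 8, "C": 14}
    print(total_waiting_days(done, batch_release_days(done)))
    print(total_waiting_days(done, continuous_release_days(done)))
```

With these example numbers the batch release accumulates 17 waiting days (11 for A and 6 for B), while the continuous release accumulates none.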

Team deployments

Most companies that I know have a small group of people who are allowed to execute deployments to the production environment. I want to propose an alternative to this.

Let’s say there are 30 developers, divided into six feature teams. Each team takes care of one or more features and implements them through all layers of the application. This way you have several development streams, and all features need to be deployed. When you perform the described continuous release process and don’t prepare a good deployment workflow, you’ll quickly end up with a release waiting queue. Meanwhile developers start new features and no longer care about the old stuff they did yesterday.

For this reason it is very important to scale your deployment workflow. The technical process of a deployment should be fast. It does not really matter if you have a “button” to deploy a branch or someone needs to ssh to the server and run a script, as long as the whole process is quick.

I have seen two approaches to scaling the deployment workflow. The traditional way is to have a dedicated DevOps team. Whenever there is a release waiting queue, this team needs to be extended. This is not a fun job, and there is an alternative.

The modern way is to let your developers deploy their stuff by themselves. This approach has some appealing advantages. First of all, you guarantee that the responsible developer is available during and after the deployment, so she/he can immediately provide a hotfix if needed. Second, there is a high incentive not to break anything. I have seen people becoming very nervous when they are responsible for the deployment. You can be sure they double-check that everything works well before they push it to production. And third, this way you can shorten the QA phase. Why? Because the developer cannot simply throw the feature over the wall and forget about it. She/he needs to get it back fast, otherwise there may be merge conflicts because other features are deployed earlier. So she/he intuitively supports the testers to get it done as soon as possible. I have seen people discussing the order of releases because nobody wants to be the last one who needs to merge the changes.

In case you want to try this approach, you should define some guidelines with the team. I want to give you three recommendations:

(1) When you perform continuous releases, it is absolutely mandatory to have comprehensive application monitoring in place. My recommendation is to use New Relic. We always defined our main KPIs (key performance indicators) and showed them on a dashboard. In e-commerce these are the execution time and error rate of all applications and the number of requests and orders (or carts) per minute. We automatically triggered alerts when the website was down for more than 30 seconds. To make the whole team aware of this, we usually showed the graphs on a big monitor in the office.

(2) The deploying developer needs to stay for two hours after the deployment. She/he is “done” when there are no new exceptions, the execution time is unaffected (or better) and the other indicators are healthy (e.g. the number of orders per minute is the same). And of course, don’t deploy on Friday afternoon.
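
The "done" criteria from recommendation (2) can be written down as a small check that compares the indicators before and after the deployment. This is only a sketch: the Metrics fields, the 10% tolerance and the idea of approximating "no new exceptions" by the exception rate are assumptions for illustration, and the values would have to come from your monitoring system.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    exceptions_per_minute: float
    avg_execution_ms: float
    orders_per_minute: float


def deployment_is_healthy(before: Metrics, after: Metrics, tolerance: float = 0.1) -> bool:
    """Compare the indicators after a deployment with the baseline before it.

    `tolerance` is the relative deviation that still counts as "the same"
    (an assumed default of 10%).
    """
    # Approximation of "no new exceptions": the exception rate did not go up.
    no_new_exceptions = after.exceptions_per_minute <= before.exceptions_per_minute
    # Execution time is unaffected or better, within the tolerance.
    execution_ok = after.avg_execution_ms <= before.avg_execution_ms * (1 + tolerance)
    # The number of orders per minute did not drop noticeably.
    orders_ok = after.orders_per_minute >= before.orders_per_minute * (1 - tolerance)
    return no_new_exceptions and execution_ok and orders_ok
```

A check like this does not replace the developer watching the dashboard, but it gives the team a shared definition of "healthy".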

(3) You need to define a gatekeeper to guarantee that there is only one deployment at a time and that there is enough time between two releases. We did this with a hat in the office: a developer had to take the hat to be allowed to release. Obviously this does not work with a remote team, but I am confident you’ll find something similar.
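
For a remote team, the hat can be replaced by a small piece of software. The sketch below models the two rules from recommendation (3): only one holder at a time, and a minimum gap between two releases. The class name, the method names and the in-memory state are assumptions for illustration; a real team would need to keep the state somewhere shared, for example in a chat bot or a database.

```python
from typing import Optional


class DeployHat:
    """A virtual hat: whoever holds it is the only one allowed to deploy."""

    def __init__(self, min_gap_seconds: float) -> None:
        self.min_gap_seconds = min_gap_seconds
        self.holder: Optional[str] = None
        self.last_release_at: Optional[float] = None

    def take(self, developer: str, now: float) -> bool:
        """Try to take the hat at time `now` (in seconds). Return True on success."""
        if self.holder is not None:
            return False  # somebody else is deploying right now
        if (
            self.last_release_at is not None
            and now - self.last_release_at < self.min_gap_seconds
        ):
            return False  # the previous release is too recent
        self.holder = developer
        return True

    def put_back(self, developer: str, now: float) -> None:
        """Return the hat after the release; only the current holder may do so."""
        if self.holder != developer:
            raise ValueError(f"{developer} does not hold the hat")
        self.holder = None
        self.last_release_at = now
```

The time is passed in explicitly so that the rules are easy to test; in practice you would pass time.time().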

What about software vendors?

The described process works very well with websites where you deploy changes to a production server. I recommend it to all our clients who run transactional business models (most of them are shops).

At Spryker we play a different game. We are a software vendor, so we regularly provide software releases for clients. Conceptually there is nothing like a deployment to production. In the past we tried Gitflow because we thought it made sense here and all traditional software vendors use it. But we proved ourselves wrong. Neither we nor our clients benefit from combining several features into one big release.

Ten years ago, there was no way to deliver continuous updates to customers, because you needed to ship them on CD. But today there are other possibilities. Some months ago we switched to the continuous release process described above and extended it with several techniques like subtree split and semantic versioning. As a result our clients receive a stream of small, well-defined releases instead of a big change set every few weeks. We call this process “Atomic Releases” and will talk about it in another blog post.
