Canary Testing - Using Blue-Green Deployments and Feature Flags

What is Canary Testing?

Canary testing is a technique for reducing the risk of releasing new features to production. Its basic premise is that new features or changes should be rolled out gradually, initially to a small subset of users, before opening them up more broadly. This is in contrast to big-bang releases, where everything is released to all users at once (hint: it is dangerous!).

Because the change is initially exposed to only a small number of users, any bugs or other issues that are discovered affect only that small subset. This allows software teams to validate their changes before rolling them out to all users.
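One way to picture the validation step is to compare the error rate seen by the canary group with the rate seen by everyone else. The sketch below is only an illustration: the GroupStats class, the canary_is_healthy function, and every threshold value are assumptions made up for this example, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Request and error counts observed for one group of users."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Avoid dividing by zero when a group has received no traffic yet.
        return self.errors / self.requests if self.requests else 0.0


def canary_is_healthy(canary: GroupStats, baseline: GroupStats,
                      max_ratio: float = 2.0,
                      min_allowed_rate: float = 0.001,
                      min_requests: int = 500) -> bool:
    """Return True when the canary's error rate looks acceptable.

    All thresholds are illustrative assumptions, not recommendations.
    """
    if canary.requests < min_requests:
        return False  # not enough traffic to judge yet
    # Allow up to max_ratio times the baseline rate, with a small floor so
    # that a baseline with zero errors does not demand a perfect canary.
    allowed = max(baseline.error_rate * max_ratio, min_allowed_rate)
    return canary.error_rate <= allowed


baseline = GroupStats(requests=98_000, errors=98)  # 0.1% of requests fail
canary = GroupStats(requests=2_000, errors=30)     # 1.5% of requests fail
print(canary_is_healthy(canary, baseline))  # False
```

In this made-up scenario the new version fails far more often than the old one, but only the 2,000 canary requests were exposed to it, so the team can stop the rollout before the other 98,000 are affected.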

Canary testing is a popular best practice in the DevOps community and is very common in large companies.

Origins of Canary Testing

The origins of the term may sound a little harsh. You may have heard the phrase “canary in a coal mine”. In this old practice, miners carried caged canaries into mines to detect odorless yet deadly gases that might be present.

The birds have a lower tolerance for toxic gases than humans, so they react to them faster. If the bird died, it meant that toxic gases were present and it was time for the miners to get out. The canaries’ low tolerance made it possible to test the mine’s condition.

In software, canary testing to detect errors when releasing new changes need not be frightful. Rather, it’s a very useful technique for reducing risk and shielding the majority of end users from bugs and other issues that may degrade the user experience.

How to do Canary Testing?

Canary testing is all about serving a new version of the software to a small subset of users before it is launched to a wider audience. This limited exposure gives the development team the room it needs to fix issues before they reach a larger audience.

There are two ways to do canary testing.

1. Blue-Green Deployments

The first way to do canary testing is by using blue-green deployments. With this technique, you divide the traffic at the server level so that changes or new features are served gradually to your users. Only a few users get access to the newer version, which provides the isolation needed to test new features.

To do this, first create a new production environment. This is called the “green” environment, while the existing production environment is called the “blue” environment. Then deploy your new application version to the “green” environment.

Start by routing a small percentage of users, such as 2%, to the “green” environment. The users are selected randomly. As you gain more confidence, send more users to the “green” environment. Once 100% of users are going to the “green” environment, you can remove the “blue” environment, or keep it around so that it can act as the “green” environment the next time you release features.
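The random traffic split described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the BlueGreenRouter class and its method names are hypothetical, and in a real setup the split usually lives in a load balancer or proxy rather than in application code.

```python
import random
from typing import Optional


class BlueGreenRouter:
    """Sends a configurable percentage of requests to the green environment.

    Hypothetical helper for illustration only.
    """

    def __init__(self, green_percent: float = 0.0,
                 rng: Optional[random.Random] = None) -> None:
        self.rng = rng or random.Random()
        self.green_percent = 0.0
        self.set_green_percent(green_percent)

    def set_green_percent(self, percent: float) -> None:
        # Raise this value step by step as confidence in green grows.
        if not 0.0 <= percent <= 100.0:
            raise ValueError("percent must be between 0 and 100")
        self.green_percent = percent

    def route(self) -> str:
        # random() returns a float in [0.0, 1.0), so 0% never picks green
        # and 100% always does.
        if self.rng.random() * 100.0 < self.green_percent:
            return "green"
        return "blue"


router = BlueGreenRouter(green_percent=2.0, rng=random.Random(42))
sample = [router.route() for _ in range(100_000)]
print(f"green share: {sample.count('green') / len(sample):.1%}")

# Later, once the canary looks healthy, widen the rollout.
router.set_green_percent(25.0)
```

Note that this sketch picks an environment per request, so the same user could land on blue for one request and green for the next. Pinning each user to one environment is a common refinement that is left out here for brevity.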

2. Using Feature Flags to Gradually Roll Out Changes

Feature flags provide an alternative way to do canary releases, one that is more powerful and sophisticated than the blue-green deployments discussed above.

Canary releases using feature flags offer many advantages over blue-green deployments:

With feature flags, you can launch features separately and control their rollout status independently. With blue-green deployments, if you are routing 10% of the traffic to the “green” environment, every feature in that environment gets 10% of the traffic. With feature flags, you can launch feature A at 5% and feature B at 2%.
Feature flags allow you to choose users based on their attributes or other properties. For example, you can launch a new feature only to users who joined in the last 2 months.
Feature flags allow developers to launch new features without the overhead of creating and managing a second production environment.
Blue-green deployments are typically controlled by DevOps or Operations teams, whereas feature-flag-based releases are controlled directly by developers or feature implementers.
Developers can also increase the percentage of users who get a feature from a dashboard, without impacting other features.
Feature flags make it easy to see metrics and KPIs, such as latency, at the feature level. You can do this with blue-green deployments, but it requires additional work.
Managing a separate cluster (“green”) is an overhead, and not all companies have the budget and resources for it.
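To make the first two points concrete, here is a minimal sketch of how a percentage rollout with attribute targeting might work. It is not how Unlaunch or any specific product implements flags: the Flag class, the bucket and joined_recently functions, the flag names, and the "joined" attribute are all assumptions for illustration. Hashing the flag name together with the user ID is one common way to give each user a stable yes/no answer per feature.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional


def bucket(flag_name: str, user_id: str) -> float:
    """Map a (flag, user) pair to a stable number in [0, 100)."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") % 10_000) / 100.0


@dataclass
class Flag:
    """Hypothetical feature flag with its own rollout percentage."""
    name: str
    rollout_percent: float
    # Optional attribute rule; users who fail it never get the feature.
    rule: Optional[Callable[[dict], bool]] = None

    def is_enabled(self, user_id: str, attributes: dict) -> bool:
        if self.rule is not None and not self.rule(attributes):
            return False
        # The same user always lands in the same bucket for a given flag,
        # so raising rollout_percent only ever adds users.
        return bucket(self.name, user_id) < self.rollout_percent


def joined_recently(attributes: dict, days: int = 60) -> bool:
    """Example rule: the user joined within roughly the last 2 months."""
    return attributes["joined"] >= date.today() - timedelta(days=days)


# Each feature has an independent rollout percentage.
feature_a = Flag("feature-a", rollout_percent=5.0)
feature_b = Flag("feature-b", rollout_percent=2.0)
new_users_only = Flag("new-onboarding", rollout_percent=100.0, rule=joined_recently)

users = [f"user-{i}" for i in range(100_000)]
attrs = {"joined": date.today()}
share_a = sum(feature_a.is_enabled(u, attrs) for u in users) / len(users)
share_b = sum(feature_b.is_enabled(u, attrs) for u in users) / len(users)
print(f"feature A: {share_a:.1%}, feature B: {share_b:.1%}")
```

Because the flag name is part of the hash, the users who see feature A are chosen independently of those who see feature B, which is what lets the two features roll out at different rates in the same environment.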

Canary Testing using Unlaunch

Software developers and teams use Unlaunch for canary releases using feature flags. Unlaunch makes it easy to do canary releases and allows developers to target users by their attributes, roll out gradually, or both.

Roll features out easily and safely using the Unlaunch dashboard. View metrics and KPIs to see how your feature is performing and collaborate with your team.

You can follow in the footsteps of top software companies like Facebook and Google and use feature flags to roll out new features to a small percentage of users before rolling them out more broadly. Sign up for a free account and start using Unlaunch feature flags today.

Author: This blog was written by Umer Mansoor and Nitish Surana. Nitish is a content creator for Unlaunch.