Should you automate your kill switches? (And 3 reasons when you shouldn’t)

Today’s products are more complex and move faster than ever. As customers’ expectations keep rising, companies deploy code several times a week, so it is important to be ready for any situation, including worst-case scenarios.

There’s always a risk involved when releasing new features or changes to production. Kill switches help teams manage this risk by making it possible to turn off a feature at the push of a button, or as we’ll explore in this blog post, automatically.

What is a Kill Switch?

A kill switch lets software teams turn off a feature in production with the push of a button, without a code deployment. For example, if you are running a news site, you can design a kill switch to hide a published article if users report it as inaccurate. This lets your business meet users’ demands and expectations without losing a night’s sleep.

One caveat is that the kill switch must be designed and implemented carefully by developers in order to be useful. This means identifying a backup path in your code to run if the kill switch is activated.
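
As a rough illustration of such a backup path, here is a minimal Python sketch. The in-memory FLAGS dictionary, the "related-articles" flag name, and the helper functions are all hypothetical stand-ins invented for this example. They are not the Unlaunch SDK, and a real system would read the flag from a feature flag service so it can change without a deployment.

```python
# Hypothetical in-memory flag store, used only for illustration.
FLAGS = {"related-articles": True}


def is_enabled(flag_name: str) -> bool:
    """Return the flag state, treating unknown flags as off."""
    return FLAGS.get(flag_name, False)


def fetch_related_articles(article_id: int) -> list[str]:
    """Stand-in for the new feature guarded by the kill switch."""
    return [f"related-to-{article_id}-a", f"related-to-{article_id}-b"]


def related_articles(article_id: int) -> list[str]:
    """Run the feature when the flag is on, otherwise take the backup path."""
    if is_enabled("related-articles"):
        return fetch_related_articles(article_id)
    # Backup path: the page still renders, just without the new section.
    return []
```

Setting FLAGS["related-articles"] to False makes related_articles return an empty list, so the rest of the page keeps working while the feature is off.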

A kill switch is equally useful for disabling a feature when a bug threatens the application’s performance, security, or integrity. In short, a kill switch is a quick way to back out of a broken feature before it degrades the quality of service for your users.

How to Automate a Kill Switch

Traditionally, kill switches are activated manually by someone, usually a developer. Modern feature flag tools, including Unlaunch, can configure a kill switch to activate automatically when a metric crosses a threshold. To achieve this, developers define the kill switch as a feature flag and assign metrics to it. The kill switch then triggers automatically and disables the feature flag when a metric exceeds its threshold or goes out of range.

Suppose you added a new external API call to your website. You put the API call behind a feature flag. You can measure the page load time and define it as a metric. If the load time exceeds a threshold, the feature flag can automatically disable the external API call. Once it is disabled, you can show an error message to your users or switch to an alternate service. This is similar to the circuit breaker pattern, but you also get the benefits of feature flags, including the ability to change anything manually from the Dashboard.
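
To make the idea concrete, here is a minimal Python sketch of a flag that disables itself when the average page load time gets too high. The AutoKillSwitch class, its parameters, and the rolling-average rule are assumptions made for this example. A feature flag tool would evaluate the metric on its own side, and this is not how Unlaunch implements it.

```python
from collections import deque
from statistics import mean


class AutoKillSwitch:
    """Hypothetical flag that turns itself off when a metric crosses a threshold."""

    def __init__(self, threshold_ms: float, window: int = 5) -> None:
        self.enabled = True
        self.threshold_ms = threshold_ms
        # Keep only the most recent `window` load-time samples.
        self.samples = deque(maxlen=window)

    def record_load_time(self, millis: float) -> None:
        """Record one page load time and trip the switch if the average is too high."""
        self.samples.append(millis)
        window_full = len(self.samples) == self.samples.maxlen
        if self.enabled and window_full and mean(self.samples) > self.threshold_ms:
            # Stays off until someone re-enables it manually.
            self.enabled = False


def render_widget(switch: AutoKillSwitch) -> str:
    """Use the external API while the flag is on, otherwise the backup path."""
    if switch.enabled:
        return "widget from external API"
    return "widget temporarily unavailable"
```

Averaging over a window instead of reacting to a single slow request is a deliberate choice in this sketch: one outlier should not switch the feature off. The switch also never turns itself back on, which leaves the decision to re-enable the feature to a person.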

Pros and Cons of Automatic Kill Switches

Automatic kill switches are not a silver bullet. They come with their own pros and cons.

The most obvious advantage is that a feature can be reverted automatically if it fails to meet your expectations or your users’ expectations once it goes live. No developer has to wake up in the middle of the night to activate the kill switch manually. The time your users are exposed to an error is also minimized.

3 Reasons Why You Shouldn’t Automate Your Kill Switch

A kill switch is an excellent tool for controlling the behavior of the application. However, you need to choose your metrics carefully and be sure they actually correlate with the feature. In other words, if you trigger a kill switch based on a certain metric, that metric must be relevant to the feature. For example, if you use the number of exceptions over a period as an indicator, you may turn off the feature when in fact another part of the system is causing those exceptions.
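
One way to check relevance, assuming the feature is rolled out to only part of your traffic, is to compare the exception rate of requests that had the flag on with the rate of requests that had it off. The Python sketch below illustrates that comparison. The function names and the min_ratio default of 2.0 are hypothetical choices for this example, not a recommended setting.

```python
def exception_rate(errors: int, requests: int) -> float:
    """Fraction of requests that raised an exception (0.0 when there was no traffic)."""
    return errors / requests if requests else 0.0


def feature_is_likely_cause(
    on_errors: int,
    on_requests: int,
    off_errors: int,
    off_requests: int,
    min_ratio: float = 2.0,
) -> bool:
    """Return True only if flag-on traffic fails clearly more often than flag-off traffic."""
    on_rate = exception_rate(on_errors, on_requests)
    off_rate = exception_rate(off_errors, off_requests)
    # If both groups fail at a similar rate, something else is probably broken.
    return on_rate > 0.0 and on_rate >= min_ratio * off_rate
```

If both groups show a similar exception rate, turning off the feature is unlikely to help, and an automatic trigger based on the raw exception count would fire for the wrong reason.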

You shouldn’t automate kill switches on features that have been in production for a long time and are hardened. The backup path may no longer work properly. It’s best to automate kill switches on new features only. It is fine to keep long-lived kill switches for graceful degradation or cutover, but it’s best to stop automating them after a certain period.
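
One possible way to enforce such a period is to give the automation an expiry date, after which the kill switch can only be flipped manually. The Python sketch below shows the idea. The ExpiringAutomation class and the 21-day window in the usage lines are hypothetical choices for this example.

```python
from datetime import datetime, timedelta, timezone


class ExpiringAutomation:
    """Hypothetical guard that allows automatic triggering only for a limited time."""

    def __init__(self, released_at: datetime, automate_for: timedelta) -> None:
        self.expires_at = released_at + automate_for

    def may_auto_trigger(self, now: datetime) -> bool:
        """True while the feature is still new enough to be switched off automatically."""
        return now < self.expires_at


# Usage: automation is allowed for 21 days after an example release date.
released = datetime(2024, 1, 1, tzinfo=timezone.utc)
guard = ExpiringAutomation(released, timedelta(days=21))
still_automated = guard.may_auto_trigger(datetime(2024, 1, 10, tzinfo=timezone.utc))
```

After the expiry date, may_auto_trigger returns False, and the automatic trigger should be skipped while the manual kill switch stays available.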

Kill switches can also cause security problems if they are forgotten over time. A malicious user or attacker could activate a forgotten kill switch without your knowledge.

Summary

Kill switches are the way forward because they let you mitigate risk. Done right, they help you ensure a high quality of service. An automatic kill switch disables a feature if a metric or KPI exceeds a certain threshold. An automated kill switch should be used for a short time, usually a few weeks. After that, it should be removed or left as a non-automated kill switch that can only be activated or deactivated manually.

Author: This blog was written by Nitish Singh, who is a content creator for Unlaunch.
