Recently, I've had a number of conversations about the benefits of, and differences between, feature flags, deployment automation, and config files. Several folks have asked me, “Why do I need feature flags if I can just have a deployment pipeline that manages percentage rollouts?” There are two parts to my answer. Part one is, “Not all rollouts start with a canary and are intended to reach 100%.” I feel like this point is covered well here and here. The second part of my answer has to do with granularity.
Define your scope
First, it is important to make certain we are all on the same page.
Feature flags are control points in your code that allow two or more variations of a code path to be deployed and running in production simultaneously. Users are presented with only one variation (or code path) at a time, but each variation can be changed without deploying new code.
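To make the idea of a control point concrete, here is a minimal sketch in Python. The `FlagStore` class, the `new-checkout` flag key, and the user ids are all hypothetical names made up for illustration; a real flag system would keep its rules in an external service rather than in memory.

```python
class FlagStore:
    """Hypothetical in-memory flag store (an assumption for illustration)."""

    def __init__(self):
        self._rules = {}

    def set_rule(self, flag_key, default, overrides=None):
        # overrides maps a user id to the variation that user should see
        self._rules[flag_key] = (default, dict(overrides or {}))

    def variation(self, flag_key, user_id, fallback):
        # Unknown flags return the caller's fallback variation.
        if flag_key not in self._rules:
            return fallback
        default, overrides = self._rules[flag_key]
        return overrides.get(user_id, default)


flags = FlagStore()


def render_checkout(user_id):
    # Control point: both code paths are deployed; the flag picks one per user.
    if flags.variation("new-checkout", user_id, fallback="legacy") == "new":
        return "new checkout page"
    return "legacy checkout page"


# Everyone gets the legacy path except one beta tester.
flags.set_rule("new-checkout", default="legacy", overrides={"beta-user": "new"})
print(render_checkout("beta-user"))
print(render_checkout("someone-else"))

# Change the rule at runtime; no new code is deployed.
flags.set_rule("new-checkout", default="new")
print(render_checkout("someone-else"))
```

Both variations ship in the same deployment. Only the rule changes, and each user sees exactly one path at a time.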
Deployment automation is a term used by vendors like JFrog, Armory, and Harness to convey the ability to build repeatable pipelines for moving a deployment package (typically a service or application) from raw code to executable code that is running on an endpoint (server or client).
Config files are non-compiled flat files for storing variables that can be referenced by an application or service. The advantage of using a config file is that an operator can change the values in the file and re-initialize the application to pick up those changes. Config files can also be stored in a source control repository (like git) so that they can be version controlled.
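The change-then-re-initialize cycle can be sketched as follows. This is a simplified illustration, not how any particular service works: the `Service` class, the `service.json` file name, and the `page_size` setting are assumptions, and the file is written to a temporary directory so the example is self-contained.

```python
import json
import os
import tempfile


class Service:
    """Hypothetical service that reads its config once, at initialization."""

    def __init__(self, config_path):
        self.config_path = config_path
        self.settings = {}
        self.initialize()

    def initialize(self):
        # Values are read only here; edits to the file are invisible until re-init.
        with open(self.config_path) as f:
            self.settings = json.load(f)

    def page_size(self):
        return self.settings["page_size"]


def write_config(path, values):
    with open(path, "w") as f:
        json.dump(values, f)


config_path = os.path.join(tempfile.mkdtemp(), "service.json")
write_config(config_path, {"page_size": 20})

service = Service(config_path)
before = service.page_size()

# An operator edits the file; the running service still holds the old value.
write_config(config_path, {"page_size": 50})
stale = service.page_size()

# Re-initializing (in practice, often a restart) picks up the change.
service.initialize()
after = service.page_size()

print(before, stale, after)  # prints "20 20 50"
```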
Buildings vs. rooms
I find the following analogy useful in conveying the value of using feature flags in addition to deployment automation.
Picture a building. This building can be small or large, one story or many stories, have a single room or hundreds of rooms, etc. The key is that the building, no matter the shape or size, takes a bit of time and effort to create. And if you want to replace the building, you have to remove the old one before you can stand up the new one. If people are living or working in the building, you might build a second one that is exactly the same, just in case the first one has problems and you need to tear it down and rebuild it. In the code world, this is called a blue/green deployment.
Typically speaking, when we construct buildings, we put lights inside so that we can see better and avoid running into walls and each other. Most often, we add switches for these lights. Interestingly, the placement of these switches and the number of lights they control tends to vary based on how the space will be used and the level of control the builder wants to provide to the users.
For example, an old large factory may have a few master switches that illuminate the entire structure. All the lights are on or off. This is great for a factory that expects all users to be there at the same time and have roughly the same needs in terms of lighting.
Another example would be a similarly sized building used as an apartment building for hundreds of tenants, with many rooms and hallways. In this case, each living unit has multiple switches to turn lights on and off individually to meet the needs of the individual users.
Keeping the lights on
In the examples above, the building represents your service or application. Your service could be a large monolith or several microservices. Each instance can be thought of as a building. Some teams get really good at building new buildings. They can stand them up really fast and have tools like blueprints to make certain they have consistency for each building they build.
Feature flags represent the light switches. It's up to the builders to decide where to put the switches and how many are needed to support the users of the space.
The nice thing about switches is that you can turn them on and off while the users are inside the building. If you don't have any switches, and you need to turn the lights off in one small room, your only option is to evacuate the entire building and tear it down. This takes far more time and effort than turning off a switch.
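In code, turning off the lights in one small room might look like the sketch below. The `flag_state` dictionary stands in for whatever flag service you use, and the feature names and request paths are hypothetical.

```python
# Hypothetical flag state, assumed to be updated by some external flag service.
flag_state = {"recommendations": True, "search": True}


def is_enabled(flag_key):
    # Unknown flags default to off, so a missing flag never enables a feature.
    return flag_state.get(flag_key, False)


def handle_request(path):
    # Each feature (room) sits behind its own switch.
    if path == "/recommendations":
        if not is_enabled("recommendations"):
            return "recommendations temporarily unavailable"
        return "recommended items"
    if path == "/search":
        if not is_enabled("search"):
            return "search temporarily unavailable"
        return "search results"
    return "not found"


# Flip one switch while the service keeps running.
flag_state["recommendations"] = False
print(handle_request("/recommendations"))
print(handle_request("/search"))
```

Only the one feature goes dark. The rest of the service keeps serving users, and nothing is redeployed or restarted.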
Too many switches
As you build new buildings and you have a better understanding of how your users use the space, you can remove switches from rooms where they are not needed and add them to new rooms that offer cool new features that you may not have usage patterns for. In the same way, you should look to reduce the use of feature flags as you optimize your code paths to align with the needs and behaviors of your users.
Self-serve switches vs. an electrician
That said, there are many ways to keep the lights on. Another option is to have an electrician on staff who can make changes to the wiring without having to demolish the entire building. The code equivalent of this would be the config file. This is typically a file that lives in the service or application and can be edited to change the behavior of the code or the way it is experienced by the users. Config files are great when you are looking to have an initial state that is consistent, easily validated, and version controlled. However, similar to deployment automation, they lack the flexibility of feature flags.
We can compare feature flags to your standard residential light switch: easy to use, cheap to install. Changing a config file is more like redoing the wiring in the building. Generally, you ask the users to leave while you do the work because lights may go off randomly. In fact, with a config file change, the service or application will likely need to be restarted for the changes to take effect. While this is a much lighter lift than standing up a new building, it is still far more disruptive than flipping a switch.
* * *
Ultimately, feature flags offer the most painless way to control the features within your application. They also give you peace of mind when releasing new features: if a feature causes a bug in production, you can turn it off immediately. Just like turning a light off.