110 comments
  • simonw6m

    The single biggest value add of feature flags is that they de-risk deployment. They make it less frightening and difficult to turn features on and off, which means you'll do it more often. This means you can build more confidently and learn faster from what you build. That's worth a lot.

    I think there's a reasonable middle ground between having feature flags in a JSON file that you have to redeploy to change and using an (often expensive) feature flags as a service platform: roll your own simple system.

    A relational database lookup against primary keys in a table with a dozen records is effectively free. Heck, load the entire collection at the start of each request - through a short-lived cache if your profiling says that would help.
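
    A minimal sketch of that, assuming Python and SQLite (the table name, schema and cache TTL here are illustrative, not prescriptive):

    ```python
    import sqlite3
    import time

    DB_PATH = "app.db"  # hypothetical; assumes a table feature_flags(name TEXT PRIMARY KEY, enabled INTEGER)
    CACHE_TTL = 5.0     # seconds; the "short-lived cache"

    _conn = sqlite3.connect(DB_PATH, check_same_thread=False)
    _cache = {"at": 0.0, "flags": {}}

    def _load_all_flags() -> dict:
        """Load the whole flag table; with a dozen rows this is effectively free."""
        rows = _conn.execute("SELECT name, enabled FROM feature_flags").fetchall()
        return {name: bool(enabled) for name, enabled in rows}

    def flag_enabled(name: str, default: bool = False) -> bool:
        """Check a flag, refreshing the in-process copy at most every CACHE_TTL seconds."""
        now = time.monotonic()
        if now - _cache["at"] > CACHE_TTL:
            _cache["flags"] = _load_all_flags()
            _cache["at"] = now
        return _cache["flags"].get(name, default)
    ```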

    Once you start getting more complicated (flags enabled for specific users etc) you should consider build-vs-buy more seriously, but for the most basic version you really can have no-deploy-changes at minimal cost with minimal effort.

    There are probably good open source libraries you can use here too, though I haven't gone looking for any in the last five years.

    • tdumitrescu6m

      Seriously. This is one of those cases where rolling your own really does make sense. Flags in a DB table, flags in a json file, all super simple to build and maintain, and 100x faster and more reliable than making the critical paths of your application's request cycle depend on an external provider.

      • joshmanders6m

        You know what I would find worse than telling my customers that they can't access the application they paid for (and which otherwise works) because I farmed my auth out to a 3rd party that is having an outage?

        Telling them that my auth provider isn't out, but the thing I use to show them a blue button vs a red button is.

        Oof.

        • gboss6m

          Has this actually been a problem? We've been using LaunchDarkly for years and if they do have an outage (which is really really rare) the flag will be set to the default value. It's also very very cheap, maybe $500 a month.

      • twisteriffic6m

        We did this. Two tables. One for feature flags, with name, desc, id, enum (none, defaultToEnabled, overrideToDisabled). One for user flag overrides, with flagId, userId, enum (enabled, disabled).

        The combination of these two has been all we've ever needed. User segmentation, A/B testing, pilot soft launch etc are all easy.
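
        If it's useful, here's a rough SQLite sketch of that shape (column names adapted from the comment; the resolution order is my guess at how the two enums combine):

        ```python
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
        CREATE TABLE feature_flags (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL UNIQUE,
            descr TEXT,
            -- 'none' | 'defaultToEnabled' | 'overrideToDisabled'
            state TEXT NOT NULL DEFAULT 'none'
        );
        CREATE TABLE user_flag_overrides (
            flag_id INTEGER NOT NULL REFERENCES feature_flags(id),
            user_id INTEGER NOT NULL,
            -- 'enabled' | 'disabled'
            state   TEXT NOT NULL,
            PRIMARY KEY (flag_id, user_id)
        );
        """)

        def flag_enabled_for(conn, flag_name: str, user_id: int) -> bool:
            """Guessed resolution order: global kill switch > per-user override > default."""
            flag = conn.execute(
                "SELECT id, state FROM feature_flags WHERE name = ?", (flag_name,)
            ).fetchone()
            if flag is None:
                return False
            flag_id, state = flag
            if state == "overrideToDisabled":   # global kill switch wins
                return False
            row = conn.execute(
                "SELECT state FROM user_flag_overrides WHERE flag_id = ? AND user_id = ?",
                (flag_id, user_id),
            ).fetchone()
            if row is not None:                 # per-user override: segments, pilots, A/B
                return row[0] == "enabled"
            return state == "defaultToEnabled"  # otherwise the flag's default
        ```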

        • uutangohotel6m

          Would you mind expanding on the usage of enums for the feature flags table? Why not use a boolean?

          • twisteriffic6m

            We actually did use booleans, I just found it easier to explain using enums, and the code would have been simpler if we'd done it that way.

      • PaulHoule6m

        In years of trying to sell things I've found that one of the best selling points to management is "susceptible to vendor lock-in", "you don't own your customer database", etc.

        I have no idea why that is.

        • dasil0036m

          I'm confused. Are you saying this ironically or have you literally pitched management with the risks of using your product?

          • PaulHoule6m

            I developed an open source "user management system" circa 2001 which I used on several sites, including one that had 400,000+ users, a famous preprint archive and the web site of a county-level green party. It was patterned on what sites like Yahoo and Amazon had at the time, did email verification and numerous things that were a hassle to implement, had great screens for the administrators, all of that.

            I couldn't get anybody else to adopt this software despite putting a lot of work into making it easy to pick up and install.

            10 years later competitors popped up like mushrooms and were adopted quickly. The thing they all had in common was somebody else owned your user database. So yeah I feel pretty cynical.

            • cyberax6m

              There's such a thing as being too early.

              My university had a great shared browser bookmark management system, even with a basic discussion support for them. In 1998. It was not super popular because people just didn't have that many links to share, eventually it fell offline and got accidentally deleted in 2001.

    • Kwpolska6m

      > using an (often expensive) feature flags as a service platform

      I have no idea why anyone would actually do that in real life. Feature flags are something so trivial that you can implement them from scratch in a few hours, tops — and that includes some management UI.

      • jitl6m

        Often these 3rd party offerings are feature flags PLUS experimentation with user segmenting. Depending on the style of software you build, this can be extremely valuable; it’s very popular in the SaaS market for a reason.

        Early on at Notion we used simple percent rollout in Redis, then we built our own flag & experimentation system, but as our needs got more complex we ended up switching to a 3rd party rather than dedicating a team to keep building out the internal system.

        We will probably hit a scale in a few years where it makes sense to bring this back in house but there’s certainly a sweet spot for the 3rd party version between the 50-500 engineer mark for SaaS companies.

        • boulos6m

          That's a reasonable path! You probably learned to appreciate and value the complexity, but you wouldn't have from the start. Which service do you use?

        • simonw6m

          That path sounds very sensible to me.

      • cogman106m

        Happens when you do the flags wrong :)

        We have a FF as a service platform and a big "value add" is that we can turn on and off features at the client level with it.

        But, unfortunately, it's both not the only mechanism for this, and it's also being used for actual feature flags and not just client-specific configuration.

        I'm personally a MUCH bigger fan of putting feature flags in a configuration file that you deploy either with the application or through some mechanism like Kubernetes configs. It's faster, easier to manage, and really easy to quickly answer the question "What's turned on, why, and for how long?". Because a core part of managing feature flags is deleting them and the old code path once you are confident things are "right".

        The biggest headache of our FF-as-a-service platform is that that's really not clear, and we OFTEN end up with years-old feature flags that are still on, with the old code path still existing even though it's unexercised.

      • hirsin6m

        You'll still be building management UI over their system (it doesn't understand or validate actor types, tenants, etc., so you have to do that).

        But at high throughput, you might want something with dedicated professional love. Ten thousand feature flags, being checked at around 2 (or 200) million RPS from multiple deployments... I don't want to be the team with that as their side project. And once you're talking a team of three to six engineers to build all this out, maybe it makes sense to just buy something for half a million a year. Assuming it can actually fit your model.

        • Spivak6m

          But it's not a side project; in most implementations it's part of the app itself.

          • jasonkarns6m

            Side project means it’s not that team’s primary focus.

        • fizx6m

          The scale is easy in practice, cause you outsource to a CDN. But everything takes time and has opportunity cost.

          • hirsin6m

            Maybe we've worked with different FF systems, but anything that involves a call more expensive than an RPC would be lethal to request latency. Calling out to a CDN forty-five times per inbound request would be... infeasible.

            • fizx6m

              A background thread long polls the CDN, updating a local hashmap on change.
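
              Roughly like this, as a sketch (the URL and interval are made up, and a real long poll would use something like ETag / If-None-Match rather than a plain periodic GET):

              ```python
              import json
              import threading
              import time
              import urllib.request

              FLAGS_URL = "https://flags.example-cdn.com/flags.json"  # hypothetical
              _flags: dict = {}
              _lock = threading.Lock()

              def _poll_forever(interval: float = 10.0) -> None:
                  """Background refresh loop; request paths only ever read the local dict."""
                  global _flags
                  while True:
                      try:
                          with urllib.request.urlopen(FLAGS_URL, timeout=5) as resp:
                              fresh = json.load(resp)
                          with _lock:
                              _flags = fresh
                      except Exception:
                          pass  # keep serving the last known-good snapshot
                      time.sleep(interval)

              def flag(name: str, default: bool = False) -> bool:
                  with _lock:
                      return _flags.get(name, default)

              threading.Thread(target=_poll_forever, daemon=True).start()
              ```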

      • fizx6m

        If I was a bootstrapped startup, I'd do a json file and then when I've outgrown, I'd hand write something that long-polls a CDN for updates, with a tiny rails or react app behind the CDN.

        But these approaches are insane for companies above a certain size, where individuals are being hired and fired regularly, security matters, and feature flags are in the critical path of revenue.

        Last time I looked at LaunchDarkly Enterprise licensing, it started at $50k/year, and included SAML.

        Now that sounds like a lot, but if you're well past the startup stage, you need a tiny team to manage your homegrown platform. Maybe you have other things for them to do as well, but you probably need 3 people devoting at least 25% of their time to this, in order to maintain. So that's at least $175k/year in the USA, and if your company is growing, then probably the opportunity cost is higher.

      • ozim6m

        Add to that, ideally feature flags should be removed after the feature is released. Ideally you also shouldn't have more than a handful of feature flags.

        Permanent per-customer configuration is not a feature flag. Also, it's best not to have too many per-customer configurations.

        • Supermancho6m

          Feature flags are often initially for a feature and later left in as dependency flags. Even within a large organization, individual components and services owned by other teams will have outages.

          • ozim6m

            That sounds like bad engineering.

            Having all kinds of flags makes the system prone to misconfigurations.

            As the number grows you get flags depending on other flags, etc.

            It gets insane rather quickly, unless someone purges flags relentlessly.

    • echelon6m

      > build-vs-buy

      Roll your own. Seriously.

      Feature flags are such an easy thing that there should be a robust and completely open source offering not tied to B2B SaaS. Until then, do it in house.

      My team built a five nines feature flag system that handled 200k QPS from thousands of services, active-active, local client caching, a robust predicate DSL for matching various conditions, percent rollout, control plane, ACLs, history, everything. It was super robust and took half an engineer to maintain.

      We ultimately got roped into the "build vs buy" / "anti-weirdware" crosshairs from above. Being tasked with migrating to LaunchDarkly caused more outages, more headache, and more engineering hours spent. We were submitting fixes to LaunchDarkly's code, fixing the various language client integrations, and writing our own Ruby batching and multiprocessing. And they charged us way more for the pleasure.

      Huge failure of management.

      I've been out of this space for some years now, but someone should "Envoy" this whole problem and be done with it. One service, optional sidecars, all the language integrations. Durable failure and recovery behavior. Solid UX. This shouldn't be something you pay for. It should be a core competency and part of your main tooling.

      • rav6m

        I don't understand what a dedicated "completely open source offering" provides or what your "five nines feature flag system" provides. If you're running on a simple system architecture, then you can sync some text files around, and if you have a more scalable distributed architecture, then you're probably already handling some kind of slowly-changing, centrally-managed system state at runtime (e.g. authentication/authorization, or in-app news updates, ...) where you can easily add another slowly-changing, centrally-managed bit of data to be synchronised. How do you measure the nines on a feature flag system, if you're not just listing the nines on your system as a whole?

        • foobazgt6m

          > If you're running on a simple system architecture,

          His point was that even a feature flag system in a complex environment with substantial functional and system requirements is worth building vs buying. If your needs are even simpler, then this statement is even more true!

          I'm having a hard time making sense out of the rest of your comment, but in larger businesses the kinds of things you're dealing with are:

          - low latency / staleness: You flip a flag, and you'll want to see the results "immediately", across all of the services in all of your datacenters. Think on the order of one second vs, say 60s.

          - scalability: Every service in your entire business will want to check many feature flags on every single request. For a naive architecture this would trivially turn into ungodly QPS. Even if you took a simple caching approach (say cache and flush on the staleness window), you could be talking hundreds of thousands of QPS across all of your services. You'll probably want some combination of pull and push. You'll also need the service to be able to opt into the specific sets of flags that it cares about. Some services will need to be more promiscuous and won't know exactly which flags they need to know in advance.

          - high availability: You want to use these flags everywhere, including your highest availability services. The best architecture for this is that there's not a hard dependency on a live service.

          - supports complex rules: Many flags will have fairly complicated rules requiring local context from the currently executing service call. Something like: "If this customer's preferred language code is ja-JP, and they're using one of the following devices (Samsung Android blah, iPhone blargh), and they're running versions 1.1-1.4 of our app, then disable this feature". You don't want to duplicate this logic in every individual service, and you don't want to make an outgoing service call (remember, H/A), so you'll be shipping these rules down to the microservices, and you'll need a rules engine that they can execute locally (there's a toy sketch of this kind of local evaluation after this list).

          - supports per-customer overrides: You'll often want to manually flip flags for specific customers regardless of the rules you have in place. These exclusion lists can get "large" when your customer base is very large, e.g. thousands of manual overrides for every single flag.

          - access controls: You'll want to dictate who can modify these flags. For example, some eng teams will want to allow their PMs to flip certain flags, while others will want certain flags hands off.

          - auditing: When something goes wrong, you'll want to know who changed which flags and why.

          - tracking/reporting: You'll want to see which feature flags are being actively used so you can help teams track down "dead" feature flags.

          This list isn't exhaustive (just what I could remember off the top of my head), but you can start to see why they're an endeavor in and of themselves and why products like LaunchDarkly exist.
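
          To make the "supports complex rules" point concrete, here's a toy sketch of rules shipped to services as data and evaluated locally against request context (the rule format and field names are invented for illustration, not any particular vendor's DSL):

          ```python
          # A rule is a list of conditions; the flag matches if any rule's conditions all hold.
          RULES = {
              "new_checkout": [
                  [
                      {"field": "language", "op": "eq", "value": "ja-JP"},
                      {"field": "device", "op": "in", "value": ["samsung-android", "iphone"]},
                      # naive string comparison of versions; a real engine would parse them
                      {"field": "app_version", "op": "between", "value": ["1.1", "1.4"]},
                  ]
              ]
          }

          def _cond_matches(cond: dict, ctx: dict) -> bool:
              actual = ctx.get(cond["field"])
              if cond["op"] == "eq":
                  return actual == cond["value"]
              if cond["op"] == "in":
                  return actual in cond["value"]
              if cond["op"] == "between":
                  lo, hi = cond["value"]
                  return actual is not None and lo <= actual <= hi
              return False

          def flag_matches(flag: str, ctx: dict) -> bool:
              """Evaluate entirely locally: no outgoing service call, so the check stays H/A."""
              return any(all(_cond_matches(c, ctx) for c in rule) for rule in RULES.get(flag, []))

          # flag_matches("new_checkout",
          #     {"language": "ja-JP", "device": "iphone", "app_version": "1.2"})  -> True
          ```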

        • echelon6m

          > if you're not just listing the nines on your system as a whole

          At scale the nines of your feature flagging system become the nines of your company.

          We have a massive distributed systems architecture handling billions in daily payment volume, and flags are critical infra.

          Teams use flags for different things. Feature rollout, beta test groups, migration/backfill states, or even critical control plane gates. The more central a team's services are as common platform infrastructure, the more important it is that they handle their flags appropriately, as the blast radius of outages can spiral outwards.

          Teams have to be able to competently handle their own flags. You can't be sure what downstream teams are doing: if they're being safe, practicing good flag hygiene, failing closed/open, keeping sane defaults up to date, etc.

          Mistakes with flags can cause undefined downstream behavior. Sometimes state corruption (eg. with complicated multi-stage migrations) or even thundering herds that take down systems all at once. You hope that teams take measures to prevent this, but you also have to help protect them from themselves.

          > slowly-changing, centrally-managed system state at runtime

          With flags being so essential, we have to be able to service them with near-perfect uptime. We must be able to handle application / cluster restart and make sure that downstream services come back up with the correct flag states for every app that uses flags. In the case of rolling restarts with a feature flag outage, the entire infrastructure could go hard down if you can't do this robustly. You're never given the luxury of knowing when the need might arise, so you have to engineer for resiliency.

          An app can't start serving traffic with the wrong flags, or things could go wrong. So it's a hard critical dependency to make sure you're always available.

          Feature flags sit so closely to your overall infrastructure shape that it's really not a great idea to outsource it. When you have traffic routing and service discovery listening to flags, do you really want LaunchDarkly managing that?

    • the_mitsuhiko6m

      > I think there's a reasonable middle ground between having feature flags in a JSON file that you have to redeploy to change and using an (often expensive) feature flags as a service platform: roll your own simple system.

      The middle ground is a JSON file that is copied up and periodically refreshed. We (Sentry) moved from a managed solution to just a YAML file with feature flags that is pushed to all containers.

      The benefit of just changing a file is that you have a lot of freedom in how you deal with it (e.g. leave comments), and you have the history of who flipped it and for which reason.

      • maccard6m

        How do you push the files to all of your containers? I’ve done this in the past with app specific endpoints but never found a solution I liked with containers.

        • the_mitsuhiko6m

          We currently persist the feature flag config in a database where the containers pull it from. Not the optimal solution but that was a natural evolution from a system we already had in place.

        • superb_dev6m

          We keep a JSON blob in Google Secret Manager for our flags. The service running in the container will reload the secret anytime it changes

          • maccard6m

            Ah, that's a super nice feature. I'm mostly familiar with AWS, which doesn't have a neat way of doing this; you end up with a bespoke solution, either with Lambdas pushing to shared volumes or just polling S3 for updates.

            • withinboredom6m

              If you are using kubernetes, you can mount the secret/ConfigMap as a volume and it will be updated automatically when changes occur. Then your application merely watches the file for updates.
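
              On the application side that can be as simple as polling the mounted file's mtime (a sketch; the path and interval are placeholders):

              ```python
              import json
              import os
              import threading
              import time

              FLAGS_PATH = "/etc/flags/flags.json"  # hypothetical ConfigMap/secret mount point
              _flags: dict = {}

              def _watch(interval: float = 2.0) -> None:
                  """Reload the file whenever its mtime changes (the mount is updated in place)."""
                  global _flags
                  last_mtime = 0.0
                  while True:
                      try:
                          mtime = os.stat(FLAGS_PATH).st_mtime
                          if mtime != last_mtime:
                              with open(FLAGS_PATH) as f:
                                  _flags = json.load(f)
                              last_mtime = mtime
                      except OSError:
                          pass  # keep the previous values if the file is briefly unavailable
                      time.sleep(interval)

              threading.Thread(target=_watch, daemon=True).start()
              ```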

              • maccard6m

                Being on AWS, using EKS feels like overkill when you're talking $75/month just for having it managed by AWS. This doesn't work with ECS, unfortunately, or if you're just running docker on EC2.

            • moltar6m

              AWS has a native service for this called AppConfig and has agents that can pull and cache flag values so your services only need to make localhost requests.

              • maccard6m

                AH nice, I was not aware of this. Thanks (It is expensive, though...)

                • moltar6m

                  Expensive?? Really? One of the cheapest services around.

                  $0.0000002 per configuration request

                  • maccard6m

                    Depending on when you're evaluating, it's a per-request overhead, and you might (probably do) have multiple flags per request. Compared to a Lambda invocation that pushes a config file to every container when it changes, it's expensive.

    • mattmanser6m

      I've been doing this a long time and seen a few different apps use config in a database. There are different levels of config you're talking about here, but general app config should generally not go in a db.

      No-one ever changes the bloody things and it's just an extra thing to go wrong. If it only loads on startup, it achieves nothing over a bog-standard config file. If it loads on every request, you've just incurred a 5% overhead on every call.

      And it ALWAYS ends up filled with crap that doesn't work anymore. Because unlike config files, no-one clears it up.

      Worse still is when people haven't made it injectable and then it means unit tests rely on a real database, or it blocks getting a proper CI/CD pipeline working.

      I end up having to pick the damn thing out of the app.

      Use a config file like everyone else - one is probably built into the framework you're using.

      To be honest, most of the time I've seen it, the app was written by people who clearly did not know their language/framework.

      I'm not saying it's you, but that's been my honest experience of config in the db, it's generally been a serious code smell that the whole app will be bad.

      • mjr006m

        There are differences in what kind of configuration you'd want to have in a config file (or environment variables, or some other "system level" management tooling) versus a feature flagging system.

        In my experience, feature flagging is more application-level than system-level. What I mean by that is, feature flagging is for stuff like: roll this feature out to 10% of users, or to users in North America, or to users who have opted into beta features; enable this feature and report conversion metrics (aka A/B testing); enable this experimental speedup for 15 minutes so we can measure the performance increase. It's stuff that you want to change at runtime, through centralized tooling with e.g. auditing and alerting, without restarting all of your application servers. It's a bit different than config for like "what's the database host and user", stuff that you don't want to change after initialization (generally).
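
        For the percent-rollout case specifically, the usual trick is to bucket users by a stable hash so each user gets a consistent answer as the percentage ramps up; a small illustrative sketch (mine, not from the comment):

        ```python
        import hashlib

        def in_rollout(flag_name: str, user_id: str, percent: float) -> bool:
            """Deterministically map (flag, user) to a bucket in [0, 100)."""
            digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).digest()
            bucket = int.from_bytes(digest[:8], "big") % 10_000 / 100.0
            return bucket < percent

        # Raising percent from 10 to 25 keeps the original 10% enabled and adds new
        # users on top, because each user's bucket never changes.
        assert in_rollout("new_editor", "user-42", 100.0)
        ```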

        Regarding the article though, early on your deployment pipeline should be fast enough that updating a hardcoded JSON file and redeploying is just as easy as updating a feature flag, so I agree it's not something to invest in if you're still trying to get your first 1000 users.

      • marcosdumay6m

        For some kind of software, another call to the DB is the best way to add bog-standard functionality without adding complexity and failure modes.

        Granted, not for all software. And there's something to be said about a config file that you can just replace at deployment. But that's something that varies a lot from one environment to another.

    • secondcoming6m

      > feature flags in a JSON file that you have to redeploy to change

      Our config files are stored in their own repo. Pushes to the master branch trigger a Jenkins job that copies the config files to a GCP bucket.

      On startup, each machine pulls this config from GCS and everything just works.

      It's not a 'redeployment' in the sense that we don't push new images on each config change.

      • j-krieger6m

        We do the same thing but slightly differently. If a new Docker image is built, we deploy that image. If the config changes, an Ansible job moves that config to the target host and the service is restarted with that new config file. Configs are mounted inside containers. It all runs on GitLab CI/CD.

    • j456m

      Great summary.

      Just starting with them and improving how you apply them as you go is the best way to learn, too.

      There is one book on feature flags that was written a while back; some of the independently published books by experienced tech folks out there are a goldmine.

      Feature Flags by Ben Nadel is one such book for me. There is an online version that is free as well. Happy to learn about others.

      https://featureflagsbook.com/

    • adamtaylor_136m

      Heck if your user system is just a Users table, you don’t even really need to consider build vs buy for them either.

      If you start doing it for sub-groups, hard agree, but this is a space where it almost always pays dividends to roll your own first. The size of a company that needs to consider adding feature flags (versus one that already has them) is typically one in which building your own is quicker, cheaper, and, most importantly, simpler.

    • traverseda6m

      Why aren't you just using environment variables for feature flags?

      Have people still not bought into the whole 12-factor config thing?

      • jitl6m

        When your app starts to get bigger and more complex, the idea of needing to restart a process to pick up any new kind of data starts to seem silly.

        Have seen the pattern many times:

        Hard-code values in code -> configure via env -> configure slow things via env and fast things via redis -> configure almost everything via a config management system

        I do not want to reboot every instance in a fleet of 2000 nodes just to enable a new feature for a new batch of beta testers. How do I express that in an env var anyways? What if I have 100s of flags I need to control?

        In other cases I need some set of nodes to behave one way, and some set of nodes to behave another way - say the nodes in us-west-2 vs the nodes in eu-central-1. Do I really want to teach my deploy system the exhaustive differences of configuration between environments? No I want my orchestration and deploy layer to be as similar as possible between regions, and push almost everything besides region & environment identification into the app layer - those two can be env vars because they basically never change for the life of the cluster.

        • thraxil6m

          I would add two things:

          It's often important that flag changes be atomic. Having subsequent requests get different flag values because they got routed to different backend nodes while a change is rolling out could cause some nasty bugs. A big part of the value of feature flags is to help avoid those kinds of problems with rolling out config changes; if your flags implementation suffers from the same problem, it's not very useful.

          Second, config changes are notorious as the cause of incidents. It's hard to "unit test" config changes to the production environment the same way you can with application code. Having people editing a config every time they want to change a flag setting (we're a tiny company and we change our flags multiple times per day) seems like a recipe for disaster.

          • withinboredom6m

            Making changes atomic is literally impossible, it's easier to just assume they won't be than chasing down something computer science tells us is impossible. I assume you are saying "every node sees the same change at the same time" when you say "atomic."

            As for unit testing flags, you better unit test them! Just mock out your feature flag provider/whatever and test your feature in isolation; like everything else.

            • thraxil6m

              It seems like you've kind of missed both of my points.

              If you're doing canary deploys to a fleet of 2000 nodes, it might take hours for the config to make it to all of them (I've seen systems where a fleet upgrade can take a week to make it all the way out). If your feature flags are configured that way, there's a long time that the state of a flag will be in that in-between state. We put feature flags in the database not config/environment so that we can turn a feature on or off more or less atomically. Ie, an admin goes into the management interface, flips a flag from off to on and then every single request that the system serves after that reflects that state. As long as you're using a database that supports transactions, you absolutely can have a clear point in time that delineates before/after that change. Rolling out a config change to a large fleet, you don't get that.

              On the second point, what I'm saying is that (talk to your friendly local SRE if you don't believe me), a large percentage of production incidents in large systems are because of configuration changes, not application changes. This is because those things are significantly harder to really test than application code. Eg, if someone sets an environment variable for the production environment like `REDIS_IP=10.0.0.13` how do you know that's the correct IP address in that environment? You can add a ton of linting, you can do reviews, etc, but ultimately, it's a common vector for mistakes and it's one of the hardest areas to completely prevent human error from creating a disaster. One of the best strategies we have is to structure the system so you don't have to make manual environment/config changes that often. If you implement your feature flag system with environment variables/config, you'll be massively increasing the frequency that people are editing and changing that part of the system, which increases the chances of somebody making a typo, forgetting to close a quote, missing a trailing comma in a json file, etc.

              Where I work we make production config changes maybe once a week or so and it's done by people who know the infrastructure very well, there's a bunch of linting and validation, and the change is rolled out with a canary system. In contrast, feature flags are in the database and we have a nice, very safe custom UI so folks on the Product and Support teams can manage the flags themselves, turning them on/off for different customers without having to go through an engineer; they might toggle flags a dozen times a day.

        • traverseda6m

          How do you do software upgrades if you don't have a good system for handling process restarts without downtime?

          • tetha6m

            Then again, speed and performance.

            At my last job, updating a production game server cluster took an hour or so with minimal to no customer interruption. Though you could still see and measure how the systems needed another hour or two to get their JIT'ers, database caches, code caches and all of these things back on track. Maybe you can just say "then architect better" or "just use Rust instead of Java", but the system was as it was and honestly, it performed very very well.

            On the other hand, the game servers checked once a minute which promotion events should be active, via the marketing backend, and reacted to it without major caching/performance impacts.

            Similar things at my current place. Teams have stable and reliable deployment mechanisms that can bring code to Prod in 10 - 15 minutes, including rollbacks if necessary. It's still both safer to gate new features behind feature toggles, and faster to turn feature toggles on and off. Currently, such per-customer configs apply in 30 - 60 seconds across however many applications deem it relevant.

            I would have to think quite a bit to bring binaries to servers that quickly, as well as coordinate restarts properly. The latter would dominate the time easily.

          • jitl6m

            Software updates happen once per two hours, config changes happen once per 5 minutes or faster.

            A few days ago I was tuning performance parameters for a low-latency stream processing system; I could iterate in 90 seconds by twiddling some config management bits for 30s in the CLI, watching the graphs for 60s, then repeating.

          • zoogeny6m

            I mean, isn't that even worse?

            If I have 100 servers and I'm doing rolling deploys then I'm going to be in a circumstance where some ratio of my services are in one state and some ratio are in another state.

            If I am reading per-request from redis (even with a server cache) I have finer-grained control.

            For me it is a question of "is the config valid for the life of this process" vs. "is this config something that might change while this process is alive".

      • secondcoming6m

        How do environment variables help? You still need something that knows what values to set the env vars to.

    • ljm6m

      I think openfeature.dev is an attractive proposition these days - start off with an env-based provider or roll your own and if you get to a point where you need to buy, you only need to swap over a provider (or use a multi-provider).

      • j-krieger6m

        I like OpenFeature. Initially I thought that it was overengineered and, quite honestly, I never grew to a size where I had to use anything but an env-provider paired with our CI/CD pipeline. But it gave me security that A/B testing would be possible if needed, and, more importantly, we had a unified API for feature flags and they were all defined in one place.

    • elliot076m

      I agree with a lot of this, except for the part about de-risking deployments. That should not be a reason to adopt a feature flag platform - that is a symptom of a bad deployment pipeline that should be fixed, which is a whole other story.

      • CBLT6m

        I disagree that using feature flags to de-risk deployments is a symptom of bad deployment pipelines.

        There's several aspects of deployments that are in contention with each other: safety, deployment latency, and engineering overhead are how I'd break it down. Every deployment process is a tradeoff between these factors.

        What I (maybe naively) think you're advocating is writing more end-to-end tests, which moves the needle towards safety at the expense of the other factors. In particular, having end-to-end tests that are materially better than well-written k8s health checks (which you already have, right?) is pretty hard. They might be flakey, they might depend on a lot of specifics of the application that's subject to change, and they might just not be prioritized. In my experience, the highest value end-to-end tests are based on learned experiences of what someone already saw go wrong once. Writing comprehensive testing before the feature is even out results in many low quality tests, which is an enormous drain on productivity to write them, to maintain them, and to deal with the flakey tests. It is better, I think, to have non-comprehensive end-to-end tests that provide as much value as possible for the lowest overhead on human resources. And the safety tradeoff we make there can be mitigated by having the feature behind a flag.

        My whole thesis, really, is that by using feature flags you can make better tradeoffs between these than you otherwise could.

      • tomnipotent6m

        > That should not be a reason why to adopt a feature flag platform

        It's one of the two big reasons. First is the ability to roll out features gradually and separate deployments from feature release, and second is the ability to turn new features off when something goes wrong. Even part of the motivation of A/B testing is de-risking.

      • duke_sam6m

        The risk of deployments isn’t entirely technical. Depending on your business and customer base it might be necessary for some groups to have access to the feature earlier or later than others.

      • mlinhares6m

        Strong disagree here, my whole org does not roll out changes without feature flags at all and whenever someone doesn't follow this policy they cause large scale incidents. Feature flags are actually a sign the deployment pipeline is very sane and mature, because people understand any new code comes with unexpected risks and we should prevent these risks from taking down systems.

      • __turbobrew__6m

        Sometimes the only way to try out a distributed system is to run it in prod and see what happens. Having the tools to flip behaviour within 1 second globally can be a useful escape hatch. When you get to large enough scales “just roll back” is not always good enough. I deploy systems with tens of thousands of nodes and we specifically have to rate limit how fast we deploy so we don’t cause thundering herds.

      • scott_w6m

        Very few teams have instant deployments. Even fast systems take a few minutes to run. If you can turn off a flag faster (because it’s a DB record), then you should do that.

  • vijayer6m

    The call-out on premature optimization is valid. However, this article misses the mark on a couple of fronts.

    One, as others have called out, is the ability to control rollout (and rollback) without needing a deployment. Think mobile apps and the rollout friction. If something goes wrong, you need a way to turn off the offending functionality quickly without having to go through another deployment or a mobile app review.

    Second is the ability to understand the impact of the rollout. Feature flags can easily measure how the rollout of one feature can affect the rest of the system - whether it is usage, crash rates, engagement, or further down the funnel, revenue. It's a cheat code for quickly turning every rollout into an experiment. And you don't need a large sample size for catching some of these.

    By having this power, you will find yourself doing more of it, which I believe is good.

  • dave44206m

    If you have enough traffic then you'll want to roll out new features gradually, and revert them quickly if, despite your testing, they cause trouble in production.

    If you don’t have much traffic, and can live with having to redeploy to flip the switch, then fine, stick it in a config file.

    But I clicked through expecting a defence of hard-coding feature flags in the source code (`if true` or `if customerEmail.endsWith("@importantcustomer.com")`). I very much don't approve of this.

    • 3eb7988a16636m

      That specific example feels like it might be ok? Presumably you have a very slow process by which customers are identified as VIP white-glove whales. Hard-coding the account representing X% of revenues is not going to experience a lot of churn. Just make it a collection variable, so you do not repeat yourself in multiple places.

  • keybored6m

    One sense of feature flag that I am familiar with (not from experience) is in trunk based development where they are used to integrate new code (with a feature flag) which is relatively untested. Or just not fully developed. That's an alternative to longer-lived feature branches which are not merged until it is either fully finished or (going further) fully tested. Hard-coding that kind of feature flag makes sense. Because a later revision will delete the feature flags outright (removing the branches).

    There also seems to be feature flags in the sense of toggling on and off features. Then hard-coding makes less sense.

    • CRConrad6m

      > One sense of feature flag that I am familiar with (not from experience) is in trunk based development where they are used to integrate new code (with a feature flag) which is relatively untested. Or just not fully developed. That's an alternative to longer-lived feature branches which are not merged until it is either fully finished or (going further) fully tested.

      That's actually the only sense of "feature flag" I was aware of before this discussion.

      > Hard-coding that kind of feature flag makes sense. Because a later revision will delete the feature flags outright (removing the branches).

      Yup. And, AFAIK, is what "feature flag" means.

      > There also seems to be feature flags in the sense of toggling on and off features. Then hard-coding makes less sense.

      So "feature flag" has now taken on -- taken over? -- the meaning of just plain "flag" (or "switch" or "toggle" or whatever), as in ordinary everyday run-time configuration? What is this development supposed to be good for? We used to have two distinct distinguishable terms for two distinct distinguishable things; now we apparently don't any more. So we've lost a bit of precision from the language we use to discuss this stuff. Have we, in exchange, gained anything?

  • forinti6m

    I've had an issue with GitLab feature flags when GitLab became unavailable. I couldn't fire a new deploy and the system wouldn't work until GitLab came back to life.

    That was a stupid dependency.

    • fiddlerwoaroof6m

      This sounds like an integration issue: systems like LaunchDarkly typically allow you to specify a default value for when the feature flag server can’t be reached.

      • jitl6m

        And/or build a near cache so you treat the 3rd party as a control layer, but actually serve requests from your near cache as data layer. Then when 3rd party goes down, your app doesn’t notice at all, and you can still manually update/override values in the cache in emergencies.

  • ourmandave6m

    Also be sure to use descriptive names so the guys who disassemble your code can write articles about upcoming features.

  • andix6m

    It's even okay to hardcode them into code (not a config/json file). Depending on the build pipeline this is similar to preprocessor flags, and the code will be removed during build.

    It might be enough to test new features with a limited audience (beta build, test deployments for stakeholders/qa).

    If done correctly this solution can be easily extended to use a feature flag management tool, or a config file.

    PS: removing new features during build/tree-shaking/etc adds some additional security. In some cases even disabled features could pose a security risk. Disabled features are often not perfectly tested yet.

    • CRConrad6m

      > It's even okay to hardcode them into code (not a config/json file).

      Yes, as I understood it that was what the article was all about.

      • andix6m

        The article suggests to put them into a config file, and considers it hardcoding. That’s how I understood it at least.

        • CRConrad6m

          Ah, yes indeed, seems I'd misread it; sorry.

          (Sheesh, WTF is that guy talking about??? Now not only "feature flag" doesn't mean anything any more, but "hardcoded" doesn't either!)

  • Narciss6m

    My team just had an issue where a new feature caused our live app to grind to a halt. One of the key reasons it took so long to fix is that the dev in charge of the feature had removed the remote feature flag earlier that day.

    Redeploying takes time. Sometimes you want to disable something quickly. Having a way to disable that feature without deploys is amazing in those cases.

    That being said, there's really no need to rely on a dedicated service for this. We use our in-house CRM, but we also have Amplitude for more complex cases (like progressive rollout).

  • jdwyah6m

    There is something to this, though jumping all the way to DIY is unnecessary.

    Context: I run a FF company (https://prefab.cloud/)

    There are multiple distinct benefits to be had from feature flagging. Because it's the "normal" path, most FF products bundle them all together, but it's useful to split them out.

    - The code / libraries for evaluating rules.

    - The UI for creating rules, targeting & roll outs.

    - The infrastructure for hosting the flags and providing real-time updates.

    - Evaluation tracking / debugging to help you verify what's happening.

    If you don't need #1 and #2 there, you might decide to DIY and build it yourself, but I think you shouldn't have to. Most feature flag tools today are usable in an offline mode. For Prefab it is: https://docs.prefab.cloud/docs/how-tos/offline-mode You can just do a CLI command to download the flags. Then boot the client off a downloaded file. With our pricing model that's totally free because we're really hardly doing anything for you. Most people use this functionality for CI environments, but I think it's a reasonable way to go for some orgs. It has 100% reliability and that's tough to beat.

    You can do that if you DIY too, but there are so many nice-to-haves in actually having a tool / UI that has put some effort into it that I would encourage people not to go down that route.

  • whoknowsidont6m

    There's a typo in the article:

    >Hardoced feature flags

    Think the author obviously meant "hardcoded" here.

    Anyways, recently, this has been really hard to sell teams on in my experience. At some point "feature flag" became equivalent to having an entire SaaS platform involved (even for systems where interacting with another SaaS platform makes little sense). I can't help but wonder if this problem is "caused" by the up-coming generation of developers' lived experience with everything always being "online" or having an external service for everything.

    In my opinion, your feature flag "system" (at least in aggregate) needs to be layered. Almost to act as "release valves."

    Some rules or practices I do:

    * Environment variables (however you want to define or source them) can and should act as feature flags.

    * Feature flag your feature flag systems. Use an environment variable (or other sourced metadata, even an HTTP header) to control where your program is reading from.

    * The environment variables should both take priority if they're defined AND act as a fallback in case of detected or known service disruption with more configurable feature flag systems (such as an internal DB or another SaaS platform). (There's a sketch of this layering after the list.)

    * Log the hell out of feature flags, telemetry will keep things clean (how often flags are read, and how often they're changed).

    * Categorize your feature flags. Is this a "behavioral" feature flag or functional (i.e., to help keep the system stable). Use whatever qualifiers make sense for your team and system.

    * Remove "safety" flags for new features/releases after you have enough data to prove the release is stable.

    * Remove unused "behavior" flags once a year.
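
    A sketch of that env-var-first layering (the variable naming convention and provider interface are illustrative, not prescriptive):

    ```python
    import os

    def flag(name: str, default: bool = False, provider=None) -> bool:
        """Env var wins if set; otherwise ask the configurable provider; otherwise default."""
        env_val = os.environ.get(f"FF_{name.upper()}")
        if env_val is not None:                  # explicit override, also the outage escape hatch
            return env_val.lower() in ("1", "true", "on", "yes")
        if provider is not None:
            try:
                return provider(name, default)   # e.g. internal DB or a SaaS client
            except Exception:
                pass                             # detected service disruption: fall through
        return default
    ```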

    My $0.02

    • mcdoh6m

      There's a typo in the post:

      > Anyways

      I think you obviously meant "anyway".

      • whoknowsidont6m

        "Anyways" is not a typo. It's a well used term in informal contexts.

  • cluckindan6m

    Just put your flags in environment variables.

    Depending on your infra, that can already make them toggleable without a redeployment: a restart of the apps/containers on the new envvars is enough.

    Having them in a separate file would be useful if you need to be able to reload the flags upon receiving SIGUSR1 or something.
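
    For the SIGUSR1 variant, roughly (a sketch; the file path is a placeholder):

    ```python
    import json
    import signal

    FLAGS_PATH = "flags.json"  # hypothetical
    flags: dict = {}

    def reload_flags(signum=None, frame=None) -> None:
        """Re-read the flag file; wired to SIGUSR1 so ops can `kill -USR1 <pid>`."""
        global flags
        with open(FLAGS_PATH) as f:
            flags = json.load(f)

    signal.signal(signal.SIGUSR1, reload_flags)
    reload_flags()  # initial load at startup
    ```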

  • eqvinox6m

    > Simply start with a simple JSON file, read it in at application startup,

    That's not what I'd call hardcoding, it's a startup-time configuration option. Hardcoding is, well, "hard coding", as in changing something in the source code of the affected component, in particular with compiled languages (with interpreted languages the distinction is a bit mushy).

    And then for compilation there is the question whether it is a build system option (with some kind of build user interface) or "actual" hardcoding buried somewhere.

    Also, there is a connection to be drawn here to loadable/optional software components. Loading or not loading something can be implemented both as startup-time or runtime decision.

  • dlevine6m

    I have found feature flagged rollouts to be one of the biggest advances in fairly recent software development. Probably too much to say about it in a comment, but they massively de-risk launches in a number of important ways, both in being able to quickly turn a feature off if it has unintended consequences and being able to turn it on for a very specific set of users.

    With that said, I think that LaunchDarkly and the like are a bit expensive and heavyweight for many orgs, and leaving too many feature flags lying around can become serious debt. It totally makes sense to start with something lighter weight, e.g. an env var or a quick homegrown feature in ActiveAdmin.

    • adamredwoods6m

      If one of these services was able to catalog commits with the FF, that'd be worth gold (wink wink, LaunchDarkly).

  • troy55_yort556m

    Sometimes the flag management is more complex than dealing with config - but it's not ideal. Hardcoding flags can be difficult to manage across different environments, regions, customers, etc. Atono.io has flags built right into the stories and it's free for up to 5 users. No limits on the number of flags. Also helps with planning and bug tracking. Keeps things nice and tidy.

  • bradgessler6m

    I built a Ruby library for feature flags that’s completely hardcoded.

    In a Rails app you have an ./app/plans folder with Ruby files, each one containing a class that represents a feature in a plan.

    There’s code examples at https://github.com/rubymonolith/featureomatic if you want to have a look.

    I've used it for a few apps now and it's been pretty useful. When you do need to plug a database into it, you can do so by having the class return the values from a database.

  • fs_software6m

    It seems like the more deployment friction there is for your project, the more the scale tips toward buy vs build here.

    As a solo dev, something lightweight and cost-effective like this is attractive. Deployment is a CLI command or PR merge away.

    Would I recommend it for the 200 person engineering org that deploys at most once a week? Probably not.

  • frank000016m

    The whole point is to reduce the risk of deployments by easily managing the flags - often without the need for a developer. Also, using services like Azure App Configuration gives you the possibility to have different distributions based on users, groups, country etc, which is really useful.

    A bit surprised people roll their own implementation when this is easily available, and not that expensive. At least in Azure.

  • rickcarlino6m

    This statement is not true if your organization has very stringent deployment requirements or a release cycle that is measured in days rather than hours. If you can go from idea to deployment in under an hour, sure, you probably don’t need a fancy feature flag system. That’s not the case for a lot of organizations. Source: spent a really long time fixing agony caused by hardcoded feature flags.

  • tnorgaard6m

    If I may make a suggestion: instead of a static JSON file read at boot, I'd suggest passing the feature flags down per request as a header, or a pointer to the set of feature flags, so that all systems for a given request observe the same features. Just my 2 cents.

  • stevebmark6m

    This blog post is irrelevant. Stub the feature flag system in code and write tests for both cases.

  • rizpanjwani6m

    A reason to use third-party feature flag services is the analytics and insights into user behaviour. Such services cost a lot but might be worth it if you don't have the resources to build such an infrastructure.

  • kiitos6m

    Hard-coded feature flags are more commonly referred to as configuration.

    The primary purpose of feature flags is to provide a way to change system behavior dynamically, without needing a deploy.

    • CRConrad6m

      > The primary purpose of feature flags is to provide a way to change system behavior dynamically, without needing a deploy.

      That's not feature flags; it's just ordinary configuration. (Actually, seems many contributions to this discussion get those mixed up. Maybe even TFA itself.)

      • kiitos6m

        Ordinary configuration is updated via deployment. Feature flags are updated without deployment. This is the fundamental difference.

        • CRConrad6m

          Eh, what? No, it is -- or at least used to be -- the other way around. Features are what the programmer puts in the actual application code. "Feature flags" are a (new-ish) way of making the transfer from older to newer versions of the application code more manageable for the programmer: If they're actually "hard-coded" (as, say, boolean feature-on-or-off constants) in the application code, they require a deployment of the actual executable; if they're "hard-coded" in the totally new-fangled sense of the word that seems to be emerging here (i.e. not actually hard-coded at all), it may require deployment of a "features configuration file" or such.

          Just plain "configuration", though, is how the user sets up their software to work. That's saved in a local configuration file (or the Windows Registry or wherever) under the user's control, and doesn't require any deployment by the developer at all.

          Seriously, kids these days...

  • kvrty6m

    An area where hardcoding may not work is when multiple teams and services have to turn something on or off. It is much better to have something external.

  • davydm6m

    1. It's ok if deployment is quick and no customers are affected, but if you're running something like a whitelabel fast food app platform with a deployment cycle that takes a minute, that's enough time to get irate customers.

    2. The author obviously doesn't understand the importance of flags for new features which may break the customer experience when they dumb it down to the color of a button. I stopped wasting my time reading when I got there.

    • eastbound6m

      Funnily enough, deployments are faster than changing the environment variables. Probably something to do with the high sensitivity of some variables.