Experimentation best practices

This topic includes some recommended best practices for using LaunchDarkly Experimentation.

Best practices

As you use Experimentation, consider the following best practices.

Use feature flags on every new feature you develop

This is a general best practice, but it especially helps when you’re running experiments in LaunchDarkly. By flagging every feature, you can quickly turn any aspect of your product into an experiment. Wrap the smallest unit of behavior you might want to measure or roll back independently, so you are not forced to ship or test an entire release as one block.
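As an illustration of wrapping a small unit of behavior, here is a minimal sketch in Python. The `FlagClient` interface, the `InMemoryFlags` stand-in, the flag key `new-checkout-label`, and the button labels are all hypothetical and exist only to make the example runnable. In a real application, a LaunchDarkly SDK client would take the place of the stand-in.

```python
from typing import Protocol


class FlagClient(Protocol):
    """Hypothetical minimal interface; a real SDK client would stand in here."""

    def bool_variation(self, flag_key: str, context_key: str, default: bool) -> bool: ...


class InMemoryFlags:
    """Stand-in flag store used only to make this sketch runnable."""

    def __init__(self, flags: dict[str, bool]) -> None:
        self._flags = flags

    def bool_variation(self, flag_key: str, context_key: str, default: bool) -> bool:
        # Fall back to the caller's default when the flag is unknown.
        return self._flags.get(flag_key, default)


def checkout_button_label(flags: FlagClient, context_key: str) -> str:
    # Only the button label sits behind the flag, not the whole checkout page,
    # so this one change can be measured or rolled back on its own.
    if flags.bool_variation("new-checkout-label", context_key, False):
        return "Buy now"
    return "Add to cart"
```

Because the flag guards a single decision, turning it into an experiment later does not require restructuring the surrounding code.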

Run experiments on as many feature flags or AgentControl configs as possible

This creates a culture of experimentation that helps you detect unexpected problems and refine and pressure-test metrics. Treat experiments as a normal part of shipping: the more you run, the better you calibrate which metrics actually move when you change the product.

Consider experiments from day one

Create hypotheses in the planning stage of feature development, so you and your team are ready to run experiments as soon as your feature launches. Write down what you expect to change (for example, higher completion rate or lower latency) and what would surprise you, so results are easier to interpret later. To learn more, read Designing experiments.

Define what you’re measuring

Align with your team on which tangible metrics you’re optimizing for, and what results constitute success. Prefer a small set of primary metrics tied to the decision you need to make, and treat secondary metrics as backups that might justify a rollout even when the primary metric doesn’t show much change.

Plan your experiments in relation to each other

If you’re running multiple experiments simultaneously, make sure they don’t collect similar or conflicting data. When two experiments might affect the same user journey or metric, use Mutually exclusive experiments or adjust targeting so you can attribute outcomes clearly.

Associate end users who interact with your app before and after logging in

If someone encounters your experiment in both a logged-out and a logged-in state, each state generates its own context key. You can associate multiple related contexts with each other using multi-contexts. To learn more, read Associate anonymous contexts with logged-in end users.
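The following sketch shows the general idea using plain Python dictionaries in the multi-context JSON shape. The `device` context kind, the `build_context` helper, and the example keys are assumptions for illustration. In practice you would build contexts with your SDK's own context builders, and you would choose the context kind that stays stable across login as the one your experiment randomizes on.

```python
from typing import Optional


def build_context(device_key: str, user_key: Optional[str] = None) -> dict:
    """Return a context as a plain dict (hypothetical helper, not an SDK call)."""
    if user_key is None:
        # Logged out: only the device is known, so send a single-kind context.
        return {"kind": "device", "key": device_key}
    # Logged in: keep the same device key and add the user alongside it,
    # so activity before and after login can be tied to the same device.
    return {
        "kind": "multi",
        "device": {"key": device_key},
        "user": {"key": user_key},
    }
```

The device key is identical in both states, which is what lets the two sessions be associated.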

Estimate how long you need to run the experiment

Traffic, baseline conversion, and the size of the effect you hope to detect all affect how quickly you can reach reliable conclusions. To learn more, read Experiment sample size and run time.
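To get a rough sense of how these three inputs interact, here is a sketch of a textbook frequentist sample size calculation for comparing two conversion rates. This is a generic approximation and an assumption on our part, not LaunchDarkly's own sample size calculator or its analysis method, and the function names and default values for `alpha` and `power` are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline: float, relative_lift: float,
                              alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate contexts needed per variation for a two-sided test
    comparing two proportions (normal approximation)."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)  # conversion rate you hope to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def days_to_run(n_per_variation: int, variations: int, daily_contexts: int) -> int:
    """Days needed if daily_contexts new contexts enter the experiment each day."""
    return math.ceil(n_per_variation * variations / daily_contexts)
```

Halving the effect you want to detect roughly quadruples the required sample, which is why small expected lifts on low-traffic pages can take a long time to measure.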

Start and stop experiments deliberately

End an experiment when you have enough data to reach a conclusion. Avoid leaving experiments running indefinitely after you have already acted on the results. For help in determining how long to run experiments, read Experiment sample size and run time.

Use holdouts to validate your experimentation program

A small, stable group that does not receive flagged changes can help you measure cumulative lift from many experiments over time. To learn more, read Holdouts.
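The sketch below illustrates the two ideas behind a holdout: a stable, deterministic assignment of a small share of contexts, and a comparison of that group against everyone else. LaunchDarkly manages holdout assignment for you, so the hashing scheme, the `holdout-v1` salt, and the function names here are assumptions used only to show the concept.

```python
import hashlib


def in_holdout(context_key: str, holdout_percent: float, salt: str = "holdout-v1") -> bool:
    """Deterministically place roughly holdout_percent of keys in the holdout."""
    digest = hashlib.sha256(f"{salt}:{context_key}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < holdout_percent / 100


def relative_lift(holdout_rate: float, exposed_rate: float) -> float:
    """Relative change of the exposed group's rate over the holdout's rate."""
    return (exposed_rate - holdout_rate) / holdout_rate
```

Because the assignment depends only on the key and the salt, the same context stays in or out of the holdout for as long as the salt is unchanged.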

Send data to and run experiments with external warehouses

You can send all of your experiment data to an external warehouse, including BigQuery, Databricks, Redshift, or Snowflake, using a warehouse Data Export integration. By exporting your LaunchDarkly experiment data to the same warehouse as your other data, you can build custom reports to answer product behavior questions. To learn more, read Warehouse Data Export.

You can also run experiments using warehouse native metrics. To learn more, read Creating experiments using warehouse native metrics.