This topic describes recommended best practices for using LaunchDarkly Experimentation. Consider these practices as you design, run, and end experiments.
Using feature flags on every new feature is a general best practice, but it especially helps when you’re running experiments in LaunchDarkly. By flagging every feature, you can quickly turn any aspect of your product into an experiment. Wrap the smallest unit of behavior you might want to measure or roll back independently, so you are not forced to ship or test an entire release as one block.
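As a minimal sketch of wrapping a small unit of behavior, the example below puts only a button label behind a flag. The `FlagClient` interface, the `InMemoryFlagClient` stand-in, the flag key `new-checkout-button-label`, and the label strings are all hypothetical and chosen for illustration; a real LaunchDarkly SDK client would take the place of the stand-in.

```python
from typing import Any, Dict, Protocol


class FlagClient(Protocol):
    """Hypothetical minimal interface for evaluating a flag."""

    def variation(self, flag_key: str, context: Dict[str, Any], default: Any) -> Any:
        ...


class InMemoryFlagClient:
    """Stand-in client so the sketch runs without an SDK or network access."""

    def __init__(self, flags: Dict[str, Any]) -> None:
        self._flags = flags

    def variation(self, flag_key: str, context: Dict[str, Any], default: Any) -> Any:
        # Return the configured value, or the default if the flag is unknown.
        return self._flags.get(flag_key, default)


def checkout_button_label(client: FlagClient, context: Dict[str, Any]) -> str:
    # Only the label is flagged, so it can be measured or rolled back
    # without touching the rest of the checkout flow.
    if client.variation("new-checkout-button-label", context, False):
        return "Buy now"
    return "Add to cart"


client = InMemoryFlagClient({"new-checkout-button-label": True})
label = checkout_button_label(client, {"kind": "user", "key": "user-123"})
```

Because the flag covers one decision point, an experiment on it measures only the label change, and turning the flag off restores the previous behavior without a redeploy.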
Running experiments on as many feature flags as you can creates a culture of experimentation that helps you detect unexpected problems and refine and pressure-test metrics. Treat experiments as a normal part of shipping: the more you run, the better you calibrate which metrics actually move when you change the product.
Create hypotheses in the planning stage of feature development, so you and your team are ready to run experiments as soon as your feature launches. Write down what you expect to change (for example, higher completion rate or lower latency) and what would surprise you, so results are easier to interpret later. To learn more, read Designing experiments.
Align with your team on which tangible metrics you’re optimizing for, and what results constitute success. Prefer a small set of primary metrics tied to the decision you need to make, and treat secondary metrics as backups that might justify a rollout even when the primary metric doesn’t show much change.
If you’re running multiple experiments simultaneously, make sure they don’t collect similar or conflicting data. When two experiments might affect the same user journey or metric, use mutually exclusive experiments or adjust targeting so you can attribute outcomes clearly. To learn more, read Mutually exclusive experiments.
If someone accesses your experiment in both a logged-out and a logged-in state, each state generates its own context key. You can associate the related contexts with each other using multi-contexts. To learn more, read Associate anonymous contexts with logged-in end users.
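The sketch below illustrates the idea with plain dictionaries in the JSON shape of a context, rather than with SDK calls. It assumes a `device` context kind whose key stays the same before and after login, and a `user` kind that is added once the end user logs in; the function name and the example keys are hypothetical.

```python
from typing import Any, Dict, Optional


def build_context(device_key: str, user_key: Optional[str] = None) -> Dict[str, Any]:
    """Return a context as a JSON-style dictionary.

    Logged out: a single context of kind "device".
    Logged in: a multi-context that combines the same device with the user.
    """
    if user_key is None:
        return {"kind": "device", "key": device_key}
    return {
        "kind": "multi",
        "device": {"key": device_key},
        "user": {"key": user_key},
    }


logged_out = build_context("device-abc")
logged_in = build_context("device-abc", "user-123")
```

In this arrangement the device key is unchanged across both states, so an experiment that randomizes by the `device` kind can keep the same person in the same variation after they log in.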
Traffic, baseline conversion, and the size of the effect you hope to detect all affect how quickly you can reach reliable conclusions. To learn more, read Experiment sample size and run time.
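To make the relationship between these inputs concrete, here is a rough sketch that uses the textbook sample size approximation for a two-sided, two-proportion z-test. This is a generic frequentist rule of thumb and is not how LaunchDarkly computes sample size or run time; the function names, the default `alpha` and `power` values, and the example inputs are assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline_rate: float, relative_lift: float,
                              alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate contexts needed per variation to detect the given lift."""
    if not 0 < baseline_rate < 1:
        raise ValueError("baseline_rate must be between 0 and 1")
    treated_rate = baseline_rate * (1 + relative_lift)
    if not 0 < treated_rate < 1 or treated_rate == baseline_rate:
        raise ValueError("relative_lift must yield a different rate between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = (baseline_rate * (1 - baseline_rate)
                + treated_rate * (1 - treated_rate))
    effect = treated_rate - baseline_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def estimated_run_days(per_variation: int, variations: int,
                       daily_contexts: int) -> int:
    """Days needed if traffic is split evenly across all variations."""
    return math.ceil(per_variation * variations / daily_contexts)


# Example: 10% baseline conversion, hoping to detect a 10% relative lift.
needed = sample_size_per_variation(0.10, 0.10)
days = estimated_run_days(needed, variations=2, daily_contexts=5000)
```

The required sample grows with the inverse square of the effect size, so halving the lift you want to detect roughly quadruples the traffic you need, while a higher daily volume shortens the run time proportionally.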
End an experiment when you have enough data to reach a conclusion. Avoid leaving experiments running indefinitely after you have already acted on the results. For help in determining how long to run experiments, read Experiment sample size and run time.
A small, stable group that does not receive flagged changes can help you measure cumulative lift from many experiments over time. To learn more, read Holdouts.
You can send all of your experiment data to an external warehouse, including BigQuery, Databricks, Redshift, or Snowflake, using a warehouse Data Export integration. By exporting your LaunchDarkly experiment data to the same warehouse as your other data, you can build custom reports to answer product behavior questions. To learn more, read Warehouse Data Export.
You can also run experiments using warehouse native metrics. To learn more, read Creating experiments using warehouse native metrics.