Experimentation best practices

This topic covers recommended best practices for using LaunchDarkly Experimentation.

Best practices

As you use Experimentation, consider the following best practices.

Use feature flags on every new feature you develop

This is a general best practice, but it especially helps when you’re running experiments in LaunchDarkly. By flagging every feature, you can quickly turn any aspect of your product into an experiment. Wrap the smallest unit of behavior you might want to measure or roll back independently, so you are not forced to ship or test an entire release as one block.
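As a minimal sketch of wrapping a small unit of behavior, the example below flags only a button label rather than a whole checkout release. The `InMemoryFlagClient` class, the `new-checkout-button-label` flag key, and the label strings are hypothetical stand-ins for illustration. This is not the LaunchDarkly SDK, although a real SDK client would typically expose a similar evaluate-with-default call.

```python
from typing import Any, Dict


class InMemoryFlagClient:
    """Hypothetical stand-in for a feature flag SDK client (not the LaunchDarkly SDK)."""

    def __init__(self, flags: Dict[str, Any]) -> None:
        self._flags = flags

    def variation(self, flag_key: str, context: Dict[str, Any], default: Any) -> Any:
        # Return the configured value, or the default when the flag is unknown.
        return self._flags.get(flag_key, default)


def checkout_button_label(client: InMemoryFlagClient, context: Dict[str, Any]) -> str:
    # Only the button label is behind the flag, so it can be measured or
    # rolled back without touching the rest of the checkout flow.
    if client.variation("new-checkout-button-label", context, False):
        return "Buy now"
    return "Add to cart"
```

Because the default is the existing behavior, an unknown or disabled flag leaves the current experience in place.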

Run experiments on as many feature flags or AgentControl configs as possible

This creates a culture of experimentation that helps you detect unexpected problems and refine and pressure-test metrics. Treat experiments as a normal part of shipping: the more you run, the better you calibrate which metrics actually move when you change the product.

Consider experiments from day one

Create hypotheses in the planning stage of feature development, so you and your team are ready to run experiments as soon as your feature launches. Write down what you expect to change (for example, higher completion rate or lower latency) and what would surprise you, so results are easier to interpret later. To learn more, read Designing experiments.

Define what you’re measuring

Align with your team on which tangible metrics you’re optimizing for, and what results constitute success. Prefer a small set of primary metrics tied to the decision you need to make, and treat secondary metrics as supporting context or guardrails, not as substitutes that justify a rollout when the primary metric doesn’t show much change.

Plan your experiments in relation to each other

If you’re running multiple experiments simultaneously, make sure they don’t collect similar or conflicting data. When two experiments might affect the same user journey or metric, use mutually exclusive experiments or adjust targeting so you can attribute each outcome to a single experiment.
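To illustrate the idea behind mutual exclusion, the sketch below deterministically places each context key in exactly one of several experiments by hashing. This is a conceptual illustration only, not how LaunchDarkly implements the feature, where you configure exclusion in the product. The function name, the `layer` parameter, and the experiment names are hypothetical.

```python
import hashlib
from typing import Sequence


def assign_exclusive_experiment(context_key: str, layer: str,
                                experiments: Sequence[str]) -> str:
    """Pick exactly one experiment for a context key within a named group."""
    if not experiments:
        raise ValueError("experiments must not be empty")
    # Hash the group name together with the key so that assignment is stable
    # across sessions and independent between different groups.
    digest = hashlib.sha256(f"{layer}:{context_key}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % len(experiments)
    return experiments[bucket]
```

Calling the function twice with the same key and group returns the same experiment, so a given end user never contributes data to two experiments in the same group.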

Associate end users who interact with your app before and after logging in

If someone accesses your experiment in both a logged-out and a logged-in state, each state generates its own context key. You can associate related contexts with each other using multi-contexts. To learn more, read Associate anonymous contexts with logged-in end users.
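The sketch below shows one way the association could look as plain data, assuming the multi-context shape of a top-level `"kind": "multi"` with one nested object per context kind. The `device` and `user` kinds, the helper name, and the key values are illustrative assumptions. In a real application you would build the context with your SDK's context builder instead of a raw dictionary.

```python
from typing import Any, Dict


def build_multi_context(device_key: str, user_key: str) -> Dict[str, Any]:
    """Combine a device context and a user context into one multi-context payload."""
    return {
        "kind": "multi",
        # The device key stays the same before and after login.
        "device": {"key": device_key},
        # The user key only becomes available once the end user logs in.
        "user": {"key": user_key},
    }


# Before login, only the device is known (illustrative values).
logged_out_context = {"kind": "device", "key": "device-key-123abc"}

# After login, the same device key is kept and the user context is added.
logged_in_context = build_multi_context("device-key-123abc", "user-key-456def")
```

Keeping the device key constant across both states is what lets the two sessions be treated as the same participant, provided the experiment is configured around that context kind.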

Estimate how long you need to run the experiment

Traffic, baseline conversion, and the size of the effect you hope to detect all affect how quickly you can reach reliable conclusions. To learn more, read Experiment sample size and run time.
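As a rough illustration of how these three inputs interact, the sketch below uses the standard normal-approximation formula for comparing two conversion rates in a frequentist test. It is not LaunchDarkly's sample size calculator, and the function names, default significance level, and default power are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline: float, expected: float,
                              alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate contexts needed per variation for a two-sided two-proportion test."""
    if baseline == expected:
        raise ValueError("expected rate must differ from baseline rate")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for power
    variance = baseline * (1 - baseline) + expected * (1 - expected)
    effect = expected - baseline
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def estimated_run_days(per_variation: int, variations: int, daily_contexts: int) -> int:
    """Days needed to fill every variation, given daily traffic entering the experiment."""
    return math.ceil(per_variation * variations / daily_contexts)


# Illustrative inputs: 10% baseline conversion, hoping to detect a lift to 12%,
# with two variations and 1,000 contexts entering the experiment per day.
needed = sample_size_per_variation(0.10, 0.12)
days = estimated_run_days(needed, variations=2, daily_contexts=1000)
```

Halving the effect you want to detect roughly quadruples the required sample, which is why small expected lifts on low-traffic pages can take a long time to measure.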

Start and stop experiments deliberately

End an experiment when you have enough data to reach a conclusion. Avoid leaving experiments running indefinitely after you have already acted on the results. For help determining how long to run experiments, read Experiment sample size and run time.

Use holdouts to validate your experimentation program

A holdout is a small, stable group that does not receive flagged changes. Comparing this group with everyone else can help you measure the cumulative lift from many experiments over time. To learn more, read Holdouts.

You can also send data to and run experiments with external warehouses

You can send all of your experiment data to an external warehouse, including BigQuery, Databricks, Redshift, or Snowflake, using a warehouse Data Export integration. By exporting your LaunchDarkly experiment data to the same warehouse as your other data, you can build custom reports to answer product behavior questions. To learn more, read Warehouse Data Export.

You can also run experiments using warehouse native metrics. To learn more, read Creating experiments using warehouse native metrics.