Ship AI-built code with AgentControl or CodeControl

AI generates code faster than teams can validate it. This guide shows how to use LaunchDarkly to release AI-built changes safely. You get gradual rollouts with defined guardrails, automated detection, and instant remediation if something goes wrong, all without a redeploy.

Prerequisites

To complete this guide, you need the following:

How it works

The pattern is the same whether you ship AI-generated application code or deploy an AI coding agent:

  1. Wrap the change behind a feature flag or AgentControl config before it touches production.
  2. Define guardrails. These are the metrics and thresholds that tell LaunchDarkly when something is wrong.
  3. Roll out gradually while LaunchDarkly monitors and acts automatically if thresholds are crossed.

The section that applies to you depends on what you ship:

Path            What you ship
CodeControl     AI-generated application code, services, or infrastructure changes
AgentControl    AI coding agents operating in production environments

CodeControl: shipping AI-generated code

This section explains how to ship AI-generated code safely.

Step 1: Wrap the change in a feature flag

Every AI-generated change that touches production should run behind a feature flag. Flags increase safety because they make automatic remediation possible. Without a flag, you cannot easily disable the change or switch to another version of your code in production.
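The following is a minimal sketch of the gating pattern, not LaunchDarkly SDK code. The in-memory `FLAGS` store, the `ai-generated-tax` flag key, and the function names are all hypothetical. In a real application, the flag value would come from the LaunchDarkly SDK.

```python
from typing import Dict

# Hypothetical in-memory flag store. A real application would ask the
# LaunchDarkly SDK for the flag value instead.
FLAGS: Dict[str, bool] = {"ai-generated-tax": False}


def flag_enabled(flag_key: str, default: bool = False) -> bool:
    """Return the flag value, or the safe default if the flag is unknown."""
    return FLAGS.get(flag_key, default)


def legacy_total(cents: int) -> int:
    """Known-good code path: add 20% tax."""
    return cents + cents // 5


def ai_generated_total(cents: int) -> int:
    """New AI-generated code path: add 20% tax plus a 50-cent fee."""
    return cents + cents // 5 + 50


def total(cents: int) -> int:
    # The flag check is the only switch between old and new behavior,
    # so turning the flag off restores the old path without a redeploy.
    if flag_enabled("ai-generated-tax"):
        return ai_generated_total(cents)
    return legacy_total(cents)
```

Because the default is `False`, an unknown or unreachable flag keeps traffic on the known-good path.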

To learn more, read Creating new flags.

Step 2: Define your guardrails

You can use metrics as guardrails that prevent or mitigate problems. Connect your flag to the metrics that matter for this change and set thresholds that indicate a problem. In LaunchDarkly, configure these settings under Guarded releases for your flag:

  • The metric to monitor, such as error rate, latency, conversion, or a custom business metric.
  • The threshold that triggers action.
  • The remediation response, such as halting the rollout, reverting the rollout audience to 0% of your users, or alerting an on-call team.
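One way to picture the three settings above is as a small data structure. This sketch is illustrative only and does not reflect how LaunchDarkly stores guardrails. The `Guardrail` class, the metric names, and the threshold values are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Remediation(Enum):
    HALT_ROLLOUT = "halt_rollout"
    REVERT_TO_ZERO = "revert_to_zero"
    ALERT_ON_CALL = "alert_on_call"


@dataclass(frozen=True)
class Guardrail:
    metric: str               # the metric to monitor
    threshold: float          # the value that triggers action
    remediation: Remediation  # the response when the threshold is crossed

    def is_breached(self, observed: float) -> bool:
        # Assumes higher values are worse, as with error rate and latency.
        return observed > self.threshold


# Hypothetical guardrails for one release.
GUARDRAILS: List[Guardrail] = [
    Guardrail("error_rate", 0.02, Remediation.REVERT_TO_ZERO),
    Guardrail("p95_latency_ms", 800.0, Remediation.ALERT_ON_CALL),
]


def triggered(observed: Dict[str, float]) -> List[Remediation]:
    """Return the remediations for every guardrail whose threshold was crossed."""
    return [
        g.remediation
        for g in GUARDRAILS
        if g.metric in observed and g.is_breached(observed[g.metric])
    ]
```

A metric where lower values are worse, such as conversion, would need the comparison reversed.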

Step 3: Roll out and let the guardrails work

Start at a small percentage of traffic, such as 5% to 10%, rather than releasing to everyone at once. LaunchDarkly monitors your connected metrics as the rollout progresses. If a threshold is crossed, LaunchDarkly automatically executes the remediation action you defined in step 2.

You do not need to monitor the rollout yourself. By defining the metric guardrails and remediation thresholds in advance of the release, you ensure that corrective action occurs automatically if needed.
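To make the rollout mechanics concrete, here is a simplified sketch of a staged rollout with an automatic revert. It is not LaunchDarkly's bucketing algorithm or rollout engine. The hashing scheme, the stage list, and the `error_rate_at` callback are assumptions made for illustration.

```python
import hashlib
from typing import Callable, Sequence


def bucket(user_key: str, flag_key: str) -> float:
    """Map a user to a stable value in [0, 100) so assignment stays consistent."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode()).hexdigest()
    return int(digest[:8], 16) / 0x100000000 * 100


def in_rollout(user_key: str, flag_key: str, percent: float) -> bool:
    """A user is in the rollout when their bucket falls below the percentage."""
    return bucket(user_key, flag_key) < percent


def run_rollout(
    stages: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> int:
    """Advance through the stages and return the final rollout percentage.

    Reverts to 0 as soon as the observed error rate crosses the threshold.
    """
    current = 0
    for percent in stages:
        current = percent
        if error_rate_at(percent) > max_error_rate:
            return 0  # remediation: revert the audience to 0% of users
    return current
```

Hashing the user key means a user who sees the new code at 5% still sees it at 25%, which keeps the experience consistent as the rollout expands.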

AgentControl: deploying AI coding agents

This section explains how to deploy AI coding agents safely.

Step 1: Configure prompts and model variations in AgentControl configs

Define your agent’s behavior, including prompts, model selection, and parameters, in AgentControl configs rather than hardcoding them. Connect evaluation metrics that reflect the outcomes you want, such as code quality scores, test pass rates, accuracy, or cost. To learn more, read AgentControl.
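The sketch below shows the general idea of keeping agent behavior in named variations instead of hardcoding it. It does not use the AgentControl API. The `AgentConfig` class, the variation names, the model names, and the prompts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class AgentConfig:
    """One variation of the agent's behavior. All values are illustrative."""
    model: str
    prompt_template: str
    parameters: Dict[str, float] = field(default_factory=dict)


VARIATIONS: Dict[str, AgentConfig] = {
    "known-good": AgentConfig(
        model="model-stable",
        prompt_template="Fix the failing test described here: {task}",
        parameters={"temperature": 0.0},
    ),
    "candidate": AgentConfig(
        model="model-next",
        prompt_template="You are a careful engineer. Resolve: {task}",
        parameters={"temperature": 0.2},
    ),
}


def build_request(task: str, variation: str) -> dict:
    """Assemble a model request from the selected variation."""
    # Unknown variation names fall back to the known-good config.
    config = VARIATIONS.get(variation, VARIATIONS["known-good"])
    return {
        "model": config.model,
        "prompt": config.prompt_template.format(task=task),
        **config.parameters,
    }
```

Because the calling code only names a variation, changing the prompt or model becomes a config change, not a code change.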

Step 2: Release and evaluate in production

Expose the agent to real workloads and monitor its behavior, including traces, outputs, and evaluation scores, across actual interactions. Use percentage rollouts here as well. Start narrow, then expand as evaluation scores confirm the agent performs as expected.

Step 3: Adapt or revert instantly

If evaluation scores degrade or error thresholds are crossed, LaunchDarkly switches the active config. Depending on the remediation you configured, it reroutes traffic to a fallback model, reverts to a known-good prompt, or disables the agent path entirely. The change takes effect in milliseconds, without a redeploy.
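The switching behavior can be sketched as a rolling check on evaluation scores. This is an illustration of the idea, not how LaunchDarkly implements it. The `ConfigSwitch` class, the window size, and the score scale are assumptions.

```python
from collections import deque
from statistics import mean


class ConfigSwitch:
    """Track recent evaluation scores and fall back when they degrade."""

    def __init__(self, primary: str, fallback: str, min_score: float, window: int = 5):
        self.active = primary
        self.fallback = fallback
        self.min_score = min_score
        self.scores = deque(maxlen=window)  # keeps only the most recent scores

    def record(self, score: float) -> str:
        """Record a score and return the name of the config that is now active."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        if window_full and mean(self.scores) < self.min_score:
            # Sticky by design: once on the fallback, stay there until
            # someone deliberately re-enables the primary config.
            self.active = self.fallback
        return self.active
```

Averaging over a window avoids switching on a single bad score, at the cost of reacting a few interactions later.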

Verify your safety net before you need it

Test your guardrails in a non-production environment before relying on them in production. Here’s how:

  1. Trigger a threshold violation artificially. Inject errors, degrade evaluation scores, or simulate latency.
  2. Confirm LaunchDarkly detects the threshold crossing and fires the configured remediation.
  3. Confirm your application responds to the flag or config change without a redeploy.

If remediation doesn’t fire, check that your metrics source is connected and reporting, and that flag evaluation uses the correct context.
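The drill can be rehearsed as a small simulation before you wire it to real systems. This sketch stands in for the flag, the metric pipeline, and the remediation with plain Python. The function names, the request count, and the failure rates are assumptions.

```python
import random


def handle_request(flag_on: bool, failure_rate: float, rng: random.Random) -> bool:
    """Return True on success. Only the flagged path sees injected failures."""
    if not flag_on:
        return True  # known-good path
    return rng.random() >= failure_rate


def run_drill(failure_rate: float, max_error_rate: float,
              requests: int = 200, seed: int = 7) -> dict:
    """Inject failures, apply the guardrail, then check that traffic recovers."""
    rng = random.Random(seed)
    flag_on = True

    # Step 1: trigger a threshold violation artificially.
    failures = sum(
        not handle_request(flag_on, failure_rate, rng) for _ in range(requests)
    )
    error_rate = failures / requests

    # Step 2: the guardrail fires the remediation when the threshold is crossed.
    if error_rate > max_error_rate:
        flag_on = False

    # Step 3: the application responds to the flag change with no redeploy.
    recovered = all(
        handle_request(flag_on, failure_rate, rng) for _ in range(requests)
    )
    return {"error_rate": error_rate, "remediated": not flag_on, "recovered": recovered}
```

Running the drill with a failure rate of zero is a useful control: it confirms that the remediation does not fire when nothing is wrong.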

Next steps

To continue, explore the following topics:

  • Guarded releases to configure automated rollback thresholds for code releases.
  • AgentControl for agent prompt and model runtime control.
  • Experimentation to measure AI-generated changes against a baseline before full rollout.
  • Targeting to control which users and environments are in scope for early rollouts.