Upgrade OpenAI models in Python FastAPI applications using LaunchDarkly AI configs

If you’re running an AI app in production, you want to give your users the latest and greatest features while doing everything possible to keep their experience smooth and bug-free. Decoupling AI configuration changes from deployments is one way to reduce that risk.

LaunchDarkly’s new AI configs (now in Early Access) can help! With AI configs, you can change your model or prompt at runtime, without needing to deploy any new code. Decoupling configuration changes from deployment grants developers a ton of flexibility to roll out changes quickly and smoothly.

In this tutorial, we’ll teach you to use AI configs to upgrade the OpenAI model version in a Python FastAPI application.

Our example app generates letters of reference, using a person’s name as an input parameter. Feel free to substitute another use case if you’ve got one in mind.

Prerequisites

  • An OpenAI API key. You can create one in the OpenAI platform dashboard.
  • A free LaunchDarkly account with AI configs enabled. To get access, click ‘AI configs’ in the left-hand navigation, then click ‘Join the EAP’, and our team will turn on access for you.
  • A developer environment with Python and pip installed

Setting up your developer environment

Clone the example repository, then set up and activate a virtual environment and install the dependencies using these commands:

git clone https://github.com/annthurium/launchdarkly-ai-config-fastapi
cd launchdarkly-ai-config-fastapi
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Rename your .env.example file to .env. Copy your OpenAI API key into the .env file. Save the file. We’ll be setting up our LaunchDarkly SDK key later.
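For reference, the finished .env should look something like this. The variable names match what openai_client.py reads via os.getenv; the values shown are placeholders:

OPENAI_API_KEY=<your OpenAI API key>
LAUNCHDARKLY_SDK_KEY=<your LaunchDarkly SDK key, added later>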

Start the server.

fastapi dev main.py
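If you don’t have the FastAPI CLI available, running the module directly works too, since main.py starts uvicorn itself when invoked as a script:

python main.py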

Load http://localhost:8000/ in your browser. You should see a UI for the student reference letter generator. The “generate” button won’t work until we hook up LaunchDarkly, so we’ll tackle that shortly. But first, let’s take a quick tour through the project’s stack.

Application architecture overview

What are the major components of this project?

main.py holds our routes and basic server logic:

from fastapi import FastAPI, Request
from fastapi.responses import HTMLResponse
from fastapi.staticfiles import StaticFiles
import uvicorn

from openai_client import generate

app = FastAPI()

# Accepts form data from the front end and returns the generated letter
@app.post("/generate")
async def generate_reference(request: Request):
    try:
        form_data = await request.json()
        student_name = form_data.get('studentName')
        result = generate(NAME=student_name)
        return {"success": True, "result": result}
    except Exception as e:
        return {"success": False, "error": str(e)}

# Serves the front end
@app.get("/", response_class=HTMLResponse)
async def read_root():
    with open("static/index.html") as f:
        return HTMLResponse(content=f.read())

# Mount static files directory
app.mount("/static", StaticFiles(directory="static"), name="static")

if __name__ == "__main__":
    uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=True)

openai_client.py calls the LLM to generate a response. LaunchDarkly’s AI SDK fetches the prompt messages and model configuration at runtime, and we pass both along to OpenAI.


With a call to tracker.track_openai_metrics, LaunchDarkly also records input tokens, output tokens, generation count, and optional user feedback about the response quality. The docs on tracking AI metrics go into more detail if you are curious.

from dotenv import load_dotenv
load_dotenv()

import os
import openai
import ldclient
from ldclient import Context
from ldclient.config import Config
from ldai.client import LDAIClient, AIConfig, ModelConfig

ldclient.set_config(Config(os.getenv("LAUNCHDARKLY_SDK_KEY")))

ld_ai_client = LDAIClient(ldclient.get())
openai_client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def generate(**kwargs):
    """
    Calls OpenAI's chat completion API to generate some text based on a prompt.
    """
    context = Context.builder('example-user-key').kind('user').name('Sandy').build()
    try:
        ai_config_key = "model-upgrade"
        # Fallback values, used if the AI config can't be fetched from LaunchDarkly
        default_value = AIConfig(
            enabled=True,
            model=ModelConfig(name='gpt-4o'),
            messages=[],
        )
        # Fetch the model and messages; kwargs supplies values for
        # prompt variables like {{NAME}}
        config_value, tracker = ld_ai_client.config(
            ai_config_key,
            context,
            default_value,
            kwargs,
        )
        print("CONFIG VALUE: ", config_value)
        print("MODEL NAME: ", config_value.model.name)
        messages = [] if config_value.messages is None else config_value.messages
        # track_openai_metrics records token usage and generation counts in LaunchDarkly
        completion = tracker.track_openai_metrics(
            lambda: openai_client.chat.completions.create(
                model=config_value.model.name,
                messages=[message.to_dict() for message in messages],
            )
        )
        response = completion.choices[0].message.content
        print("Success.")
        print("AI Response:", response)
        return response

    except Exception as e:
        print(e)

Our front end lives in the static folder. We’re using vanilla HTML/CSS/JavaScript here since the front-end state is fairly minimal.

Onwards to configure our first, well, config.

Creating an AI config

Go to the LaunchDarkly app.

Click on the Create button. Select AI config from the menu.

Give your AI config a name, such as “model-upgrade.” Click Create.

Screenshot of the modal UI element where you can name an AI config. This example is named "model-upgrade."

Next, we’re going to add a variation to represent the older model. A variation is an abstraction representing a specific combination of model and messages, plus other options such as temperature and maximum tokens. Variations are editable, so it’s okay to make mistakes.

In your prompt messages, use double curly braces to surround variables. You can also use Markdown to format messages. You can see additional examples in the docs.
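To tie this back to the code: the {{NAME}} variable is filled in from the variables dict our app passes to ld_ai_client.config, which in turn comes from the keyword arguments to generate(). A minimal sketch, with a hardcoded example value:

# {{NAME}} in the prompt messages is interpolated from this variables dict
config_value, tracker = ld_ai_client.config(
    "model-upgrade",      # AI config key
    context,              # LaunchDarkly context used for targeting
    default_value,        # fallback AIConfig
    {"NAME": "Jordan"},   # example value; our app builds this dict from generate(**kwargs)
)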

With the Variations tab selected, configure the first variation using the following options:

  • Name: gpt-4o
  • Model: gpt-4o
  • Role: system. “You are a helpful research assistant.”
  • Role: user. “Generate a reference letter for {{NAME}}, a 22-year-old student at UCLA.”

Leave the max tokens and temperature as is.

Screenshot of the UI for creating an AI config variation in LaunchDarkly.

When you’re done, save changes.

Next, we’ll create the variation for the newer model. When rolling out any kind of upgrade, changing one thing at a time makes it easier to measure success, so the messages that constitute our prompt are identical to those in the first variation. Click the Add another variation button at the bottom, just above the Save changes button. Then input the following information into the variation section:

  • Name: chatgpt-4o-latest
  • Model: chatgpt-4o-latest
  • Role: system. “You are a helpful research assistant.”
  • Role: user. “Generate a reference letter for {{NAME}}, a 22-year-old student at UCLA.”

Screenshot demonstrating the configuration for a second AI config variant in the LaunchDarkly UI.

Again, save changes when you are done.

Like feature flags, AI configs must be enabled in order to start serving data to your application. Click on the Targeting tab, and then click the Test button to select that environment. (Alternatively, you could use Production, or any other environment that exists in your LaunchDarkly project.) 


Edit the Default rule dropdown to serve the gpt-4o variation. This older model is our baseline. We’ll create a progressive rollout to serve the newer model to more and more users if things look good. Click the toggle to turn the config on. When you’re done, click Review and save.

Screenshot demonstrating the default rule configuration for an AI config.
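On the application side, that on/off toggle surfaces as the enabled field on the AIConfig value the SDK returns. Our example app doesn’t check it, but a guard like this minimal sketch would let you fail gracefully when the config is switched off:

# Sketch: bail out before calling OpenAI if the AI config is toggled off
if not config_value.enabled:
    return "Letter generation is temporarily unavailable."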

To authenticate with LaunchDarkly, we’ll need to copy the SDK key into the .env file in our Python project.


Select the … dropdown next to the Test environment. Select SDK key from the dropdown menu to copy it to the clipboard.

Screenshot demonstrating how to copy the LaunchDarkly SDK key for an AI config.

Open the .env file in your editor. Paste in the SDK key. When you paste it in, the SDK key should start with "sdk-" followed by a unique identifier. Save the .env file.


Reload http://localhost:8000/ and try putting your name into the reference letter generator. Enjoy the warm fuzzy feelings of your fake achievements.
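If you’d rather test from the terminal, you can hit the /generate endpoint directly. Here’s a quick sketch using the requests library (not in the project’s requirements, so pip install requests first); the studentName field matches what main.py expects:

import requests

# POST to the local FastAPI server; main.py reads the "studentName" field
response = requests.post(
    "http://localhost:8000/generate",
    json={"studentName": "Sandy"},
)
print(response.json())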

Progressively upgrading to a newer model

Head back to the LaunchDarkly app. We’re going to use progressive rollouts to automatically release the new model to users, on a timeline we set up. 


On the Targeting tab, click Add rollout. Select Progressive rollout from the dropdown.

Screenshot demonstrating how to create a progressive rollout for a LaunchDarkly AI config.

The progressive rollout dialog has some sensible defaults. If I were upgrading a production API, I’d probably keep these as is and watch my dashboards carefully for regressions.

If you’re testing or prototyping, you can remove some steps to speedrun through a rollout in 2 minutes.

Delete unwanted rollout stages by clicking on the ... menu and selecting Delete.

Your finished rollout should have these attributes:

  • Turn AI config on: On
  • Variation to roll out: chatgpt-4o-latest
  • Context kind to roll out by: user
  • Roll out to 25% for 1 minute
  • Roll out to 50% for 1 minute

Screenshot demonstrating configuration for a speedrun through a progressive rollout.

Click "Confirm" then "Review and Save" on the targeting page. Wait 2 minutes. Generate a new letter. Check your server logs to confirm the newer model is being used.

CONFIG VALUE:  AIConfig(enabled=True, model=<ldai.client.ModelConfig object at 0x106c68a70>, messages=[LDMessage(role='system', content='You are a helpful research assistant.'), LDMessage(role='user', content='Generate a reference letter for Yan, a 22 year old student at UCLA.')], provider=None)

MODEL NAME:  chatgpt-4o-latest

On the Monitoring tab of the AI configs panel, you can see the generation counts for each variation, along with input tokens, output tokens, and satisfaction rate.

Screenshot showing how AI config monitoring metrics appear in the LaunchDarkly UI.

The Progressive Rollout UI will also update to indicate that chatgpt-4o-latest is being served to 100% of users. 🎉

Screenshot showing a progressive rollout with the new variation being served to 100%.

Wrapping it up

If you’ve been following along, you have learned to use LaunchDarkly AI configs to manage runtime configuration for your FastAPI app. You’ve also learned how to perform a progressive rollout to upgrade to a newer model, while minimizing risks to your users. Well done!

This app is pretty basic. Some future upgrades to consider:

  • Tracking output satisfaction rate. Implement a 👍/👎 button your users can press to send answer quality metrics to LaunchDarkly (see the sketch after this list).
  • Prompt improvements. The prompt used here was cribbed from a research paper about gender bias and LLMs. Few-shot prompting, or providing examples of good reference letters to the model, could improve response quality. You could also request additional info from the end user (such as the student’s major or GPA) and pass those variables along to the LLM. It might also be worth testing different model/prompt combinations to see how they stack up on cost, latency, and accuracy. 
  • Advanced targeting. AI configs give you the flexibility to specify what variant a user should see, based on what you know about them. For example, you could serve the latest model to any users who have @yourcompany.com email addresses, for dogfooding purposes.
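For that first idea, the AI SDK’s tracker exposes a feedback method. Here’s a minimal sketch of the server-side half, assuming you add a /feedback route that can get hold of the tracker returned by ld_ai_client.config during the generation request (the import path reflects my reading of the ldai package; double-check it against the SDK docs):

from ldai.tracker import FeedbackKind

# Sketch: record a thumbs-up or thumbs-down against the tracker
# returned alongside config_value by ld_ai_client.config
def record_feedback(tracker, thumbs_up: bool):
    kind = FeedbackKind.Positive if thumbs_up else FeedbackKind.Negative
    tracker.track_feedback({'kind': kind})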

If you want to learn more about feature management for generative AI applications, check out the LaunchDarkly AI configs documentation.


Thanks so much for reading. Hit me up on Bluesky if you found this tutorial useful. You can also reach me via email (tthurium@launchdarkly.com) or LinkedIn.
