Part 2: How Is GenAI Transforming the Software Development Lifecycle?

In this four-part blog series (see part one here), we’ll cover how GenAI is transforming software delivery, the new challenges it introduces, and how LaunchDarkly can help teams build and deliver new GenAI features within a matter of hours, not weeks.

As we noted in the previous post in this series, businesses of all types are rapidly increasing the rate at which they’re building GenAI features or net-new products. But building GenAI features comes with a new set of considerations and challenges, from non-deterministic outputs to a rapidly shifting model landscape. First, let’s take a look at some of the main ways that building with GenAI differs from the traditional software development lifecycle (SDLC):

Planning

The simplest way to distinguish between GenAI and ‘Traditional’ software is that the latter operates on deterministic principles—inputs are predefined, and outputs are predictable. With GenAI, developers are building systems that deliver non-deterministic interactions to users. User input and the resulting output can vary widely, creating challenges when it comes to ensuring consistency and safety.

Non-deterministic interactions are a feature, not a bug. But one of the biggest challenges for AI engineers is designing guardrails around these non-deterministic interactions. AI engineers must find ways to narrow and shape the creative output of large language models (LLMs) into predictable, controlled responses while still retaining the natural and human-like flexibility that users expect from these models. That means finding the optimal model configuration, paired with the right prompts to generate a desired outcome—repeatedly, and at scale. Striking this balance is crucial to minimize the likelihood of hallucinations—incorrect or misleading outputs—without overly constraining the models’ creative capabilities.
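To make this concrete, here's a minimal sketch of one common guardrail pattern, assuming the OpenAI Python SDK: pair a narrowly scoped system prompt and a low temperature with a post-generation check, and fall back to a deterministic response when validation fails. The model name, prompt, and checks are illustrative, not a prescribed implementation.

```python
# A minimal guardrail sketch. The model name, system prompt, and
# validation rules are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a support assistant. Answer only questions about our product. "
    "If a question is out of scope, reply exactly: OUT_OF_SCOPE"
)

def is_within_guardrails(text: str) -> bool:
    # Hypothetical post-generation checks: length bounds and a scope sentinel.
    return 0 < len(text) < 2000 and "OUT_OF_SCOPE" not in text

def answer(question: str, retries: int = 2) -> str:
    for _ in range(retries + 1):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",   # assumed model; swap in your own
            temperature=0.3,       # lower temperature narrows the output distribution
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": question},
            ],
        )
        text = resp.choices[0].message.content or ""
        if is_within_guardrails(text):
            return text
    return "Sorry, I can't help with that."  # deterministic fallback
```

The validation step is what turns an open-ended generation into a controlled response: the model stays creative inside the loop, but nothing unvalidated ever reaches the user.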

Design

New models and strategies regularly redefine the state of the art in GenAI software, so teams need architecture patterns flexible enough to accommodate new model configurations, prompt strategies, and augmentation strategies.

Should AI engineers build custom models tailored specifically for their domain, or use Retrieval-Augmented Generation (RAG) to improve LLM performance by retrieving relevant data from external sources? Both approaches have pros and cons, and developers need to be mindful of the trade-offs: custom models may align more closely with specific business needs, while RAG offers more agility by pulling in real-time, updated information.
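For illustration, here's a minimal RAG sketch. The retriever is a toy keyword matcher standing in for a real vector store, and the model name and prompt wording are assumptions:

```python
# A minimal RAG sketch: retrieve relevant passages, then ground the
# prompt in them. A production system would use embeddings and a vector
# store instead of this naive keyword overlap.
from openai import OpenAI

client = OpenAI()

DOCS = [
    "Feature flags let you change application behavior at runtime.",
    "Rollbacks can be performed without redeploying code.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Stand-in for embedding search: rank docs by word overlap with the query.
    words = set(query.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(words & set(d.lower().split())))
    return ranked[:k]

def answer_with_rag(question: str) -> str:
    context = "\n".join(retrieve(question))
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content or ""
```

Because the retrieval corpus can be updated independently of the model, this pattern lets teams refresh what the system "knows" without retraining anything.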

Implementation

Iterating on prompts and adjusting model configurations can also become a manual, time-intensive process. In traditional software development, iterating involves code changes, redeployment, and testing. In GenAI development, this work is compounded by the need for constant tweaking of model configurations and prompts, increasing the friction around implementing new and updated features. In addition, the need for continuous improvement and adjustment of GenAI software makes the testing, deployment, and maintenance phases more compressed and cyclical than with traditional software. 
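One way to reduce that friction is to treat prompts and model settings as data rather than code, so an iteration is a config change instead of a redeploy. A rough sketch, with illustrative keys and values:

```python
# A sketch of externalizing prompt and model settings as data. In
# production this dict would come from a runtime configuration system;
# the keys and values here are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

PROMPT_CONFIG = {
    "model": "gpt-4o-mini",
    "temperature": 0.3,
    "system_prompt": "You are a concise, friendly support assistant.",
}

def generate(question: str, config: dict = PROMPT_CONFIG) -> str:
    # Everything the team iterates on lives in `config`; the code stays stable.
    resp = client.chat.completions.create(
        model=config["model"],
        temperature=config["temperature"],
        messages=[
            {"role": "system", "content": config["system_prompt"]},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content or ""
```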

Testing

In the traditional SDLC, testing is mostly deterministic—you can verify that a system is working as expected by defining a repeatable set of inputs or scenarios and validating that the expected (deterministic) output is returned. But GenAI introduces a new dimension to testing. Developers are not only testing for technical performance (e.g., response times, system load) but also for subjective measures like tone, helpfulness, and scope of knowledge.

How do you measure whether a chatbot has the right mix of friendliness and expertise? With GenAI, testing becomes both an art and a science. Developers now need to test for how ‘natural’ the output feels, how aligned it is with the company’s brand and values, and whether it provides relevant information to the user. These metrics are inherently less objective, adding complexity to the testing phase. Tools that specialize in testing for these subjective qualities are still evolving, leaving development teams to rely on manual processes or semi-automated solutions.
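One common semi-automated approach is "LLM-as-judge": use a second model call to score outputs against a rubric. A hedged sketch, where the rubric, model, and scoring scale are all assumptions, and results should still be spot-checked by humans:

```python
# A sketch of LLM-as-judge evaluation: a second model scores tone and
# relevance. Judge models are themselves non-deterministic, so treat
# scores as signals, not ground truth.
import json
from openai import OpenAI

client = OpenAI()

RUBRIC = (
    "Score the ANSWER to the QUESTION on two axes, 1-5 each: "
    "'friendliness' and 'relevance'. Respond with JSON only, e.g. "
    '{"friendliness": 4, "relevance": 5}.'
)

def judge(question: str, answer: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed judge model
        temperature=0,        # keep the judge as repeatable as possible
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"QUESTION: {question}\nANSWER: {answer}"},
        ],
    )
    # Assumes the judge returns valid JSON; add parsing guards in practice.
    return json.loads(resp.choices[0].message.content)

scores = judge("How do I reset my password?", "Happy to help! Go to Settings > Security.")
assert scores["relevance"] >= 4  # gate a test on the judged score
```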

Deployment 

Non-deterministic output can introduce new and dangerous user experience risks. The failure state of a chatbot gone awry can create an even worse user experience than ‘traditional’ software failing to return an expected output. Developers need to plan for scenarios where things go wrong, incorporating fallback mechanisms so that users still receive an accurate, relevant response even when the model misbehaves or is unavailable.
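As a sketch, one such fallback mechanism time-boxes the model call and returns a deterministic response on timeout or error; the timeout value and fallback message are illustrative:

```python
# A sketch of a deployment-time safety net: bound how long a generation
# can take, and fall back to a safe, deterministic response on timeout,
# provider error, or network failure.
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)

FALLBACK = "I'm having trouble answering right now. A human agent will follow up."

def safe_answer(generate, question: str, timeout_s: float = 10.0) -> str:
    # `generate` is any callable that takes a question and returns text,
    # e.g. the guardrailed answer() function sketched earlier.
    future = _pool.submit(generate, question)
    try:
        text = future.result(timeout=timeout_s)
    except Exception:  # timeout, provider error, network failure, ...
        return FALLBACK
    return text or FALLBACK
```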

Maintenance

GenAI is also pushing the boundaries of the traditional SDLC due to the rapid pace of innovation in the AI space. Over the past six months alone, numerous new models have emerged. Organizations must take steps to stay current, often struggling to properly optimize and evaluate one model before the next is released. The rate of change with GenAI means that teams may need to update their models and prompt strategies regularly to keep up with the competition.
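One way to make that churn manageable is to isolate model choice behind a small interface, so evaluating a new model is a registry entry rather than a rewrite. A sketch, where the registry shape and provider code are assumptions:

```python
# A sketch of a model registry: each model is a named callable, so
# swapping or A/B-ing models is a one-line change at the call site.
from typing import Callable

MODEL_REGISTRY: dict[str, Callable[[str], str]] = {}

def register(name: str):
    def wrap(fn):
        MODEL_REGISTRY[name] = fn
        return fn
    return wrap

@register("gpt-4o-mini")
def _openai_mini(prompt: str) -> str:
    from openai import OpenAI  # assumed provider
    resp = OpenAI().chat.completions.create(
        model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content or ""

def generate(model_name: str, prompt: str) -> str:
    # Callers never hard-code a provider; they name a registered model.
    return MODEL_REGISTRY[model_name](prompt)
```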

Conclusion

We’ve covered several ways that GenAI introduces new challenges into the traditional software development lifecycle. So what’s the best solution? The same principles of control and safety that make feature management a foundational best practice for high-velocity ‘traditional’ engineering teams also make it a best practice for teams building GenAI features. GenAI teams can use feature management to: 

  • Reduce the friction associated with the new GenAI SDLC by controlling prompt and model configurations using feature flags (see the sketch after this list)
  • Safeguard the end-user experience by making it possible to roll back to a safe state without redeploying if an issue occurs
  • Reduce the opportunity cost of testing and upgrading to a new model or strategy by making it easier to switch to new configurations
  • Experiment with different model and prompt configurations to optimize the user experience and measure the business impact of changes
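For instance, here's a minimal sketch of the first two points using the LaunchDarkly Python SDK: serve prompt and model configuration as a JSON flag variation, so a misbehaving prompt or model can be rolled back from the flag dashboard without a redeploy. The flag key and payload shape are illustrative assumptions:

```python
# A sketch of flag-controlled model configuration with the LaunchDarkly
# Python server-side SDK. The flag key "ai-model-config" and the JSON
# payload shape are assumptions for illustration.
import ldclient
from ldclient import Context
from ldclient.config import Config

ldclient.set_config(Config("your-sdk-key"))
ld = ldclient.get()

SAFE_DEFAULT = {
    "model": "gpt-4o-mini",
    "temperature": 0.3,
    "system_prompt": "You are a concise, friendly support assistant.",
}

def model_config_for(user_key: str) -> dict:
    context = Context.builder(user_key).build()
    # If the flag is missing or evaluation fails, the safe default is served.
    return ld.variation("ai-model-config", context, SAFE_DEFAULT)
```

Because the configuration is evaluated at runtime per request, rolling back to the last known-good prompt or model is a flag change, not a deployment.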

GenAI has created a paradigm shift in software development. Teams building AI applications must now account for non-determinism, which requires rethinking everything from how we test software to how we deploy it. As technological change accelerates, development teams will need to stay agile, embrace new testing and deployment methods, and find innovative ways to balance creativity with control. The fundamental principles of feature management will be critical for teams looking to deliver high-quality GenAI features quickly and safely, and LaunchDarkly is excited to support GenAI builders with upcoming features designed for the new GenAI software development lifecycle.

In our next blog post, we’ll walk through some best practices for taking new models and prompt strategies to production using LaunchDarkly. In the meantime, try out LaunchDarkly’s AI model flags and AI prompt flags to keep up with the pace of AI innovation by introducing new models, prompts, and configurations at runtime, and rolling back instantly if there’s an issue.
