LLMOps
The operational discipline for keeping generative AI systems reliable, safe, and cost-controlled after they go live.
LLMOps, short for large language model operations, is the set of practices for running generative AI systems reliably after they go live. Traditional software versioning tracks code changes; LLMOps also tracks everything else that can shift: the prompts that shape model behavior, the retrieval sources the model draws from, the model itself when a provider updates it, and the guardrails meant to prevent unsafe outputs. A generative AI system has more moving parts than most deployed software, and many of those parts are outside the organization's direct control.
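One way to make those moving parts trackable is to record them together as a single versioned unit. The sketch below is illustrative only: `ReleaseManifest`, its field names, and the example values are hypothetical and do not come from any particular LLMOps tool. It assumes each component can be identified by a short string label.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReleaseManifest:
    """Hypothetical record of the non-code parts that shape system behavior."""

    prompt_template: str     # the prompt text shipped with this release
    model_id: str            # the provider model identifier in use (assumed label)
    retrieval_snapshot: str  # label for the version of the document index
    guardrail_policy: str    # label for the version of the safety rules

    def fingerprint(self) -> str:
        # Sort keys so the digest depends only on the values, not field order.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


def changed_fields(old: ReleaseManifest, new: ReleaseManifest) -> list[str]:
    """Return the names of the components that differ between two releases."""
    old_d, new_d = asdict(old), asdict(new)
    return [name for name in old_d if old_d[name] != new_d[name]]


# Example values are made up for illustration.
baseline = ReleaseManifest(
    prompt_template="Answer using only the provided context.",
    model_id="example-model-v1",
    retrieval_snapshot="docs-2024-01",
    guardrail_policy="policy-v3",
)
candidate = ReleaseManifest(
    prompt_template="Answer using only the provided context.",
    model_id="example-model-v2",
    retrieval_snapshot="docs-2024-01",
    guardrail_policy="policy-v3",
)

print(changed_fields(baseline, candidate))  # lists which components changed
```

Logging the fingerprint alongside each response would let a team tie a change in behavior back to the component that changed, even when the application code itself is untouched.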
Launching a generative AI system is not the hard part; keeping it working is. Prompts drift, retrieval quality degrades as documents go stale, model providers push updates that change behavior without notice, and costs climb faster than expected as usage grows. Organizations that treat a generative AI launch as a one-time deployment rather than an ongoing service tend to discover these problems through failures: a customer-facing assistant hallucinating outdated information, a sensitive use case suddenly producing unsafe outputs after a model update, or infrastructure costs that weren't budgeted. LLMOps is what keeps each of those discoveries from becoming a crisis.
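Two of these failure modes, stale documents and unbudgeted costs, lend themselves to simple routine checks. The following is a minimal sketch under stated assumptions: the function names, the 180-day freshness threshold, the document names, and the per-token price are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from datetime import date, timedelta


def stale_documents(
    last_updated: dict[str, date], today: date, max_age_days: int = 180
) -> list[str]:
    """Return names of documents not updated within max_age_days (assumed threshold)."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, updated in last_updated.items() if updated < cutoff)


def projected_monthly_cost(
    requests_per_day: float,
    avg_tokens_per_request: float,
    price_per_1k_tokens: float,
    days: int = 30,
) -> float:
    """Rough cost projection: total tokens per period times a flat per-1k-token price."""
    tokens_per_day = requests_per_day * avg_tokens_per_request
    return tokens_per_day / 1000 * price_per_1k_tokens * days


# Made-up inventory of retrieval sources and their last update dates.
docs = {
    "refund-policy": date(2023, 1, 10),
    "pricing": date(2024, 5, 1),
}
print(stale_documents(docs, today=date(2024, 6, 1)))

# Made-up traffic and pricing figures; real provider pricing will differ.
estimate = projected_monthly_cost(
    requests_per_day=2000, avg_tokens_per_request=1500, price_per_1k_tokens=0.002
)
monthly_budget = 150.0
print("over budget" if estimate > monthly_budget else "within budget")
```

Running checks like these on a schedule turns a surprise into a routine alert. A real deployment would likely need separate input and output token prices and a freshness rule per document type.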
Related concepts
Large Language Models
The AI models behind most generative tools today — capable of remarkable language tasks, and unreliable about facts they were never trained on.
Retrieval-Augmented Generation
RAG connects a generative AI model to your organization's documents so it answers from what you actually know, not just what the model was trained on.
Vector Databases
The search infrastructure behind AI that retrieves by meaning — how a system finds the right document even when the user didn't use the exact right words.