Generative AI agents can deliver impressive demos. But the transition from a compelling proof of concept to a reliable production system often fails—not because of scale, but because of trust.
In this session, Stanko Kuveljić presents a candid postmortem of a production-grade scheduling agent that exposed a structural vulnerability: a conflict between “politeness” and policy compliance. In attempting to be helpful, the agent overrode business constraints, revealing how probabilistic models optimize for plausibility rather than strict rule enforcement.
This talk explores what it means to operationalize generative AI responsibly. Rather than increasing autonomy, Stanko presents a practical approach he refers to as the concept of “Bounded Autonomy,” wrapping probabilistic models within deterministic control flows to enforce business invariants.
Attendees will learn:
Why large language model (LLM) agents drift toward conversational compliance over business policy, and how “helpfulness” becomes a production risk.
A practical architecture for embedding probabilistic reasoning inside deterministic control flows with non-bypassable invariant enforcement.
How to treat agent actions as testable behaviors instead of prompt outcomes, implementing deployment gates that reduce manual regression validation effort in production.
A reusable reference architecture and behavioral testing checklist that can be applied to agent-based systems.
This session is designed for leaders responsible for moving AI systems from experimentation to enterprise-grade reliability.”

