Deploying agentic workflows in production is tough: bugs, hallucinations, and unexpected behavior can quickly turn a promising system into a support nightmare. But there’s a pattern we’ve seen across hundreds of companies: teams that embrace test-driven development (TDD) build stronger, more reliable AI systems.

In this talk, Anita from Vellum will break down how TDD can be applied to AI agents, sharing real-world strategies for testing and improving reliability. She’ll also explore different types of agentic behavior, what’s possible to build today, and where the innovation is heading. To bring it all together, Anita will demo her own SEO agent, an agentic workflow that automates a big chunk of her content-writing process.

If you’re building AI-powered workflows and want them to actually work, this session is for you!

Related links:
DeepSeek-R1 training process
Agentic Workflows: Emerging architectures
Four pillars of building AI systems in production
Everything you need to know on Chain of Thought prompting
Reasoning models are indecisive parrots











