In this deep dive, Elire’s Valentin Todorow demonstrates how to manage evaluation sets in Oracle AI Agent Studio. Evaluations provide a controlled way to test agent behavior before deployment by checking response correctness, token usage, and latency. Valentin walks through creating evaluation sets, loading test questions, setting tolerance thresholds, running evaluations multiple times, and comparing results across runs. The demo also shows how tracing exposes each tool call, LLM interaction, and response path, helping teams refine prompts and agent logic with confidence.
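
To make the workflow concrete, here is a minimal conceptual sketch of what an evaluation run does: each test question is sent to the agent, the response is scored against an expected answer, a pass/fail decision is made against a tolerance threshold, and token usage and latency are recorded along the way. Agent Studio performs all of this inside the product; the `call_agent`, `EvalCase`, and `run_evaluation` names, the similarity-based scoring, and the sample questions below are illustrative assumptions, not the product’s actual API or data format.

```python
import time
from dataclasses import dataclass
from difflib import SequenceMatcher


# Hypothetical stand-in for invoking a deployed agent. Agent Studio handles
# this step natively, so this function exists only to make the sketch runnable.
def call_agent(question: str) -> tuple[str, int]:
    """Return (response_text, tokens_used) for a test question."""
    return "Example response", 42


@dataclass
class EvalCase:
    question: str
    expected_answer: str


def run_evaluation(cases: list[EvalCase], tolerance: float = 0.8) -> None:
    """Run each test question, score the response against the expected
    answer, and report correctness, token usage, and latency."""
    passed = 0
    for case in cases:
        start = time.perf_counter()
        response, tokens = call_agent(case.question)
        latency = time.perf_counter() - start

        # Simple text-similarity score; the real product applies its own
        # correctness checks, so this metric and threshold are illustrative.
        score = SequenceMatcher(None, response.lower(),
                                case.expected_answer.lower()).ratio()
        ok = score >= tolerance
        passed += ok
        print(f"{'PASS' if ok else 'FAIL'}  score={score:.2f}  "
              f"tokens={tokens}  latency={latency * 1000:.0f}ms  "
              f"q={case.question!r}")

    print(f"\n{passed}/{len(cases)} cases met the {tolerance:.0%} threshold")


if __name__ == "__main__":
    # Hypothetical evaluation set with expected answers for comparison.
    evaluation_set = [
        EvalCase("What is the vacation accrual rate for new hires?",
                 "New hires accrue 10 days of vacation per year."),
        EvalCase("How do I submit an expense report?",
                 "Submit expense reports through the Expenses work area."),
    ]
    run_evaluation(evaluation_set, tolerance=0.8)
```

Running the same set several times and comparing the per-case scores, token counts, and latencies mirrors the compare-runs step Valentin demonstrates, and the per-question output plays the role that tracing plays in the product: it shows where a response drifted so prompts and agent logic can be adjusted.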