Error handling is an important part of any robust software system, but it is often overlooked when designing operational automation. We frequently assume that our scripts and playbooks will always work, and we are left frustrated when failure creeps in and we aren't prepared to handle it automatically. In this video, we take a look at error handling in Ansible across three dimensions: task-level failures, host-level failures, and play-level failures. We consider different ways to handle errors at each of these levels, and we explore how Ansible behaves when the various error handling parameters are set.

Playbook examples:

VIDEO CHAPTERS
00:00 - Intro
00:14 - Video Overview
00:43 - Environment Setup
00:58 - Failure Considerations
01:28 - Playbook and Inventory Setup
02:50 - Defining Failure for a Task - Task Status
05:15 - Defining Failure for a Task - Timeouts
06:52 - Host-Level Failure - Ignoring Errors
09:26 - Host-Level Failure - Block, Rescue, and Always
14:49 - Host-Level Failure - Handlers and Failure
17:07 - Play-Level Failure - Default Behavior
19:09 - Play-Level Failure - Failing Play on Any Error
20:15 - Play-Level Failure - Failure Percentages
22:46 - Conclusion and Outro











