PyTorch Expert Exchange Webinar: How does batching work on modern GPUs?

With Finbarr Timbers, an AI researcher who writes at Artificial Fintelligence and has worked at a variety of large research labs, including DeepMind and Midjourney.

Batch inference is the most basic optimization you can make to improve GPU utilization. Because it is so common, it is often overlooked and misunderstood. Here, we walk through exactly why batching works and help you develop intuition for what is going on inside your GPU.











