  • 454 views
  • Published 1 year ago by LLVM

2024 LLVM Dev Mtg - Shardy: An MLIR-based Tensor Partitioning System for All Dialects

2024 LLVM Developers' Meeting
Shardy: An MLIR-based Tensor Partitioning System for All Dialects

Speaker:
Slides:

Generative AI models are so large that the tensor programs representing them must be partitioned (chunked) across thousands of hardware accelerators. Within Google DeepMind, these models are partitioned across TPU superclusters of more than 4096 devices. In this presentation, we present a new MLIR tensor propagation system that we have been developing and deploying to train these large AI models. We have defined our own dialect that expresses tensor shardings and compiler transformation rules as MLIR attributes. The system is MLIR dialect agnostic, and it offers improved debugging capabilities and a more configurable propagation algorithm than past systems.

Videos Edited by Bash Films:
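
To make the idea of partitioning a tensor across a device mesh concrete, here is a minimal pure-Python sketch. It is not Shardy's API or its attribute syntax: the function `local_shard_shape`, the dict-based mesh, and the axis names `data` and `model` are all hypothetical, chosen only for illustration. It assumes the common convention that each tensor dimension is split along zero or more named mesh axes, and, as a simplification, that every dimension divides evenly.

```python
from math import prod


def local_shard_shape(global_shape, dim_sharding, mesh):
    """Return the shape of the shard each device holds (hypothetical helper).

    global_shape: tuple of ints, the full tensor shape.
    dim_sharding: one tuple of mesh axis names per tensor dimension;
                  an empty tuple means that dimension is replicated.
    mesh:         dict mapping mesh axis name -> devices along that axis.
    """
    if len(global_shape) != len(dim_sharding):
        raise ValueError("need one sharding entry per tensor dimension")

    # A mesh axis can split at most one dimension of a given tensor.
    used = [axis for axes in dim_sharding for axis in axes]
    if len(used) != len(set(used)):
        raise ValueError("a mesh axis may shard at most one dimension")

    shard = []
    for size, axes in zip(global_shape, dim_sharding):
        parts = prod(mesh[axis] for axis in axes)  # empty product is 1
        if size % parts != 0:
            # Simplifying assumption of this sketch: no padding.
            raise ValueError(f"dimension {size} not divisible by {parts}")
        shard.append(size // parts)
    return tuple(shard)


# Illustrative 64 x 64 mesh (4096 devices in total).
mesh = {"data": 64, "model": 64}

# Split dimension 0 along "data" and dimension 1 along "model".
print(local_shard_shape((8192, 16384), (("data",), ("model",)), mesh))
# (128, 256)
```

Leaving a dimension's entry empty keeps that dimension whole on every device, which is how replication is modeled in this sketch.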