  • 6,842 views
  • Published 2 years ago by Oracle

First Principles: Superclusters with RDMA—Ultra-high Performance at Massive Scale

This First Principles video complements our previous episode, "First Principles: building a high performance network in the public cloud." In this episode, Pradeep Vincent and Jag Brar, Oracle Cloud Infrastructure (OCI) architects, explain how OCI took its application of RDMA networking a step further, building superclusters with the help of NVIDIA's ConnectX RDMA NICs to support tens of thousands of GPUs.

00:00 Introduction to RDMA
02:01 What are superclusters with RDMA?
04:11 RDMA superclusters network fabric
04:54 What are the superclusters latency between GPUs?
07:23 Handling latency sensitive workloads in superclusters
08:39 Minimizing latency for GPU workloads in superclusters

For more architectural breakdowns, catch up on more First Principles videos:
  • First Principles: using redundancy and recovery to achieve high durability in OCI Object Storage
  • First Principles: making Kubernetes serverless with OCI's Virtual Nodes
  • First Principles: building a high performance network in the public cloud
  • First Principles: inside OCI Container Instances