
A RoCE network for distributed AI training at scale

Engineering at Meta

AI networks interconnect tens of thousands of GPUs, forming the foundational infrastructure for training large models with hundreds of billions of parameters, such as LLAMA 3.1. The growing prevalence of AI has introduced a new era of communication demands.
