Watch Meta’s engineers discuss optimizing large-scale networks

Engineering at Meta

Managing network solutions at a growing scale inherently brings challenges around performance, deployment, and operational complexity. Meta’s engineers present key ideas underpinning the FBOSS model that helped them build a stable and scalable network.

Seamless network integration: connecting OpenShift to your data center with Apstra

Juniper

In today’s fast-paced digital world, businesses demand agility and efficiency from their IT infrastructure. The most commonly deployed templates set up a cloud-scale EVPN-VXLAN fabric.

A RoCE network for distributed AI training at scale

Engineering at Meta

AI networks play an important role in interconnecting tens of thousands of GPUs, forming the foundational infrastructure for training and enabling large models with hundreds of billions of parameters, such as LLAMA 3.1. Distributed training, in particular, imposes the most significant strain on data center networking infrastructure.

OCP Summit 2024: The open future of networking hardware for AI

Engineering at Meta

At Open Compute Project Summit (OCP) 2024, we’re sharing details about our next-generation network fabric for our AI training clusters. We’ve expanded our network hardware portfolio and are contributing two new disaggregated network fabrics and a new NIC to OCP. At Meta, we believe that open hardware drives innovation.

Is spine and leaf the best architecture used by major hyperscaler (aws, gcp, azure) to handle huge east-west traffic [closed]

Network Engineering

What is the current best networking architecture used by major hyperscaler data centers? How do they connect 50K or even 100K+ GPUs (as recently done in xAI's data center by Elon Musk and his team) on a single network?

The Network Also Needs to be Observable, Part 2: Network Telemetry Sources

Kentik

In part 1 of this series, I talked about the importance of network observability as our customers define it: using advances in data platforms and machine learning to supply answers to critical questions and enable teams to take critical action to keep application traffic flowing.
