Bandwidth, Networking and Topology - IT Networking Pro Today

A RoCE network for distributed AI training at scale

Engineering at Meta

AUGUST 5, 2024

AI networks play an important role in interconnecting tens of thousands of GPUs, forming the foundational infrastructure for training and enabling large models with hundreds of billions of parameters, such as LLAMA 3.1. Distributed training, in particular, imposes the most significant strain on data center networking infrastructure.

Network

Network Networking Topology Data Centers

Certification Internet service via iPerf3

Network Engineering

JANUARY 8, 2025

Occasionally, customers report issues such as high latency or not achieving their subscribed bandwidth. To address these concerns, we certify the last-mile connection using iPerf3 for traffic and bandwidth analysis. Attached is a topology diagram illustrating the proposed setup.

Internet

Internet Bandwidth Topology Server

Practical Steps for Enhancing Reliability in Cloud Networks - Part I

Kentik

APRIL 4, 2023

When evaluating solutions, whether to internal problems or those of our customers, I like to keep the core metrics fairly simple: will this reduce costs, increase performance, or improve the network’s reliability? It’s often taken for granted by network specialists that there is a trade-off among these three facets.

Cloud

Cloud Network Networking Bandwidth

Using Chakra execution traces for benchmarking and network performance optimization

Engineering at Meta

SEPTEMBER 7, 2023

Meta presents Chakra execution traces, an open graph-based representation of AI/ML workload execution, laying the foundation for benchmarking and network performance optimization. At Meta, our endeavors are not only geared towards pushing the boundaries of AI/ML but also towards optimizing the vast networks that enable these computations.

Network

Network Networking Topology Bandwidth

Sustainable Networks: powering the future, responsibly

Juniper

MARCH 2, 2025

Imagine a data center humming with 100,000 cutting-edge GPUs, the backbone of the AI/ML and Gen AI revolution. Though reliable, this approach is increasingly at odds with today’s sustainability goals.

CCNA, CCNP & Firewall Interview Questions 2025: A Complete Networking Guide

NW Kings

FEBRUARY 28, 2025

As we progress into 2025, the landscape of networking continues to evolve rapidly, with new technologies, protocols, and security measures shaping the way organizations design and manage their networks. The CCNA certification serves as a foundational credential for network engineers.

Firewall

Firewall IP Address Network Networking

Network observability: Hype or reality?

Kentik

AUGUST 30, 2021

If you haven’t yet heard the term “network observability,” you will be hearing it soon. Some say that network observability is just marketing hype from vendors. They say, “networks have always been observable, so there’s nothing new here.” I say network observability is not just vendor hype, and this blog will make the case.

Network

Network Networking DevOps Application

Why latency is the new outage

Kentik

SEPTEMBER 20, 2021

Latency is not a new problem in networking. Not as difficult as time travel, but it’s difficult enough so that for 30+ years IT professionals have tried to skirt the issue by adding more bandwidth between locations or by rolling out faster routers and switches. Fast forward to today, most networks enjoy five nines (99.999%) of uptime.

TCP

TCP Routers Bandwidth IP Address

Securing Your Network Against Attacks: Prevent, Detect, and Mitigate Cyberthreats

Kentik

MARCH 15, 2023

As networks become distributed and virtualized, the points at which they can be made vulnerable, or their threat surface, expands dramatically. This is compounded by recent trends of remote work, where network operators need to wrestle with the fact that employees often access the network via work sites with far less governance.

Network

Network Networking Protocol IP Address

How Meta trains large language models at scale

Engineering at Meta

JUNE 12, 2024

Supporting GenAI at scale has meant rethinking how our software, hardware, and network infrastructure come together. Solving this problem requires a robust and high-speed network infrastructure as well as efficient data transfer protocols and algorithms. requires revisiting trade-offs made for other types of workloads.

Infiniband

Infiniband Data Centers Topology Network

The Network Also Needs to be Observable, Part 3: Network Telemetry Types

Kentik

JANUARY 28, 2021

In part 2 of this series, I talked about the range of network devices and observation points that generate telemetry data. Over time, this range has expanded, and networks are more diverse than ever. Why is my bandwidth bill so high? What users and applications are consuming my network bandwidth?

Network

Network Networking DNS IP Address

Building Meta’s GenAI Infrastructure

Engineering at Meta

MARCH 12, 2024

We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads. At Meta, we handle hundreds of trillions of AI model executions per day. We use this cluster design for Llama 3 training.

Infiniband

Infiniband Data Centers Server Network

Today’s Enterprise WAN Isn’t What It Used To Be

Kentik

MARCH 13, 2023

Whether it’s as simple as ensuring solid connectivity with a SaaS provider or designing a robust, secure, hybrid, and multi-cloud architecture, the enterprise wide area network is all about connecting us to our resources, wherever they are. So what does this mean for today’s enterprise network engineer?

WAN

WAN Wide Area Network Topology Internet

Why is Cisco ACI replacing traditional networks?

The Network DNA

JUNE 10, 2024

Companies are increasingly moving from traditional networks to SDN-based networks. Spine-leaf topologies provide high-bandwidth, low-latency, non-blocking server-to-server connectivity. What is Cisco ACI?

Network

Network Networking Virtual machine IP Address

When Reliability Goes Wrong in Cloud Networks

Kentik

MAY 31, 2023

In the first part of this series, I introduced network reliability as a concept foundational to success for IT and business operations. I also pointed out that because of necessary factors like redundancy, the pursuit of reliability will inevitably mean making compromises that affect a network’s cost or performance.

Network

Network Networking Cloud Routers

SNMP vs. NetFlow

Kentik

JANUARY 29, 2020

There is a lot of confusion regarding the two primary data sets in network management: SNMP and flow. SNMP is used to collect metadata and metrics about a network device. This critical technology is a basic building block of modeling, measuring, and understanding the network. What is SNMP? What is Flow? When is SNMP Used?

Port

Port Network Networking Data Centers

Why is my SaaS application so slow?

Kentik

DECEMBER 13, 2021

Check your local network. If you’re working from home, could it be possible that you’re competing with other local devices for that precious, limited bandwidth? excessive browser extensions, unknown devices on your network, patches that should be applied, etc.). The problem may not be yours to solve.

Application

Application DNS Routers Server

On-Prem Datacenter to AWS Cloud Connectivity

The Network DNA

OCTOBER 8, 2024

Amazon Virtual Private Cloud (Amazon VPC) is a networking solution that allows you to set boundaries around your AWS resources. This isolated area allows you to launch resources in a virtual network that you have defined. Load balancers are an important part of the network.

Cloud

Cloud Gateway Internet VPN

Traditional WAN vs. SD-WAN: Everything You Need to Know

CATO Networks

AUGUST 22, 2023

The WAN needs to offer high-performance and reliable network connectivity to ensure all users and applications can communicate effectively. These WAN routers defined the network boundaries and routed traffic to the appropriate destination. This allows the use of public internet for transport, which reduces networking costs.

WAN

WAN Bandwidth Data Centers Routers

How to Configure Static Routes on Cisco

NW Kings

JANUARY 7, 2025

Static routes are fundamental components in networking that help direct traffic efficiently from one network segment to another. We will also delve into practical examples and scenarios where static routing is preferred, giving you a thorough understanding of this essential networking concept. What is a Static Route?

Routers

Routers IP Address Protocol Topology

The business case for SD-WAN: Because MPLS is Not Fit for the Cloud

CATO Networks

NOVEMBER 15, 2017

That means making sure the wide area network (WAN) that connects branch offices, data centers, cloud services and SaaS applications can handle the connectivity needs of digitally empowered global organizations. Multiprotocol Label Switching (MPLS)-based networks can no longer answer the business needs of a global enterprise.

MPLS

MPLS WAN Cloud Wide Area Network

Built-In Multi-Region Replication with Confluent Platform 5.4-preview

Confluent

SEPTEMBER 16, 2019

However, in order to operate a reliable stretch cluster, datacenters must be relatively close to each other and have a very stable, low-latency, and high-bandwidth connection among the DCs. In a multi-datacenter cluster, network ingress and egress can be very costly, certainly more costly than network traffic within a datacenter.

Bandwidth

Bandwidth WAN Topology Networking

Independent Compliance and Security Assessment – Two Additions to the All-New Cato Management Application

CATO Networks

DECEMBER 14, 2021

New Topology View and a New Backend: The top-level topology view has been redesigned to accommodate deployments of thousands of sites and tens of thousands of users. Figure 1: Cato’s new Management Application lets enterprises continue to manage their network, security, and access infrastructure from a common interface (1).

Application

Application Topology Cloud Bandwidth

IT Managers: Read This Before Leaving Your MPLS Provider

CATO Networks

APRIL 20, 2022

Maybe you’re an IT manager or a network engineer. It’s no secret that MPLS circuits cost a fortune, often 3-4x the price of MPLS alternatives (like SD-WAN), for only a fraction of the bandwidth. It’s about a year before your MPLS contract expires, and you’ve been told to cut costs by your CFO.

MPLS

MPLS WAN SASE Networking

SD-WAN and Cloud Security

CATO Networks

MAY 6, 2018

No longer an emerging technology, cloud computing is now used in everything from applications to storage and networking. SD-WAN is used to connect enterprise networks over large geographic distances more efficiently across any available data transport, such as MPLS, LTE, or broadband.

WAN

WAN Cloud Wide Area Network MPLS

Internet Underlay Visibility is Critical for SD-WAN Overlays

Kentik

MARCH 26, 2018

SD-WANs are the confluence of four technology trends: software-defined networking in wide area networks (WANs), commodity hardware for customer premise equipment, Internet connectivity for business applications, and enterprise IT hybrid multi-cloud migration. The benefits of SD-WANs are compelling both economically and operationally.

WAN

WAN Internet Wide Area Network MPLS

What is SD-WAN?

CATO Networks

AUGUST 7, 2018

This means that corporate networks must change as well. The answer: Software-Defined Wide Area Networks (SD-WANs). SD-WAN brings unparalleled agility and cost savings to networking. SD-WAN does this by separating applications from the underlying network services with a policy-based, virtual overlay. How Does SD-WAN Work?

WAN

WAN MPLS Wide Area Network Data Centers

Engineering dependability and fault tolerance in a distributed system

High Scalability

FEBRUARY 19, 2021

The mechanisms described above — such as the role placement algorithm — can only be effective when all of the participating entities are in agreement on the topology of the cluster together with the status and health of each node. For example, there is a mechanism to manage the relocation of roles when the topology changes.

Engineering

Engineering Topology Protocol Networking

Top Ten Technology Trends for 2024

Vedcraft

JUNE 16, 2024

The Team Topologies approach to organizing software engineering teams has emerged as a great reference for building an effective platform engineering team.

Cloud

Cloud Engineering Application Data Centers

Building and deploying MySQL Raft at Meta

Engineering at Meta

MAY 16, 2023

MySQL Raft replication topologies: A Raft ring would consist of several MySQL instances (four in the diagram) in different regions. Once in a while, automation could also change the regional placement of the MySQL topology. The communication round-trip time (RTT) between these regions would range from 10 to 100 milliseconds.

Engineering

Engineering Protocol Server Topology

Choosing an SD-WAN Architecture for Real-Time Communications

CATO Networks

SEPTEMBER 13, 2017

These solutions rely on the Internet, MPLS, or some other third-party network for connecting locations. These solutions can use MPLS in hybrid configurations, but they also bring a private interconnect, their own network of Points of Presence (POPs), that manages the traffic flow across the middle mile.

WAN

WAN MPLS Media Cloud
