
Taming the tail utilization of ads inference at Meta scale

Engineering at Meta

The inference platforms that serve the sophisticated machine learning models used by Meta’s ads delivery system require significant infrastructure capacity across CPUs, GPUs, storage, networking, and databases. Part of the solution for taming tail utilization was to treat memory bandwidth as a resource during replica placement in Shard Manager.
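As a rough illustration of what treating memory bandwidth as a placement resource could look like, here is a minimal sketch. It is not Shard Manager's API or Meta's algorithm: the `Host` and `Replica` types, the `place` function, and the greedy "most headroom first" heuristic are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical host model: remaining capacity per resource.
    name: str
    cpu_free: float      # free CPU cores
    mem_bw_free: float   # free memory bandwidth, in GB/s


@dataclass
class Replica:
    # Hypothetical replica model: estimated demand per resource.
    name: str
    cpu: float
    mem_bw: float


def place(replicas, hosts):
    """Greedy placement that checks memory bandwidth alongside CPU.

    Returns a dict mapping replica name to host name, or to None when
    no host has enough of both resources left.
    """
    placement = {}
    # Place the most bandwidth-hungry replicas first.
    for r in sorted(replicas, key=lambda r: r.mem_bw, reverse=True):
        candidates = [
            h for h in hosts
            if h.cpu_free >= r.cpu and h.mem_bw_free >= r.mem_bw
        ]
        if not candidates:
            placement[r.name] = None
            continue
        # Prefer the host that keeps the most bandwidth headroom.
        best = max(candidates, key=lambda h: h.mem_bw_free - r.mem_bw)
        best.cpu_free -= r.cpu
        best.mem_bw_free -= r.mem_bw
        placement[r.name] = best.name
    return placement


hosts = [Host("A", cpu_free=8, mem_bw_free=100), Host("B", cpu_free=8, mem_bw_free=50)]
replicas = [Replica("r1", 2, 60), Replica("r2", 2, 60), Replica("r3", 2, 30)]
result = place(replicas, hosts)
print(result)
```

In this toy input, a CPU-only placer would happily put `r1` and `r2` on the same host; with bandwidth as a constraint, the second one is left unplaced instead of saturating the host's memory bus.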
