Taming the tail utilization of ads inference at Meta scale
Engineering at Meta
JULY 10, 2024
The inference platforms that serve the sophisticated machine learning models used by Meta’s ads delivery system require significant infrastructure capacity across CPUs, GPUs, storage, networking, and databases. Taming tail utilization on these platforms involved treating memory bandwidth as a resource during replica placement in Shard Manager.
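To make the idea of treating memory bandwidth as a placement resource concrete, here is a minimal sketch of a greedy placer that tracks memory bandwidth alongside CPU. It is an illustration under stated assumptions, not Shard Manager's actual algorithm or API: the `Host` and `Replica` classes, the `place` function, the units, and the "minimize the worst utilization" heuristic are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical per-replica demand; real systems would measure these.
    name: str
    cpu: float      # CPU cores requested
    mem_bw: float   # memory bandwidth requested, in GB/s


@dataclass
class Host:
    # Hypothetical host model with two tracked resources.
    name: str
    cpu_capacity: float
    mem_bw_capacity: float
    cpu_used: float = 0.0
    mem_bw_used: float = 0.0

    def fits(self, r: Replica) -> bool:
        # A replica fits only if BOTH resources stay within capacity.
        return (self.cpu_used + r.cpu <= self.cpu_capacity
                and self.mem_bw_used + r.mem_bw <= self.mem_bw_capacity)

    def peak_utilization_after(self, r: Replica) -> float:
        # Worst-case utilization across the two resources if r lands here.
        cpu = (self.cpu_used + r.cpu) / self.cpu_capacity
        bw = (self.mem_bw_used + r.mem_bw) / self.mem_bw_capacity
        return max(cpu, bw)


def place(replicas: list[Replica], hosts: list[Host]) -> dict[str, str]:
    """Greedily assign each replica to the host whose worst resource
    utilization would be lowest after placement."""
    placement: dict[str, str] = {}
    # Place the most bandwidth-hungry replicas first.
    for r in sorted(replicas, key=lambda x: x.mem_bw, reverse=True):
        candidates = [h for h in hosts if h.fits(r)]
        if not candidates:
            raise RuntimeError(f"no host can fit replica {r.name}")
        best = min(candidates, key=lambda h: h.peak_utilization_after(r))
        best.cpu_used += r.cpu
        best.mem_bw_used += r.mem_bw
        placement[r.name] = best.name
    return placement
```

A placer that only looked at CPU could co-locate two bandwidth-heavy replicas on one host, since their CPU demand is small. With memory bandwidth tracked as a resource, the `fits` check forces them onto different hosts.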