
Thursday, October 28, 2021
Elasticity across the Facebook Cloud
Facebook operates an internal cloud to support its family of products. Its scaling strategy includes an investment in elastic capacity management. Elasticity means several things. At the physical infrastructure level, we mobilize buffer capacity to mitigate dynamic unavailability and coordinate maintenance. At the workload management level, we AutoScale capacity allocations based on predictive and real-time models of workload demand. At the global resource management level, we time-shift flexible workloads based on time series models of supply availability. And at the global efficiency level, we use spare capacity for opportunistic workloads. During this talk, we will dive into each dimension and show how they fit together. We will also show how elasticity across the stack allows us to meet a high bar on reliability and availability while making efficient use of all deployed capacity.