J Visual Commun Image Represent

Understanding Internet Video sharing site workload: A view from data center design


Internet video sharing sites, led by YouTube, have been gaining popularity at a dazzling speed, which also brings massive workloads to their service data centers. In this paper we analyze Yahoo! Video, the second largest U.S. video sharing site, to understand the nature of this unprecedented workload and its impact on online video data center design. We crawled the Yahoo! Video web site for 46 days. The measurement data allow us to characterize the workload at different time scales (minutes, hours, days, weeks), and we discover interesting statistical properties in both the static and temporal dimensions of the workload, including file duration and popularity distributions, arrival rate dynamics and predictability, and workload stationarity and burstiness. Complementing the measurements with queueing-theoretic techniques, we further extend our understanding with a virtual design of the workload and capacity management components of a data center that serves the measured workload. This design reveals key results regarding the impact of workload arrival distribution, Service Level Agreements (SLAs), and workload scheduling schemes on the design and operation of such large-scale video distribution systems. © 2009 Elsevier Inc. All rights reserved.
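To give a concrete sense of what queueing-theoretic capacity sizing against an SLA can look like, the following is a minimal sketch, not the model used in this paper. It assumes a classical Erlang-B loss system (Poisson arrivals, no waiting room), where the SLA is expressed as a maximum fraction of rejected requests. The function names `erlang_b` and `min_servers` and the numeric inputs are hypothetical and chosen only for illustration.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Blocking probability of an Erlang-B (M/G/c/c) loss system.

    offered_load is the arrival rate times the mean service time, in Erlangs.
    Uses the standard numerically stable recursion
    B(0) = 1, B(n) = a*B(n-1) / (n + a*B(n-1)).
    """
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def min_servers(offered_load: float, max_blocking: float) -> int:
    """Smallest number of servers whose blocking probability meets the target.

    max_blocking plays the role of an assumed SLA: the largest acceptable
    fraction of requests that may be rejected.
    """
    if not 0.0 < max_blocking < 1.0:
        raise ValueError("max_blocking must be strictly between 0 and 1")
    servers = 0
    while erlang_b(servers, offered_load) > max_blocking:
        servers += 1
    return servers


if __name__ == "__main__":
    # Hypothetical numbers: 50 Erlangs of offered load, at most 1% rejections.
    print(min_servers(offered_load=50.0, max_blocking=0.01))
```

Under these assumptions, a tighter SLA or a higher offered load raises the required server count, which is the kind of trade-off a capacity management component has to evaluate.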