Publication
MASCOTS 2019
Conference paper

ExaPlan Archive: Data placement and provisioning for large storage systems with archival tiers

Abstract

Many important big data use cases do not require data to be instantly available. Examples include video recordings in the TV and film industry, surveillance videos, and data from scientific experiments. Archiving such data to high-latency storage media, such as tape and optical disk libraries, results in significant cost savings. In this context, data is accessed by first staging it to low-latency media. However, archiving and staging operations incur additional device and bandwidth costs for both the active and archival tiers, and might degrade user data access performance. For instance, in terms of cost and performance, it is often suboptimal to archive all the data. This paper presents ExaPlan Archive, a scheme to determine the data placement and the number of devices required in each tier of a multitiered storage system composed of archival and active tiers that minimize the latency of the active tiers under budget and staging-time constraints. The efficiency of the proposed optimized archiving scheme is compared with that of an existing scheme that optimizes multitier storage with only direct-access tiers. The two schemes are evaluated using a staging workload from the LOFAR radio telescope's long-term archive of astronomical observation data.
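
To make the shape of the optimization problem in the abstract concrete, the following is a minimal toy sketch, not the ExaPlan Archive method itself. It brute-forces how many of the coldest datasets to archive and how many devices to buy per tier, minimizing active-tier latency under a budget and a staging-time limit. Everything in it is an assumption made for illustration: the dataset sizes and read rates, the device parameters, the M/M/1-style delay model, the rule that a staged read loads the active tier twice, and all names (`DATASETS`, `evaluate`, `best_plan`, and so on).

```python
from itertools import product

# All values are made-up illustrative numbers, not figures from the paper.
DATASETS = [  # (size_tb, reads_per_hour), ordered hottest first
    (10, 50.0), (20, 20.0), (40, 5.0), (80, 1.0), (160, 0.1),
]
DISK = {"capacity_tb": 50, "cost": 100, "service_h": 0.01}  # active tier device
TAPE = {"capacity_tb": 200, "cost": 30, "service_h": 0.5}   # archival tier device
BUDGET = 500
MAX_STAGING_H = 2.0
STAGING_OVERHEAD = 2.0  # assumption: a staged read hits the active tier twice


def queue_delay(rate, service_h, n_devices):
    """Mean response time with load spread evenly over M/M/1-like devices.

    Returns None when the devices cannot sustain the load.
    """
    if n_devices == 0:
        return 0.0 if rate == 0 else None
    utilization = rate * service_h / n_devices
    return service_h / (1 - utilization) if utilization < 1 else None


def evaluate(n_archived, n_disk, n_tape):
    """Return active-tier latency in hours, or None if the plan is infeasible."""
    split = len(DATASETS) - n_archived
    active, archived = DATASETS[:split], DATASETS[split:]
    # Capacity constraints for each tier.
    if sum(size for size, _ in active) > n_disk * DISK["capacity_tb"]:
        return None
    if sum(size for size, _ in archived) > n_tape * TAPE["capacity_tb"]:
        return None
    # Budget constraint.
    if n_disk * DISK["cost"] + n_tape * TAPE["cost"] > BUDGET:
        return None
    # Staging-time constraint on the archival tier.
    archived_rate = sum(rate for _, rate in archived)
    staging = queue_delay(archived_rate, TAPE["service_h"], n_tape)
    if staging is None or staging > MAX_STAGING_H:
        return None
    # Staged reads add extra load to the active tier.
    active_rate = sum(rate for _, rate in active) + STAGING_OVERHEAD * archived_rate
    return queue_delay(active_rate, DISK["service_h"], n_disk)


def best_plan():
    """Exhaustively search placements and device counts; keep the lowest latency."""
    best = None
    for n_archived, n_disk, n_tape in product(
        range(len(DATASETS) + 1),
        range(BUDGET // DISK["cost"] + 1),
        range(BUDGET // TAPE["cost"] + 1),
    ):
        latency = evaluate(n_archived, n_disk, n_tape)
        if latency is not None and (best is None or latency < best[0]):
            best = (latency, n_archived, n_disk, n_tape)
    return best


if __name__ == "__main__":
    print(best_plan())  # (latency_h, n_archived, n_disk, n_tape)
```

In this toy model, archiving cold data frees budget for more active-tier devices, while each staged read adds load to the active tier, so neither archiving everything nor archiving nothing is automatically the best choice. Exhaustive search is only workable at this tiny scale; it stands in for whatever optimization technique the paper actually uses.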

Date

01 Oct 2019

Authors
