Elastic Indexes: Dynamic Space vs. Query Efficiency Tuning for In-Memory Database Indexing
Abstract
In-memory database management systems (DBMSs) are important elements of data pipelines, wherein they store data produced over a sliding (or periodic) time window by real-time data sources for analytics and monitoring purposes. These pipelines produce vastly different amounts of data across time windows, dictating that some memory over-provisioning is required for the DBMS. A major source of DBMS memory overhead—which stresses memory provisioning demands—is storage of indexed keys by the DBMS index data structures. However, such internal-key storage is crucial for fast scans, which are important in these workloads. This paper proposes an elastic index design framework, which transforms an index with internal-key storage into an elastic index that adjusts its memory overhead to the data size. Under typical conditions, the elastic index offers the same performance as the original index. moshikthe DBMS does not exceed its memory budget. Shrinking is performed by dynamically converting index nodes to a compact representation with indirect key storage. We demonstrate our design with an elastic B+-tree, whose compact nodes use a novel blind trie representation. We show that the elastic B+-tree can store 2×–5× the number of keys (depending on key size) than a B+-tree with only moderate (< 25%) degradation of index throughput. We also integrate the elastic B+-tree into the MCAS in-memory storage system. Compared to MCAS’s default B+- tree index, the elastic B+-tree can consume as much as 3× less space with 1.8%–2.6% throughput degradation on a workload modeling analysis and monitoring of cloud log data.