MDDM: A method to improve multiple dimension data management performance in HBase

Zhuang Wei; J.M. Qu; Liang Liu; Chaoqiang Zhu; Wenjun Yin

doi:10.1109/HPCC-CSS-ICESS.2015.102

HPCC-ICESS-CSS 2015

Conference paper

23 Nov 2015

Abstract

Big data is the term applied to a new generation of software, applications, and storage systems designed to derive business value. The big data phenomenon requires a revolutionary approach to the technologies deployed so that results are delivered in time to create value. However, state-of-the-art techniques for querying multidimensional big data run into problems as data volumes grow and user access patterns change. In this paper, we propose an optimized storage model and index scheme that support efficient queries over large multidimensional data sets under multiple query patterns. We implement our scheme on HBase by introducing four components in its master node. Using pollutant concentration data from the 'Green Horizon' project as test data, we conduct numerous experiments. The results show that the proposed storage model and index provide a clear performance improvement across multiple query patterns over large multidimensional data and scale well as the data grow.
