Publication
ICPE 2014
Conference paper

Run-time performance optimization of a BigData query language

View publication

Abstract

JAQL is a query language for large-scale data that connects BigData analytics and MapReduce framework together. Also an IBM product, JAQL's performance is critical for IBM In-foSphere BigInsights, a BigData analytics platform. In this paper, we report our work on improving JAQL performance from multiple perspectives. We explore the parallelism of JAQL, profile JAQL for performance analysis, identify I/O as the dominant performance bottleneck, and improve JAQL performance with an emphasis on reducing I/O data size and increasing (de)serialization efficiency. With TPCH benchmark on a simple Hadoop cluster, we report up to 2x performance improvements in JAQL with our optimization fixes. Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Date

Publication

ICPE 2014

Authors

Share