Trimming user-perceived (tail) latency is challenging for key-value data stores, because requests in observed workloads contain a highly skewed number of key-value pairs, which are commonly retrieved via a multiget operation, i.e., all keys at the same time. In this paper, we present Chisel, a novel client-side solution that efficiently reshapes the query size seen by the data store: it adaptively splits big requests into chunks to reap the benefits of parallelism, and merges small requests into a single query to amortize per-request latency overheads. We derive a novel layered queueing model that can quickly, albeit approximately, steer the decisions of Chisel. We extensively evaluate Chisel on memcached clusters hosted on a testbed, across a large number of scenarios with different workloads and system configurations. Our evaluation results show that Chisel can turn the inherently high variability of requests into a judicious operational region, yielding significant gains in the mean and 95th percentile of user-perceived latency compared to the state-of-the-art query processing policy.
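To make the split-and-merge idea concrete, the following is a minimal sketch of client-side query reshaping, not Chisel's actual implementation. The function name `reshape_requests` and the parameters `split_above`, `merge_below`, and `chunk_size` are hypothetical and fixed by the caller here, whereas Chisel chooses its decisions adaptively, guided by the layered queueing model. The sketch also omits mapping the returned values back to the original requests.

```python
from typing import List


def reshape_requests(
    requests: List[List[str]],
    split_above: int,
    merge_below: int,
    chunk_size: int,
) -> List[List[str]]:
    """Turn a sequence of multiget requests (lists of keys) into the
    queries actually sent to the data store.

    All thresholds are hypothetical, static stand-ins for decisions
    that an adaptive policy would make at run time.
    """
    queries: List[List[str]] = []
    pending: List[str] = []  # keys of small requests waiting to be merged

    for keys in requests:
        if len(keys) > split_above:
            # Big request: split into chunks that can be served in parallel.
            for start in range(0, len(keys), chunk_size):
                queries.append(keys[start:start + chunk_size])
        elif len(keys) < merge_below:
            # Small request: batch it with other small requests so the
            # per-request overhead is paid once for the merged query.
            pending.extend(keys)
            if len(pending) >= merge_below:
                queries.append(pending)
                pending = []
        else:
            # Medium request: send unchanged.
            queries.append(keys)

    if pending:
        # Flush small requests that never reached the merge threshold.
        queries.append(pending)
    return queries


example = [["a", "b", "c", "d", "e", "f", "g"], ["h"], ["i"], ["j", "k", "l"]]
reshaped = reshape_requests(example, split_above=4, merge_below=2, chunk_size=3)
print(reshaped)
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['g'], ['h', 'i'], ['j', 'k', 'l']]
```

In this example the seven-key request is split into three chunks, the two single-key requests are merged into one query, and the three-key request passes through untouched. A real client would additionally bound how long a small request may wait in the merge buffer, since waiting itself adds latency.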