MapReduce plays an critical role in finding insights in Big Data. The performance optimization of MapReduce programs is challenging because it requires a comprehensive understanding of the whole system including both hardware layers (processors, storages, networks and etc), and software stacks (operating systems, JVM, runtime, applications and etc). However, most of the existing performance tuning and optimization are based on empirical and heuristic attempts. It remains a blank on how to build a systematical framework which breaks the boundary of multiple layers for performance optimization. In this paper, we propose a performance evaluation framework by correlating performance metrics from different layers, which provides insights to efficiently pinpoint the performance issue. This framework is composed of a series of predefined patterns. Each pattern indicates one or more potential issues. The behavior of a MapReduce program is mapped to the corresponding resource utilization. The framework provides a holistic approach which allows users at different levels of experience to conduct MapReduce program performance optimization. We use Terasort benchmark running on a 10-node Power7R2 cluster as a real case to show how this framework improves the performance. By this framework, we finally get the Terasort result improved from 47 mins to less than 8 mins. In addition to the best practice on performance tuning, several key findings are summarized as valuable workload analysis for JVM, MapReduce runtime and application design. © 2013 IEEE.