Software-hardware managed last-level cache allocation scheme for large-scale NVRAM-based multicores executing parallel data analytics applications
Developments in machine learning and graph analytics have made these fields pervasive across a wide range of applications. Non-volatile memory (NVRAM) offers higher capacity and retains data across power loss, and is therefore expected to be adopted for such applications. However, the asymmetric access latencies of NVRAM, in particular its slow writes, greatly degrade performance. This paper focuses on reducing the effect of memory access latency on emerging machine learning and graph workloads. The proposed mechanism uses software tagging of application data structures to control on-chip cache evictions based on data type and reuse pattern in an NVRAM-based multicore system. Learner models are developed that predict cache allocations for a variety of machine learning and graph applications. The optimized learning model yields an average performance benefit of 21% over a system that does not optimize for the write-latency challenges of NVRAM.