FReaC cache: Folded-logic reconfigurable computing in the last level cache

Ashutosh Dhar; Xiaohao Wang; Hubertus Franke; Jinjun Xiong; Jian Huang; Wen-Mei Hwu; Nam Sung Kim; Deming Chen

doi:10.1109/MICRO50266.2020.00021

MICRO 2020

Conference paper

01 Oct 2020

Abstract

The need for higher energy efficiency has resulted in the proliferation of accelerators across platforms, with custom and reconfigurable accelerators adopted in both edge devices and cloud servers. However, existing solutions fall short of providing accelerators with low-latency, high-bandwidth access to the working set, and they suffer from the high latency and energy cost of data transfers. Such costs can severely limit the smallest granularity of the tasks that can be accelerated, and thus the applicability of the accelerators. In this work, we present FReaC Cache, a novel architecture that natively supports reconfigurable computing in the last level cache (LLC), thereby giving energy-efficient accelerators low-latency, high-bandwidth access to the working set. By leveraging the cache's existing dense memory arrays, buses, and logic folding, we construct a reconfigurable fabric in the LLC with minimal changes to the system, processor, cache, and memory architecture. FReaC Cache is a low-latency, low-cost, and low-power alternative to off-die/off-chip accelerators, and a flexible, low-cost alternative to fixed-function accelerators. We demonstrate an average speedup of 3X and Perf/W improvements of 6.1X over an edge-class multi-core CPU, with an area overhead of 3.5% to 15.3% per cache slice.
