The continuous growth in 3D-NAND flash storage density has primarily been enabled by 3D stacking and by increasing the number of bits stored per memory cell. Unfortunately, these otherwise desirable design choices adversely affect the reliability and latency characteristics of flash devices. In particular, storing more bits per cell requires applying additional voltage thresholds during each read operation, which increases read latency. While most NAND flash challenges can be mitigated through appropriate background processing, flash read latency cannot be hidden and remains the biggest challenge, especially for the newest flash generations that store four bits per cell. In this paper, we introduce read heat separation (RHS), a new heat-aware data-placement technique that exploits the skew present in real-world workloads to place frequently read user data on low-latency flash pages. Although conceptually simple, such a technique is difficult to integrate into a flash controller, as it introduces significant complexity, requires more metadata, and is further constrained by other flash-specific peculiarities. To overcome these challenges, we propose a novel flash controller architecture that supports read heat-aware data placement. We first discuss the trade-offs that such a new design entails and analyze the key aspects that influence the efficiency of RHS. Through both extensive simulations and an implementation in a commercial enterprise-grade solid-state drive controller, we show that our architecture can significantly reduce the average read latency. For certain workloads, it can reverse the system-level read latency trend observed with recent multi-bit flash generations and hence outperform SSDs that use previous, faster flash generations.
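The placement idea described above can be illustrated with a minimal sketch. It is not the controller architecture proposed in the paper. It only shows why directing frequently read data to page types that need fewer read voltage thresholds lowers the average read cost under a skewed workload. The per-page-type threshold counts (1, 2, 4, 8), the function names, and the use of the threshold count as a proxy for read latency are all assumptions made for illustration. Real devices may split the 15 thresholds of a four-bit cell differently.

```python
from collections import Counter

# Assumed (hypothetical) read-threshold count per page type of a 4-bit cell.
# The counts sum to 2**4 - 1 = 15; the actual split depends on the encoding.
PAGE_TYPE_THRESHOLDS = {"lower": 1, "upper": 2, "extra": 4, "top": 8}


def place_by_read_heat(read_trace, slots_per_type):
    """Assign logical pages to page types, hottest pages to the cheapest type.

    read_trace: iterable of logical page numbers, one entry per read.
    slots_per_type: how many logical pages each page type can hold.
    """
    heat = Counter(read_trace)  # read count per logical page
    # Hottest first; ties broken by page number to keep the result deterministic.
    ranked = sorted(heat, key=lambda lpn: (-heat[lpn], lpn))
    # Page types ordered from fewest to most thresholds.
    types = sorted(PAGE_TYPE_THRESHOLDS, key=PAGE_TYPE_THRESHOLDS.get)
    if len(ranked) > len(types) * slots_per_type:
        raise ValueError("not enough slots for all logical pages")
    placement = {}
    for index, lpn in enumerate(ranked):
        placement[lpn] = types[index // slots_per_type]
    return placement


def mean_thresholds_per_read(read_trace, placement):
    """Average number of thresholds applied per read (a rough latency proxy)."""
    trace = list(read_trace)
    total = sum(PAGE_TYPE_THRESHOLDS[placement[lpn]] for lpn in trace)
    return total / len(trace)


if __name__ == "__main__":
    # Skewed toy workload: page 0 is read most often, page 3 least often.
    trace = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 1
    placement = place_by_read_heat(trace, slots_per_type=1)
    print(placement)
    print(mean_thresholds_per_read(trace, placement))
```

In this toy model, a heat-oblivious placement that spreads reads evenly over the four page types would average 15/4 thresholds per read, so any skew that the heat-aware placement captures pulls the average below that figure. The sketch ignores the metadata, garbage-collection, and write-ordering constraints that the abstract identifies as the hard part of integrating such a scheme into a controller.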