In computer architecture, caches have primarily been viewed as a means to hide memory latency from the CPU. Cache policieshave focused on anticipating the CPU's data needs, and are mostly oblivious to the main memory. In this paper, we demonstrate that the era of many-core architectures has created new main memory bottlenecks, and mandates a new approach: coordination of cache policy with main memory characteristics. Using the cache for memory optimization purposes, we propose a Virtual Write Queue which dramatically expands the memory controller's visibility of processor behavior, at low implementation overhead. Through memory-centric modification of existing policies, such as scheduled writebacks, this paper demonstrates that performance limiting effects of highly-threaded architectures can be overcome. We show that through awareness of the physical main memory layout and by focusing on writes, both read and write average latency can be shortened, memory power reduced, and overall system performance improved. Through full-system cycle-accurate simulations of SPEC cpu2006, we demonstrate that the proposed Virtual Write Queue achieves an average 10.9% system-level throughput improvement on memory-intensive workloads, along with an overall reduction of 8.7% in memory power across the whole suite. Copyright 2010 ACM.