We consider the problem of proactive retention-aware caching in a heterogeneous wireless edge network in which mobile users access content from a server and are each associated with one or more edge caches. Our goal is to design a caching policy that minimizes the sum of content storage costs and server access costs over two design variables: the retention time of each cached content item and the probability that a user routes content requests to each of its associated caches. We develop a model that captures cache storage costs together with several capabilities of modern wireless technologies, such as server multicast/unicast transmissions, device multi-path routing, and cache access constraints. We formulate the problem of Proactive Retention Routing Optimization as a non-convex, non-linear mixed-integer program. We prove that it is NP-hard under both the multicast and unicast modes, even when the caches have a large capacity and storage costs are linear, and develop greedy algorithms with provable performance bounds for the case of uncapacitated caches. Finally, we propose heuristics with low computational complexity for the capacitated-cache case as well as for the case of convex storage costs. Systematic evaluations based on real-world data demonstrate the effectiveness of our approach compared with existing caching schemes.