Content cloud systems, e.g. CloudFront  and CloudBurst , in which content items are retrieved by end-users from the edge nodes of the cloud, are becoming increasingly popular. The retrieval latency in content clouds depends on content availability in the edge nodes, which in turn depends on the caching policy at the edge nodes. In case of local content unavailability (i.e., a cache miss), edge nodes resort to source selection strategies to retrieve the content items either vertically from the central server, or horizontally from other edge nodes. Consequently, managing the latency in content clouds needs to take into account several interrelated issues: asymmetric bandwidth and caching capacity for both source types as well as edge node heterogeneity in terms of caching policies and source selection strategies applied. In this paper, we study the problem of minimizing the retrieval latency considering both caching and retrieval capacity of the edge nodes and server simultaneously. We derive analytical models to evaluate the content retrieval latency under two source selection strategies, i.e., Random and Shortest-Queue, and three caching policies: selfish, collective, and a novel caching policy that we call the adaptive caching policy. Our analysis allows the quantification of the interrelated performance impacts of caching and retrieval capacity and the exploration of the corresponding design space. In particular, we show that the adaptive caching policy combined with Shortest-Queue selection scales well with various network configurations and adapts to the load changes in our simulation and analytical results. © 2011 IEEE.