In-network content caching has been deployed in both the Internet and cellular networks to reduce content-access delay. We investigate the problem of developing optimal joint routing and caching policies in a network supporting in-network caching with the goal of minimizing expected content-access delay. Here, needed content can either be accessed directly from a back-end server (where content resides permanently) or be obtained from one of multiple in-network caches. To access content, users must thus decide whether to route their requests to a cache or to the back-end server. In addition, caches must decide which content to cache. We investigate two variants of the problem, where the paths to the back-end server can be considered as either congestion-sensitive or congestion-insensitive, reflecting whether or not the delay experienced by a request sent to the back-end server depends on the request load, respectively. We show that the problem of optimal joint caching and routing is NP-complete in both cases. We prove that under the congestioninsensitive delay model, the problem can be solved optimally in polynomial time if each piece of content is requested by only one user, or when there are at most two caches in the network. We also identify the structural property of the user-cache graph that makes the problem NP-complete. For the congestion-sensitive delay model, we prove that the problem remains NP-complete even if there is only one cache in the network and each content is requested by only one user. We show that approximate solutions can be found for both cases within a (1 - 1/e) factor from the optimal, and demonstrate a greedy solution that is numerically shown to be within 1% of optimal for small problem sizes. Through trace-driven simulations, we evaluate the performance of our greedy solutions to joint caching and routing, which show up to 50% reduction in average delay over the solution of optimized routing to least recently used caches.