We study the problem of operating a quantum switch with memory constraints. In particular, the switch has to allocate quantum memories to clients to generate link-level entanglements (LLEs), and then use these to serve end-to-end entanglements requests. The paper's main contributions are (i) to characterize the switch's capacity region, and (ii) to propose a memory allocation policy (MEW) that is throughput optimal. The worst-case time complexity of MEW is exponential on the system parameters. However, when the requests are bipartite and the LLE attempts are always successful, we propose a variant of MEW (MEW2) that has polynomial time complexity. We evaluate the proposed policies numerically and illustrate their performance depending on the requests arrivals characteristics and the time available to obtain a memory allocation.