Applications of cloud computing are increasing as companies shift from on-premises IT environments to public, private, or hybrid clouds. Consequently, cloud providers use capacity planning to maintain the computing resources (instances) required to meet dynamic demand (queries). This involves a trade-off between deploying too many costly instances and deploying too few, which incurs penalties for failing to process queries on time. An instance has multiple resource dimensions, and executing a query consumes multiple dimensions of an instance’s capacity. This detailed multi-dimensional management of cloud computing resource capacity is known as elasticity management and is an important issue faced by all cloud providers. Determining the optimal number of instances needed over a given planning horizon is challenging because of the combinatorial nature of the underlying optimization problem. We develop an optimization model and related algorithms to capture the trade-off between resource cost and delayed-execution penalty in software-as-a-service applications from the cloud provider’s perspective. We develop an exact approach to solve small- to medium-sized applications and heuristics to solve large applications. We then evaluate their performance via extensive computational analyses using real-world data and comparisons with current cloud provider approaches. We also develop a stochastic framework and methodology to deal with demand uncertainty and, using two different randomly generated data sets (representing problem instances in practice), demonstrate that robust solutions can be obtained.
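The trade-off between instance cost and delay penalty can be illustrated with a toy single-period sketch. This is not the optimization model or any of the algorithms developed in the paper: the two-dimensional demand tuples, the first-fit placement rule, the per-query penalty, and the names `unplaced_queries` and `best_instance_count` are all assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple


def unplaced_queries(
    queries: List[Sequence[float]], n_instances: int, capacity: Sequence[float]
) -> int:
    """Count queries that first-fit placement cannot fit on n identical instances.

    Each query and each instance is a vector over the same resource
    dimensions (hypothetically: CPU and memory).
    """
    remaining = [list(capacity) for _ in range(n_instances)]
    unplaced = 0
    for demand in queries:
        for inst in remaining:
            # A query fits only if every resource dimension has enough room.
            if all(r >= d for r, d in zip(inst, demand)):
                for k, d in enumerate(demand):
                    inst[k] -= d
                break
        else:
            unplaced += 1  # no instance could host the query: it is delayed
    return unplaced


def best_instance_count(
    queries: List[Sequence[float]],
    capacity: Sequence[float],
    instance_cost: float,
    delay_penalty: float,
) -> Tuple[int, float]:
    """Enumerate instance counts and return (count, total cost) with lowest cost."""
    best = None
    for n in range(len(queries) + 1):
        delayed = unplaced_queries(queries, n, capacity)
        total = n * instance_cost + delay_penalty * delayed
        if best is None or total < best[1]:
            best = (n, total)
    return best


# Hypothetical data: (cpu, memory) demand per query, instance capacity (4, 4).
queries = [(2, 1), (2, 1), (1, 3), (1, 1), (4, 4)]
print(best_instance_count(queries, (4, 4), instance_cost=10, delay_penalty=6))
# -> (2, 26): two instances, with one query delayed
```

With these made-up numbers, a third instance would cost more than the penalty for the one query it would serve, so the cheapest plan leaves that query delayed. Enumeration with a greedy placement rule is feasible only at this toy scale; over a multi-period horizon with many queries the problem becomes combinatorial, which is what motivates exact and heuristic methods.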