Web applications have stringent performance requirements that can be violated during periods of high demand when resources are insufficient. Infrastructure as a Service (IaaS) providers have made it easy to provision and terminate compute resources on demand. However, a control mechanism is still needed to provision resources and launch additional instances of a web application in response to excess load. In this paper, we propose and implement a reinforcement learning-based controller that responds to volatile and complex arrival patterns using a set of simple states and actions. The controller is implemented within a distributed architecture that not only scales up quickly to meet rising demand but also scales down by shutting down excess servers to reduce ongoing costs. We evaluate this decentralized control mechanism using workloads from real-world use cases and demonstrate that it reduces SLA violations while minimizing the cost of provisioning infrastructure.
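To make the idea of a controller built on simple states and actions concrete, the following is a minimal tabular Q-learning sketch. It is an illustration under stated assumptions, not the controller proposed in this paper: the state (a coarse utilization bucket paired with the server count), the three actions (remove one server, hold, add one server), the reward (a fixed penalty for an assumed SLA violation plus a per-server cost), and all constants and names are hypothetical.

```python
import random
from collections import defaultdict

# All names and constants below are illustrative assumptions.
ACTIONS = (-1, 0, 1)          # remove one server, hold, add one server
MAX_SERVERS = 10
CAPACITY_PER_SERVER = 100.0   # assumed requests/s one server can handle
SLA_PENALTY = 10.0            # assumed cost of one interval in violation
SERVER_COST = 1.0             # assumed cost of one server per interval


def load_bucket(arrival_rate, servers):
    """Discretize utilization into three levels: low, normal, overloaded."""
    util = arrival_rate / (servers * CAPACITY_PER_SERVER)
    if util < 0.5:
        return 0
    if util <= 1.0:
        return 1
    return 2


def reward(arrival_rate, servers):
    """Negative cost: SLA penalty when demand exceeds capacity, plus server cost."""
    violated = arrival_rate > servers * CAPACITY_PER_SERVER
    return -(SLA_PENALTY if violated else 0.0) - SERVER_COST * servers


def apply_action(servers, action):
    """Apply a scaling action, keeping the server count within [1, MAX_SERVERS]."""
    return min(MAX_SERVERS, max(1, servers + action))


class QController:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)   # Q-values keyed by (state, action)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, r, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = r + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def train(arrivals, episodes=200, seed=0):
    """Replay an arrival-rate trace repeatedly and learn a scaling policy."""
    ctrl = QController(seed=seed)
    for _ in range(episodes):
        servers = 1
        state = (load_bucket(arrivals[0], servers), servers)
        for t in range(len(arrivals) - 1):
            action = ctrl.choose(state)
            servers = apply_action(servers, action)
            r = reward(arrivals[t + 1], servers)
            next_state = (load_bucket(arrivals[t + 1], servers), servers)
            ctrl.update(state, action, r, next_state)
            state = next_state
    return ctrl
```

In this sketch, the per-server cost term is what would push a learned policy to scale down once load drops, while the SLA penalty pushes it to scale up under rising demand. The relative size of the two constants sets the trade-off and would need tuning against a real workload.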