The rapid expansion of cloud offerings poses fundamental challenges for workload management in large-scale server farms. To achieve satisfactory Quality of Service (QoS) and reduce operating cost, we present a fully distributed workload management system for large-scale server environments such as clouds. Unlike existing centralized control approaches, the workload management logic is distributed hierarchically across the back-end servers and front-end proxies. The control solution is designed to provide both overload protection and resource efficiency for the back-end servers while achieving service differentiation based on Service Level Agreements (SLAs). The proposed system works directly with legacy software stacks because its implementation requires no changes to the target operating system, application servers, or web applications. Our evaluation shows that it achieves both overload protection and service classification under dynamic, heavy workloads. It also demonstrates negligible management overhead, satisfactory fault tolerance, and fast convergence. © 2012 Elsevier B.V. All rights reserved.