Federated learning has arisen as a mechanism to allow multiple participants to collaboratively train a model without sharing their data. In these settings, participants (workers) may not trust each other fully; for instance, a set of competitors may collaboratively train a machine learning model to detect fraud. The workers provide local gradients that a central server uses to update a global model. This global model can be corrupted when Byzantine workers send malicious gradients, which necessitates robust methods for aggregating gradients that mitigate the adverse effects of Byzantine inputs. Existing robust aggregation algorithms are often computationally expensive and only effective under strict assumptions. In this paper, we introduce LayerwisE Gradient AggregatTiOn (LEGATO), an aggregation algorithm that is, by contrast, scalable and generalizable. Informed by a study of layer-specific responses of gradients to Byzantine attacks, LEGATO employs a dynamic gradient reweighing scheme that is novel in its treatment of gradients based on layer-specific robustness. We show that LEGATO is more computationally efficient than multiple state-of-the-art techniques and more generally robust across a variety of attack settings in practice. We also demonstrate LEGATO's benefits for gradient descent convergence in the absence of an attack.