Random data augmentation is a critical technique for avoiding overfitting when training deep models. Yet in most settings data augmentation and network training are two isolated processes, which leads to suboptimal training. Why not jointly optimize the two? We propose adversarial data augmentation to address this limitation. The key idea is to design a generator (e.g., an augmentation network) that competes against a discriminator (e.g., a target network) by generating hard examples online. The generator explores the weaknesses of the discriminator, while the discriminator learns from the hard augmentations to achieve better performance. We also propose a reward/penalty strategy for efficient joint training. We evaluate the method on human pose estimation and carry out comprehensive ablation studies to validate it. The results show that our method can effectively improve state-of-the-art models without requiring additional data.
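
To make the generator/discriminator interplay concrete, the following is a minimal toy sketch of the idea and not the implementation used in this work. Everything in it is an illustrative assumption: the "generator" is a probability table over a handful of rotation angles, the "target network" is reduced to a per-angle skill score, `target_loss` is a hypothetical stand-in for the real training loss, and the reward/penalty rule (scale a probability up or down depending on whether the adversarial sample was harder than a random one) is one plausible reading of the strategy described above.

```python
import random


def target_loss(skill, angle):
    """Hypothetical loss of the target model on a sample rotated by `angle`.

    Larger rotations are harder; the loss shrinks as the model gains
    skill on that particular angle.
    """
    return abs(angle) / (1.0 + skill[angle])


def train(steps=200, seed=0):
    rng = random.Random(seed)
    angles = [0, 15, 30, 45, 60]

    # Generator: a categorical policy over augmentations (assumed form).
    probs = {a: 1.0 / len(angles) for a in angles}
    # Target model: one scalar "skill" per augmentation (assumed form).
    skill = {a: 0.0 for a in angles}

    for _ in range(steps):
        # Adversarial augmentation sampled from the generator's policy.
        adv = rng.choices(angles, weights=[probs[a] for a in angles])[0]
        # Random augmentation used as the reference for reward/penalty.
        rnd = rng.choice(angles)

        loss_adv = target_loss(skill, adv)
        loss_rnd = target_loss(skill, rnd)

        # Reward the generator if its sample was harder than the random
        # one, penalize it if it was easier, leave it unchanged on a tie.
        if loss_adv > loss_rnd:
            probs[adv] *= 1.1
        elif loss_adv < loss_rnd:
            probs[adv] *= 0.9

        # Renormalize so the policy stays a probability distribution.
        total = sum(probs.values())
        for a in angles:
            probs[a] /= total

        # The target model learns from the adversarial (hard) sample.
        skill[adv] += 0.1

    return probs, skill


if __name__ == "__main__":
    probs, skill = train()
    print("generator policy:", probs)
    print("target skill:", skill)
```

In this sketch the two updates happen in the same loop, so the generator keeps shifting probability toward whatever the target model currently handles worst, while the target model improves on exactly those augmentations.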