Compared to the original implementation of randomized smoothing, our loss objective only requires that we query the teacher's outputs on Gaussian-perturbed training samples. We found that using the robust teacher's outputs to guide student training was sufficient to produce an accurate and robust student model. Querying the teacher during training is fast, as it involves only a single forward pass through the teacher model. As such, our transfer learning framework is as fast as a non-robust training pipeline. Once a strong, robust model has been trained, our transfer learning framework allows for repeated generation of secure models without any additional robust-training overhead.
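The noisy-teacher query described above can be sketched in a few lines of PyTorch. This is a minimal illustration under our own assumptions, not the exact loss from the paper: the function name `noisy_distillation_loss`, the KL-divergence matching objective, and the `sigma` and `temperature` parameters are all hypothetical choices made for the example.

```python
import torch
import torch.nn.functional as F

def noisy_distillation_loss(student, teacher, x, sigma=0.25, temperature=1.0):
    """Train a student to match a robust teacher on Gaussian-perturbed inputs.

    The teacher is queried once per batch on the noisy samples (a single
    forward pass, no gradients), and the student is pushed toward the
    teacher's soft output distribution with a KL-divergence loss.
    """
    # Gaussian perturbation, as in randomized smoothing
    x_noisy = x + torch.randn_like(x) * sigma
    # Teacher is frozen: one cheap, gradient-free forward pass per batch
    with torch.no_grad():
        teacher_logits = teacher(x_noisy)
    student_logits = student(x_noisy)
    # Match the student's distribution to the teacher's on the noisy input
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
```

Because the only extra work per batch is the teacher's no-gradient forward pass, the cost profile of this loop matches a standard (non-robust) training pipeline.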
In the paper, we demonstrate how our certified transfer learning approach mitigated the training overhead of SmoothMix, a state-of-the-art randomized smoothing defense, while maintaining the security of future model generations. Our transfer learning framework also remained effective across several generations, despite performing robust training only once. We also found that our framework can accelerate certified training even when no robust model is available: we can first train a smaller, fast-to-train model with an expensive certifiable training method and then apply our transfer learning framework with that model as the teacher.
Achieving high accuracy in adversarial scenarios is a very desirable goal, but we can’t forget about the other costs that go into training — especially time. For AI robustness, the research community has often focused on improving accuracy metrics at the cost of growing training overheads. That’s fine for a paper — but for practical use, we need more. Our knowledge transfer framework is a response to the lack of effective and practical adversarial robustness defenses for industrial use cases. If we want to encourage model developers to create trustworthy AI models, we should focus on making each component, including adversarial robustness, easy to adopt.