Existing methods for visual anomaly detection predominantly rely on global-level pixel comparisons to compute anomaly scores, without emphasizing unique local features. However, images from real-world applications are susceptible to unwanted noise and distractions that can jeopardize the robustness of such anomaly scores. To alleviate this problem, we propose a self-supervised masking method that focuses specifically on the discriminative parts of images to enable robust anomaly detection. Our experiments reveal that the discriminator's class activation map in adversarial training evolves in three stages and finally fixates on the foreground location in the images. Using this property of the activation map, we construct a mask that suppresses spurious signals from the background, thus enabling robust anomaly detection by focusing on local discriminative attributes. Additionally, our method can further improve accuracy by learning a semi-supervised discriminative classifier in cases where a few samples from anomaly classes are available during training. Experimental evaluations on four different types of datasets demonstrate that our method outperforms previous state-of-the-art methods for each condition and in all domains.