Trust Regions for Explanations via Black-Box Probabilistic Certification. Amit Dhurandhar, Swagatam Haldar, et al. ICML 2024.
Cookie Consent Has Disparate Impact on Estimation Accuracy. Erik Miehling, Rahul Nair, et al. NeurIPS 2023.
Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning. Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, et al. NeurIPS 2023.
The Impact of Positional Encoding on Length Generalization in Transformers. Amirhossein Kazemnejad, Inkit Padhi, et al. NeurIPS 2023.
Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models. Sourya Basu, Prasanna Sattigeri, et al. AAAI 2023.
Your Fairness May Vary: Pretrained Language Model Fairness in Toxic Text Classification. Ioana Baldini Soares, Dennis Wei, et al. ACL 2022.
Ground-Truth, Whose Truth? - Examining the Challenges with Annotating Toxic Text Datasets. Kofi Arhin, Ioana Baldini Soares, et al. NeurIPS 2021.
Model Agnostic Multilevel Explanations. Karthikeyan Natesan Ramamurthy, Bhanukiran Vinzamuri, et al. NeurIPS 2020.