Prompt Templates: A Methodology for Improving Manual Red Teaming Performance. Brandon Dominique, David Piorkowski, et al. (2024). CHI 2024.
Facilitating Human-LLM Collaboration through Factuality Scores and Source Attributions. Hyo Jin Do, Rachel Ostrand, et al. (2024). CHI 2024.
FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs. Swanand Ravindra Kadhe, Anisa Halimi, et al. (2023). NeurIPS 2023.
Cost-Aware Counterfactuals for Black Box Explanations. Natalia Martinez Gil, Kanthi Sarpatwar, et al. (2023). NeurIPS 2023.
Influence Based Approaches to Algorithmic Fairness: A Closer Look. Soumya Ghosh, Prasanna Sattigeri, et al. (2023). NeurIPS 2023.
Weakly Supervised Detection of Hallucinations in LLM Activations. Miriam Rateike, Celia Cintas, et al. (2023). NeurIPS 2023.
Subtle Misogyny Detection and Mitigation: An Expert-Annotated Dataset. Anna Richter, Brooklyn Sheppard, et al. (2023). NeurIPS 2023.
Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level. Mujahid Ali Quidwai, Chunhui Li, et al. (2023). ACL 2023.
Benchmarking the Effect of Poisoning Defenses on the Security and Bias of Deep Learning Models. Nathalie Baracaldo Angel, Farhan Ahmed, et al. (2023). S&P 2023.
Connecting Underrepresented Minorities and Qualified Job Positions Using Online Data. Maysa Malfiza Garcia de Macedo, Marisa Affonso Vasconcelos, et al. (2021). AAAI 2021.