Exploring Straightforward Methods for Automatic Conversational Red-Teaming. George Kour, Naama Zwerdling, et al. 2025. NAACL 2025.
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In. Itay Nakash, George Kour, et al. 2025. NAACL 2025.
ASTER: Natural and Multi-language Unit Test Generation with LLMs. Rangeet Pan, Myeongsoo Kim, et al. 2025. ICSE 2025.
Workshop on Neuro-Symbolic Software Engineering. Christian Medeiros Adriano, Sona Ghahremani, et al. 2025. ICSE 2025.
Combinatorial Test Design Model Creation using Large Language Models. Debbie Furman, Eitan Farchi, et al. 2025. IWCT 2025.
Evolution of catalysis at IBM: From microelectronics to biomedicine to sustainability with AI-driven innovation. James Hedrick, Tim Erdmann, et al. 2025. ACS Spring 2025.
Workshop on Data Integrity and Secure Cloud Computing (DISCC). Pradip Bose, Augusto Vega, et al. 2025. HPCA 2025.
Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking. Gabriel Rioux, Apoorva Nitsure, et al. 2024. NeurIPS 2024.
A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios. Samuel Ackerman, Ella Rabinovich, et al. 2024. EMNLP 2024.