MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems. Yannis Katsis, Sara Rosenthal, et al. ACL 2025.
InspectorRAGet: An Introspection Platform for RAG Evaluation. Benjamin Sznajder, Kshitij Fadnis, et al. NAACL 2025.
Creating Conversational Datasets for Retrieval-Augmented Generation Applications is Hard: Challenges & Research Opportunities. Maeda Hanafi, Kshitij Fadnis, et al. CHI 2025.
Machine-Assisted Error Discovery in Conversational AI Systems. Maeda Hanafi, Frederick Reiss, et al. CHI 2024.
Zero-shot Topical Text Classification with LLMs - an Experimental Study. Avishai Gretz, Alon Halfon, et al. EMNLP 2023.
Label Sleuth: From Unlabeled Text to a Classifier in a Few Hours. Eyal Shnarch, Alon Halfon, et al. EMNLP 2022.
Knowledge-augmented Risk Assessment (KaRA): a hybrid-intelligence framework for supporting knowledge-intensive risk assessment of prospect candidates. Maeda Hanafi, Yannis Katsis, et al. EMNLP 2022.
InteractEva: A Simulation-Based Evaluation Framework for Interactive AI Systems. Yannis Katsis, Maeda Hanafi, et al. AAAI 2022.
Abstractified Multi-instance Learning (AMIL) for Biomedical Relation Extraction. AKBC 2021.
Development of an Enterprise-Grade Contract Understanding System. Arvind Agarwal, Laura Chiticariu, et al. NAACL 2021.