SEARCHD - Advanced Retrieval with Text Generation using Large Language Models and Cross Encoding Re-ranking

Pradumn Mishra; Aditya Mahakali; V. Prasanna Shrinivas

doi:10.1109/CASE59546.2024.10711642

CASE 2024

Conference paper

23 Oct 2024

Abstract

This study demonstrates that documents generated by LLMs can be useful for retrieving information from other documents. We introduce a simple and effective dense retrieval framework, Search Engine with Advanced Retrieval using Cross-encoding and Hypothetical Documents (SEARCHD), which enhances existing information retrieval mechanisms and reduces the latency of LLM-based retrievers. The framework uses an LLM to generate a partially correct document, which is combined with the original query for context retrieval. The initially retrieved context, which has lower context precision, is re-ranked by cross-encoding, and lower-ranked documents are eliminated based on a threshold set according to the use case. The framework outperforms LLM-based retrievers such as HyDE in both accuracy and latency, and re-ranking-based retrievers such as RAG Fusion in accuracy, on the MS MARCO Question-Answering dataset, with a significant improvement of 12%.
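The pipeline described in the abstract (hypothetical document generation, query expansion, first-stage retrieval, cross-encoder re-ranking, threshold filtering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `searchd_sketch`, `retrieve`, `bow_cosine`, `toy_cross_score` and `fake_llm` are hypothetical, the bag-of-words cosine stands in for a dense embedding retriever, the token-overlap scorer stands in for a real cross-encoder, and the values of `k` and `threshold` are arbitrary.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bow_cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two bag-of-words vectors (stand-in for dense embeddings)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(text: str, corpus: List[str], k: int) -> List[str]:
    """First-stage retrieval: return the k documents most similar to `text`."""
    query_vec = Counter(tokenize(text))
    ranked = sorted(
        corpus,
        key=lambda doc: bow_cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def toy_cross_score(query: str, doc: str) -> float:
    """Assumed stand-in for a cross-encoder: fraction of query tokens found in doc.
    A real cross-encoder would score the (query, doc) pair jointly with a model."""
    query_tokens = set(tokenize(query))
    doc_tokens = set(tokenize(doc))
    return len(query_tokens & doc_tokens) / len(query_tokens) if query_tokens else 0.0


def searchd_sketch(
    query: str,
    corpus: List[str],
    generate: Callable[[str], str],
    cross_score: Callable[[str, str], float],
    k: int = 3,
    threshold: float = 0.4,
) -> List[Tuple[str, float]]:
    # 1. Ask the LLM for a hypothetical (possibly only partially correct) document.
    hypothetical = generate(query)
    # 2. Combine the original query with the hypothetical document.
    expanded = f"{query} {hypothetical}"
    # 3. Retrieve an initial, lower-precision candidate set.
    candidates = retrieve(expanded, corpus, k)
    # 4. Re-rank candidates by scoring each one against the original query.
    scored = sorted(
        ((doc, cross_score(query, doc)) for doc in candidates),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # 5. Drop candidates whose score falls below the use-case threshold.
    return [(doc, score) for doc, score in scored if score >= threshold]


# Illustrative data only; the LLM call is replaced by a canned response.
corpus = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "Paris is the capital and most populous city of France.",
    "Bananas are rich in potassium.",
    "The Louvre in Paris is the most visited museum in the world.",
]


def fake_llm(query: str) -> str:
    return "The capital of France is Paris, a large city."


results = searchd_sketch(
    "What is the capital of France?", corpus, fake_llm, toy_cross_score
)
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

In a realistic setting, `generate` would wrap an actual LLM call and `cross_score` a trained cross-encoder. The threshold trades context precision against recall and, as the abstract notes, would be chosen per use case.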
