This work examines the critical component of schema linking in the Text-to-SQL domain, investigating the ability of Large Language Models (LLMs) to accurately identify the database tables and columns relevant to generating a SQL query from a natural language request. We conduct an in-depth analysis of LLM-based schema linking approaches, exploring best practices for obtaining high-quality predictions. We also experiment with techniques that have been applied successfully elsewhere in Text-to-SQL, such as question decomposition, and assess their benefits for schema linking. In addition, we challenge the prevailing assumption that the oracle linked schema, comprising the minimum set of columns necessary for SQL generation, is always the optimal schema representation. Our experiments on the Spider and BIRD benchmarks demonstrate that LLMs can perform high-quality schema linking, boosting overall Text-to-SQL performance.
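The sketch below illustrates, in minimal form, the schema-linking step the abstract describes: the database schema and the natural-language question are serialized into a prompt, the model is asked to return only the relevant tables and columns as JSON, and the answer is filtered against the real schema. The toy schema, the prompt wording, and the hard-coded mock model response are assumptions used for illustration only, not the paper's actual prompts or pipeline; the mock answer stands in for a call to any chat-completion API.

```python
import json

# Hypothetical toy schema; in practice this comes from the database catalog.
SCHEMA = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "concert_name", "stadium_id", "year"],
    "singer_in_concert": ["concert_id", "singer_id"],
}


def serialize_schema(schema: dict[str, list[str]]) -> str:
    """Render the schema as 'table(col, col, ...)' lines for the prompt."""
    return "\n".join(f"{t}({', '.join(cols)})" for t, cols in schema.items())


def build_linking_prompt(question: str, schema: dict[str, list[str]]) -> str:
    """Ask the model for only the tables/columns needed to answer the question."""
    return (
        "Given the database schema below, list the tables and columns needed "
        "to write a SQL query for the question. Answer as JSON: "
        '{"table": ["column", ...], ...}\n\n'
        f"Schema:\n{serialize_schema(schema)}\n\n"
        f"Question: {question}\n"
    )


def parse_linked_schema(raw: str) -> dict[str, list[str]]:
    """Parse the model's JSON answer and drop hallucinated tables/columns."""
    predicted = json.loads(raw)
    return {
        table: [col for col in cols if col in SCHEMA.get(table, [])]
        for table, cols in predicted.items()
        if table in SCHEMA
    }


if __name__ == "__main__":
    question = "How many singers from France performed in concerts after 2015?"
    prompt = build_linking_prompt(question, SCHEMA)
    # Mocked model response standing in for a real LLM call on `prompt`.
    raw_answer = (
        '{"singer": ["singer_id", "country"], '
        '"concert": ["concert_id", "year"], '
        '"singer_in_concert": ["concert_id", "singer_id"]}'
    )
    print(parse_linked_schema(raw_answer))
```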