NAACL 2024

Harnessing the Power of LLMs to Vitalize Indigenous Languages


How can Large Language Models (LLMs) and modern NLP be used to increase the use and documentation of Indigenous languages that are in danger of disappearing? First, I report on the development of high-quality translators for Indigenous languages by fine-tuning state-of-the-art (SOTA) machine translation models with tiny amounts of data, and discuss how to avoid some common pitfalls. Next, I present prototypes built with Indigenous communities that aim to stimulate and facilitate writing, using LLMs to create spell-checkers, next-word predictors, and similar tools. Finally, I discuss a future for documentation in which dying languages are preserved as interactive language models.