Poster

State-Based Multi-Agent LLM Frameworks for Earth Observation: Scalable Tools for Geospatial Understanding

Abstract

Advancements in large language models (LLMs) have enabled the development of agentic systems that autonomously solve complex tasks by interacting with tools such as APIs, databases, and machine learning models. In this work, we design and implement an agentic workflow tailored for Earth Observation (EO) tasks, where a key challenge lies in integrating diverse geospatial tools into a cohesive environment that can be used effectively by an LLM agent. To address this, we first construct a foundational tool ecosystem comprising EO datasets, spatiotemporal query functions, and geospatial model inference utilities. While single-agent systems offer simplicity, they often become bottlenecks in large-scale spatiotemporal settings. We overcome this by adopting a multi-agent architecture, wherein an orchestrator delegates subtasks to specialized sub-agents for querying, modeling, and visualization. A major challenge in this setup is the orchestrator's reasoning load, especially when using open-source LLMs (OLMs), which may lack the coherence of proprietary models such as GPT-4o. To mitigate this, we introduce state-based prompting, modeling natural language queries as finite state machines (FSMs) with explicit task states (e.g., data filtering, model inference, map plotting) and transitions. This structured prompting reduces ambiguity and enhances agent performance. We further implement an error state with self-reflection to enable recovery from failures, and a validation state to encourage task-level reasoning and summarization. Our agentic system successfully performs EO tasks such as object detection on the FAIR1M and xView datasets, land cover classification on the BigEarthNet dataset, and flood mapping and damage assessment with reasoning, using Sentinel-2 imagery from Google Earth Engine and flood segmentation models such as IBM's Prithvi. Preliminary results demonstrate strong alignment with state-of-the-art benchmarks while achieving low-latency, cost-effective inference using open models. This framework paves the way for scalable, interpretable, and modular EO analysis using LLM-powered agents.
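
To make the state-based prompting idea concrete, the sketch below shows one possible way to encode the task states named in the abstract (data filtering, model inference, map plotting, plus error and validation states) as a small finite state machine and to condition the orchestrator's prompt on the current state. The state set follows the abstract, but the transition table, function names, and prompt format are illustrative assumptions, not the system's actual implementation.

from enum import Enum, auto

class TaskState(Enum):
    # Task states mentioned in the abstract; the exact state set is an assumption.
    DATA_FILTERING = auto()
    MODEL_INFERENCE = auto()
    MAP_PLOTTING = auto()
    ERROR = auto()        # entered on failure, triggers self-reflection
    VALIDATION = auto()   # task-level reasoning and summarization
    DONE = auto()

# Hypothetical transition table: (current state, outcome) -> next state.
TRANSITIONS = {
    (TaskState.DATA_FILTERING, "ok"): TaskState.MODEL_INFERENCE,
    (TaskState.MODEL_INFERENCE, "ok"): TaskState.MAP_PLOTTING,
    (TaskState.MAP_PLOTTING, "ok"): TaskState.VALIDATION,
    (TaskState.VALIDATION, "ok"): TaskState.DONE,
    (TaskState.ERROR, "ok"): TaskState.DATA_FILTERING,  # retry after self-reflection
}

def next_state(state: TaskState, outcome: str) -> TaskState:
    """Advance the FSM; any non-ok outcome routes to the ERROR state."""
    if outcome != "ok":
        return TaskState.ERROR
    return TRANSITIONS.get((state, outcome), TaskState.DONE)

def build_prompt(state: TaskState, query: str) -> str:
    """Render a state-conditioned prompt for the orchestrator LLM (illustrative only)."""
    return (
        f"Current task state: {state.name}\n"
        f"User query: {query}\n"
        "Decide which sub-agent (querying, modeling, or visualization) to invoke "
        "and with which arguments, restricted to actions valid in this state."
    )

Because the orchestrator only reasons about the actions valid in the current state, an open-source LLM sees a narrower, less ambiguous prompt at each step, which is the mechanism by which state-based prompting reduces the reasoning load described above.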