Weichao Mao, Haoran Qiu, et al.
NeurIPS 2023
Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) networks have achieved significant success in hydrological modeling. However, the recent successes of foundation models like ChatGPT and the Segment Anything Model (SAM) in natural language processing and computer vision have raised interest in the potential of attention-based models in the hydrologic domain. In this study, we propose a deep learning framework that seamlessly integrates multi-source, multi-scale data and multi-model modules, providing a flexible, automated platform for multi-dataset benchmarking and attention-based model comparisons beyond LSTM-centered tasks. Furthermore, we evaluate pretrained Large Language Models (LLMs) and Time Series Attention-based Models (TSAMs) in terms of their forecasting capabilities in data-sparse regions. This general framework can be applied to regression, autoregression, and zero-shot forecasting tasks (i.e., tasks without prior training data). We evaluated 11 different Transformer models under different scenarios against benchmark models, particularly LSTM, using datasets for runoff, soil moisture, snow water equivalent, and dissolved oxygen at global and regional scales. Results show that LSTM models perform best on memory-dependent regression tasks, especially on the global streamflow dataset. However, as tasks grow more complex (from regression and data integration to autoregression and zero-shot prediction), attention-based models gradually surpass LSTM models. This study offers a robust framework for comparing and developing model architectures in the era of large-scale models, and a valuable reference and benchmark for water resource modeling, forecasting, and management.
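A minimal sketch of the kind of LSTM-vs-attention comparison the abstract describes, not the authors' actual framework: it trains a small LSTM and a small Transformer encoder on a synthetic memory-dependent regression task. PyTorch, the synthetic data, and all hyperparameters here are illustrative assumptions.

```python
# Hypothetical benchmarking sketch: LSTM vs. Transformer encoder on a
# synthetic sequence-to-one regression task with long-memory structure.
import torch
import torch.nn as nn

torch.manual_seed(0)
SEQ_LEN, N_FEATURES, HIDDEN = 30, 5, 64  # illustrative sizes

class LSTMRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(N_FEATURES, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, x):
        out, _ = self.lstm(x)          # (B, T, H)
        return self.head(out[:, -1])   # predict from the last hidden state

class TransformerRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(N_FEATURES, HIDDEN)
        layer = nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, x):
        h = self.encoder(self.proj(x))   # (B, T, H)
        return self.head(h.mean(dim=1))  # mean-pool over time steps

def make_batch(batch_size=32):
    # Synthetic stand-in for forcing data (e.g., precipitation, temperature);
    # the target depends on the whole input history (memory-dependent).
    x = torch.randn(batch_size, SEQ_LEN, N_FEATURES)
    y = 0.1 * x[:, :, 0].cumsum(dim=1)[:, -1:]
    return x, y

def train(model, steps=200):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        x, y = make_batch()
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

for name, model in [("LSTM", LSTMRegressor()),
                    ("Transformer", TransformerRegressor())]:
    print(f"{name}: final training MSE = {train(model):.4f}")
```

The paper's autoregression and zero-shot settings would change the data pipeline (feeding back past targets, or evaluating without fine-tuning), but the head-to-head training loop would look broadly similar.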
Yidi Wu, Thomas Bohnstingl, et al.
ICML 2025
Gosia Lazuka, Andreea Simona Anghel, et al.
SC 2024
Jiaqi Han, Wenbing Huang, et al.
NeurIPS 2022