Arka: Generalist Foundation Model on SDO Data
Abstract
Deep learning methods have been widely researched in language and vision, demonstrating their capacity to understand long sequences and their usefulness in numerous Earth science applications. Foundation models (FMs), which are pre-trained at large scale, form the basis for a variety of data processing tasks. These models, especially transformer-based ones in vision and language, show exceptional potential for adapting to a wide range of downstream applications. Here we present the design approach and complexity of the Helio Foundation model, which is based on the Solar Dynamics Observatory (SDO) dataset. SDO is NASA's flagship mission, launched on Feb 11, 2010, and it is still collecting enormous amounts of data with multiple instruments. One of them, the Atmospheric Imaging Assembly (AIA), focuses on the solar corona, the outermost layer of the solar atmosphere, where temperatures reach several million kelvin. AIA continuously images the solar atmosphere in multiple wavelengths. Another onboard instrument, the Helioseismic and Magnetic Imager (HMI), measures the magnetic field distribution of the solar surface (photosphere) across the visible disk of the Sun and produces full-disk line-of-sight and vector magnetograms at 45-second and 12-minute cadence, respectively. We design an encoder-decoder foundation model to learn from ten years of SDO data (2011-2020) in eight bands (seven AIA EUV and one HMI) at full resolution (4096x4096) and a temporal cadence of 12 minutes. Future work includes validating the model for solar dynamics and applying it to multiple downstream space weather applications.
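
To give a rough sense of the data complexity implied by these figures, the following back-of-the-envelope sketch counts the 12-minute time steps in 2011-2020 and the size of one eight-band, 4096x4096 sample. The band count, resolution, cadence, and date range come from the abstract; the 4-byte storage per value and the 16x16 patch size are illustrative assumptions only, not the model's actual configuration. The function name `dataset_scale` is hypothetical, and the time-step count is an upper bound that ignores any gaps in the observations.

```python
from datetime import date

# Figures stated in the abstract.
BANDS = 8              # seven AIA EUV channels plus one HMI channel
SIDE = 4096            # full-resolution image side, in pixels
CADENCE_MIN = 12       # temporal cadence, in minutes
START = date(2011, 1, 1)
END = date(2021, 1, 1)  # exclusive end, so 2011-2020 inclusive

# Illustrative assumptions (not from the abstract).
BYTES_PER_VALUE = 4    # assumes float32 storage
PATCH = 16             # assumes a ViT-style 16x16 patch


def dataset_scale():
    """Return rough size figures for the training set described above."""
    days = (END - START).days
    steps_per_day = 24 * 60 // CADENCE_MIN
    timesteps = days * steps_per_day              # upper bound, no data gaps
    values_per_sample = BANDS * SIDE * SIDE       # all bands at one time step
    bytes_per_sample = values_per_sample * BYTES_PER_VALUE
    tokens_per_band = (SIDE // PATCH) ** 2        # patches per single-band image
    return {
        "timesteps": timesteps,
        "values_per_sample": values_per_sample,
        "bytes_per_sample": bytes_per_sample,
        "total_bytes": timesteps * bytes_per_sample,
        "tokens_per_band": tokens_per_band,
    }


if __name__ == "__main__":
    scale = dataset_scale()
    for name, value in scale.items():
        print(f"{name}: {value:,}")
    print(f"total size: {scale['total_bytes'] / 2**40:.1f} TiB")
```

Under these assumptions a single time step already holds about 134 million values, which is why full-resolution inputs drive both the storage cost and the token count an encoder would have to process.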