The recent success of foundation models (FMs) in natural language processing has attracted Earth-monitoring scientists to apply the idea to remote-sensing-based scientific studies, such as climate impact modeling. FMs are large artificial intelligence models trained on vast quantities of data at scale, usually by self-supervised learning, so that they can be adapted to a variety of downstream tasks. In addition to standardized benchmarks for measuring downstream performance, support for easy-to-set-up and scalable training is key to stimulating FM development. We propose an orchestration service that provides reproducible benchmark experimentation with custom FM models and scalable model training. The framework allows any Python code written in a distributed-data-parallel manner to run at arbitrary scale, from a single notebook for debugging to a cluster of large GPU servers in any cloud or on-premise environment for a full run, with a large-scale dataset service coordinated with the framework. We have made several downstream tasks, including precipitation observation interpolation, flood mapping, and super-resolution, readily available for anyone to attempt, and we provide reference state-of-the-art machine-learning solution implementations for these tasks, adapted to run in a distributed-data-parallel manner. We compare their performance across different pre-trained artificial neural networks and different transfer-learning methods. These downstream tasks, scalable state-of-the-art solution implementations, and foundation-model training implementations will be open-sourced so that any scientist or engineer can reproduce and customize the experiments we conducted.
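To make the distributed-data-parallel requirement concrete, the following is a minimal conceptual sketch of the pattern such user code follows: each worker computes a gradient on its own data shard, the gradients are averaged across workers (an all-reduce), and every replica applies the identical update. This is a plain-Python illustration of the general technique only; the function names and the toy regression problem are ours, not part of the proposed framework's API.

```python
# Conceptual sketch of distributed-data-parallel (DDP) training.
# Hypothetical helper names; in practice a library such as PyTorch
# DistributedDataParallel performs the gradient all-reduce.

def grad(w, shard):
    # Gradient of mean squared error for the model y = w * x on one shard.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def ddp_step(w, shards, lr=0.05):
    local_grads = [grad(w, s) for s in shards]   # per-worker gradients
    g = sum(local_grads) / len(local_grads)      # all-reduce: average them
    return w - lr * g                            # identical update on every replica

# Two simulated workers, each holding a shard of data drawn from y = 3x.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w = 0.0
for _ in range(200):
    w = ddp_step(w, shards)
# w converges to the true slope 3.0
```

Because every replica sees the same averaged gradient, all model copies stay bit-for-bit synchronized, which is what lets the same user code scale from one notebook process to a multi-node GPU cluster.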