A Programming Model for Geospatial Machine-Learning with Scalability in Hybrid Multiclouds
While deep machine learning approaches are getting pervasively used in remote sensing and modeling the earth, difficulties due to the size of satellite data are always pains for scientists in implementing such experiential software. We present a programming model for geospatial machine learning based on TorchGeo and PyTorch, which are getting the de fact standards in programming with PyTorch/Python. TorchGeo is open-sourced and designed to make it simple for remote sensing experts to explore machine learning solutions. Our objective is to allow machine-learning programs using TorchGeo to scale leveraging proprietary high-performance computing (HPC) and multicloud HPC resources, from ones notebook. One of key technologies specifically needed in geospatial machine learning is the smart integration of peta-scale data services and data-distributed parallel frameworks. We implement such a platform as a part of IBM Research Geospatial Discovery Network (GDN) and experiment segmentation tasks such as flood detection from satellite data to show its scalability.