OGC 'apps to the data' architecture successfully applied: The Earth System Grid Federation (ESGF) Compute Challenge
The OGC Engineering Report OGC 19-003: Earth System Grid Federation (ESGF) Compute Challenge has now been published. It draws on the results of the Earth System Grid Federation extension to OGC Testbed-14, which concluded successfully in June 2019.
The Testbed-14 ESGF extension demonstrated that the 'applications to the data' architecture, developed in Testbed-13 and Testbed-14, applies to climate simulation data processing and analysis. The architecture builds on Web Processing Service (WPS) profiles and metadata models, allowing application developers to make their applications available to the community as Docker containers.
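For readers unfamiliar with WPS, the sketch below shows how a client might build the standard key-value-pair URLs used to discover what a WPS server offers. It is only an illustration: the endpoint `https://example.org/wps` and the process identifier `climate_index` are hypothetical, and the deployment and execution operations defined by the profile in the Engineering Report are not shown here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
WPS_ENDPOINT = "https://example.org/wps"


def wps_url(request: str, **params: str) -> str:
    """Build a WPS key-value-pair request URL for the given operation."""
    query = {"service": "WPS", "request": request, **params}
    return f"{WPS_ENDPOINT}?{urlencode(query)}"


# Ask the server which processes it offers.
capabilities_url = wps_url("GetCapabilities")

# Ask for the inputs and outputs of one process.
# "climate_index" is a made-up identifier, not one from the report.
describe_url = wps_url(
    "DescribeProcess", version="1.0.0", identifier="climate_index"
)

print(capabilities_url)
print(describe_url)
```

A client would typically fetch the first URL to list the available processes, then the second to learn what a given process expects before running it.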
The Earth System Grid Federation (ESGF) operates a world-leading climate simulation data warehouse and manages the software infrastructure for climate model and observational data analysis. The warehouse hosts petabytes of data, so downloading the data for local processing is impractical. Instead, applications need to be pre-deployed and executed close to the physical location of the data to minimize data transfer.
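A rough back-of-the-envelope calculation makes the data-locality argument concrete. The sketch below estimates how long it would take to download an archive of a given size. The sustained link speed of 10 Gbit/s and the archive sizes of 10 and 50 petabytes are assumptions chosen for illustration, not measurements from ESGF.

```python
def transfer_days(size_petabytes: float, link_gbit_per_s: float) -> float:
    """Days needed to move size_petabytes over a link sustaining link_gbit_per_s."""
    bits = size_petabytes * 1e15 * 8           # decimal petabytes to bits
    seconds = bits / (link_gbit_per_s * 1e9)   # Gbit/s to bit/s
    return seconds / 86_400                    # seconds per day


# Assumed archive sizes (PB) and an assumed sustained link speed (Gbit/s).
for size in (10, 50):
    print(f"{size} PB at 10 Gbit/s: about {transfer_days(size, 10):.0f} days")
```

Under these assumptions, the transfer alone takes roughly 93 days for 10 petabytes and 463 days for 50 petabytes, which is why running the analysis next to the data is attractive.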
“Thanks to the Applications to the Data Architecture developed by OGC, we successfully managed to package, deploy, and execute our climate data analysis processes in the cloud. The Applications Deployment and Execution WPS profile helped us tremendously to advance the interoperability of climate data processing in the ESGF infrastructure,” Tom Landry from the Computer Research Institute of Montreal explained. “This work advances geospatial workflows towards standardization and even facilitates the use and exchange of machine learning systems for climate change analysis.”
Landry's team at CRIM analyzed CMIP5 and CMIP6 model results. CMIP6 is the latest phase of the Coupled Model Intercomparison Project, a project overseen by the World Climate Research Programme (WCRP). WCRP facilitates analysis and prediction of Earth system change for use in a range of practical applications of direct relevance, benefit, and value to society. CMIP6 poses significant interoperability challenges, with data currently being served by ESGF from 12 different institutions and an expected model output of 10-50 petabytes.
CRIM deployed three Web Processing Services in total: one at the Analytics, Informatics, and Management Systems (AIMS) project operated by the Lawrence Livermore National Laboratory (LLNL), a second as part of NASA's Earth Data Analytics Service (EDAS), and a third at CRIM's research platform PAVICS, which bundles data search, analytics, and visualization services. Additionally, the Applications Deployment and Execution WPS profile now provides climate analytics for the ClimateData.ca portal (stay tuned for an upcoming blog post on the portal).
All results are described in the recently published OGC Engineering Report OGC 19-003: Earth System Grid Federation (ESGF) Compute Challenge. OGC thanks the United States Department of Energy - Biological and Environmental Research (US DoE BER) and the European Space Agency (ESA) for co-funding these activities. CRIM thanks Natural Resources Canada (NRCan), Environment and Climate Change Canada (ECCC), CANARIE, and the Ministère de l’Économie et de l’Innovation du Québec (MEI) for their support.