Workshop: Trustworthy and Socially Responsible Machine Learning

Accelerating Open Science for AI in Heliophysics

Dolores Garcia · Paul Wright · Mark Cheung · Meng Jin · James Parr


Artificial Intelligence (AI) projects are rarely packaged so that scientists and non-AI specialists can easily pick up advanced Machine Learning (ML) workflows. Similarly, AI engineers cannot always contribute meaningfully to a science domain unless they are given useful application context or analysis-ready data. Because of this, and other factors, applied AI research often stalls at the research paper stage, where the often complex logistics of replicating and building on the work of others impede substantive progress. The community has identified this state of affairs as the "reproducibility" problem (1,500 scientists lift the lid on reproducibility, Nature). Potential gains in AI are therefore hampered by the "expertise gap" between ML specialists and domain scientists.

Moreover, AI has been slow to earn its reputation as a transformative tool for science because few deployed, trusted solutions exist in the wild: projects struggle to migrate from mid TRL (Technology Readiness Level) to high TRL.

Another key concept is that AI projects are never really finished. Improvements can be made both in the model choice (the available selection improves every year) and in the training data, the latter often being the key factor in improving outcomes. Once built, workflows can easily grow to accommodate more data over time.

In this paper we present the lessons learned from a study conducted to address findings from the 2021 SMD AI Workshop, showcasing best practice in adopting trusted and maintained open science in AI for Heliophysics and in scaling lower-TRL applications to higher TRLs. We also present an example of rapid derivative Heliophysics research conducted by a non-subject-matter expert, showing the value of these kinds of open science approaches.
