Connecting the Dots: A Machine Learning Ready Dataset for Ionospheric Forecasting Models
Wolniewicz, Kelebek, Mestici et al.
Operational forecasting of the ionosphere remains a critical space weather challenge due to sparse observations, complex coupling across geospatial layers, and a growing need for timely, accurate predictions that support Global Navigation Satellite System (GNSS), communications, aviation safety, and satellite operations. As part of the 2025 NASA Heliolab, we present a curated, open-access dataset that integrates diverse ionospheric and heliospheric measurements into a coherent, machine learning-ready structure, designed specifically to support next-generation forecasting models and to address gaps in current operational frameworks. Our workflow integrates a broad selection of data sources: Solar Dynamics Observatory data, solar irradiance indices (F10.7), solar wind parameters (velocity and interplanetary magnetic field), geomagnetic activity indices (Kp, AE, SYM-H), and NASA JPL's Global Ionospheric Maps of Total Electron Content (GIM-TEC). We also incorporate geospatially sparse data, such as TEC derived from the World-Wide GNSS Receiver Network and crowdsourced Android smartphone measurements. This heterogeneous dataset is temporally and spatially aligned into a single, modular data structure that supports both physical and data-driven modeling. Leveraging this dataset, we train and benchmark several spatiotemporal machine learning architectures for forecasting vertical TEC under both quiet and geomagnetically active conditions. The resulting dataset and modeling pipeline enable exploration of not only ionospheric dynamics but also broader Sun-Earth interactions, supporting both scientific inquiry and operational forecasting efforts.
Authors: Linnea M. Wolniewicz, Halil S. Kelebek, Simone Mestici, Michael D. Vergalla, Giacomo Acciarini, Bala Poduval, Olga Verkhoglyadova, Madhulika Guhathakurta, Thomas E. Berger, Atılım Güneş Baydin, Frank Soboczenski
Affiliations: University of Hawai'i at Mānoa, University of Oxford, Università degli Studi di Roma Sapienza, Free Flight Research Lab, ESA, University of New Hampshire, NASA JPL, NASA Headquarters, University of Colorado Boulder, University of York & King's College London
Date/Venue: NeurIPS 2025 Workshop: Machine Learning for the Physical Sciences
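Operational forecasting of the ionosphere is a key challenge in space weather. The main difficulties are sparse observations, complex coupling across geospatial layers, and a growing need for timely, accurate predictions that support Global Navigation Satellite System (GNSS), communications, aviation safety, and satellite operations. As part of the 2025 NASA Heliolab, this paper presents a curated, open-access dataset that integrates diverse ionospheric and heliospheric measurements into a coherent, machine learning-ready structure. The dataset combines multiple sources, including Solar Dynamics Observatory (SDO) data, the F10.7 solar irradiance index, solar wind parameters (velocity and interplanetary magnetic field), geomagnetic activity indices (Kp, AE, SYM-H), and NASA JPL's Global Ionospheric Maps of Total Electron Content (GIM-TEC). The team trains and benchmarks several spatiotemporal machine learning architectures for forecasting vertical TEC under both quiet and geomagnetically active conditions, in support of both scientific research and operational forecasting.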
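
The temporal alignment of heterogeneous sources described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `align_to_grid`, the column names `kp` and `sw_speed`, the 15-minute cadence, and the toy values are all assumptions made for illustration. It uses pandas to place a slowly updated index and a fast solar wind series on one shared time grid, forward-filling so that each grid point only sees values available at or before that time.

```python
import numpy as np
import pandas as pd


def align_to_grid(series_by_name, start, end, cadence="15min"):
    """Resample several time series onto one shared time grid.

    Hypothetical helper for illustration; not taken from the paper.
    """
    grid = pd.date_range(start=start, end=end, freq=cadence)
    aligned = {}
    for name, series in series_by_name.items():
        # reindex with a fill method needs a monotonic index
        series = series.sort_index()
        # Forward-fill: each grid point takes the latest value observed
        # at or before it, so no future information leaks backwards.
        aligned[name] = series.reindex(grid, method="ffill")
    return pd.DataFrame(aligned, index=grid)


# Toy inputs (made-up values): a 3-hourly index and a 5-minute series.
kp = pd.Series(
    [2.0, 3.0],
    index=pd.to_datetime(["2024-01-01 00:00", "2024-01-01 03:00"]),
)
sw_speed = pd.Series(
    np.linspace(400.0, 580.0, 37),  # 400, 405, ..., 580 km/s
    index=pd.date_range("2024-01-01 00:00", periods=37, freq="5min"),
)

frame = align_to_grid(
    {"kp": kp, "sw_speed": sw_speed},
    start="2024-01-01 00:00",
    end="2024-01-01 03:00",
)
print(frame.head())
```

Forward-filling is only one possible choice. Interpolation or window averaging may suit some drivers better, and a real pipeline would likely also cap how stale a filled value is allowed to become.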