This paper is available on arXiv under CC 4.0 license.
Authors:
(1) Andrea Roncoli, Department of Computer Science (University of Pisa);
(2) Aleksandra Ćiprijanović, Computational Science and AI Directorate (Fermi National Accelerator Laboratory) and Department of Astronomy and Astrophysics (University of Chicago);
(3) Maggie Voetberg, Computational Science and AI Directorate (Fermi National Accelerator Laboratory);
(4) Francisco Villaescusa-Navarro, Center for Computational Astrophysics (Flatiron Institute);
(5) Brian Nord, Computational Science and AI Directorate (Fermi National Accelerator Laboratory), Department of Astronomy and Astrophysics (University of Chicago) and Kavli Institute for Cosmological Physics (University of Chicago).
Accurate determination of cosmological parameters from the big data produced by astronomical surveys is a task of paramount importance in modern science. Historically, the extraction of cosmological information has relied on computing summary statistics [31, 16, 15]. More recently, deep learning methods such as 2D and 3D Convolutional Neural Networks (CNNs) have shown great promise in extracting rich non-linear information that summary statistics struggle to capture [32, 39, 29]. However, CNNs lack scale invariance: their analysis is firmly anchored to the grid size of the convolutional kernels, and any information on smaller scales is lost. Choosing a very fine grid to avoid this information loss, though, would yield a grid that is almost entirely zeros for sparse and irregular data, such as the clustering of galaxies. CNNs are therefore an inadequate method for structured sparse data. In contrast, Graph Neural Networks (GNNs) [23, 4, 49, 46] can handle structured cosmic web data in a scale-free manner [41, 14]. As with any other model, the typical procedure is to train GNNs on labeled data (such as simulations) and then infer cosmological parameters from unlabeled data (such as observations). However, there is a significant risk that these models will not generalize in the presence of the domain shift between simulations and observations. Systematic biases have been demonstrated even in experiments that train and test on simulations with different subgrid physics [41]. Domain adaptation (DA) techniques [11, 43, 18, 27] can be used to increase model robustness to this type of domain shift. Here we propose the use of Domain Adaptive Graph Neural Networks (DA-GNNs) and investigate the utility of distance-based DA losses, specifically Maximum Mean Discrepancy (MMD) [6]. MMD is an unsupervised DA technique because it does not require labeled target data, which is paramount for future applications to observations.
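To make the distance-based DA loss concrete, the following is a minimal sketch (not the authors' implementation) of a biased squared-MMD estimate between source-domain and target-domain feature batches, using a single-bandwidth RBF kernel; the function names and the fixed bandwidth `sigma` are our own simplifications, and practical implementations often average the kernel over several bandwidths:

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF (Gaussian) kernel matrix between rows of x and rows of y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy between two
    feature batches: MMD^2 = E[k(s,s')] + E[k(t,t')] - 2 E[k(s,t)]."""
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st

# Illustrative check: matched distributions give a smaller MMD than shifted ones.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(64, 8))      # e.g. latent features from simulation A
tgt_same = rng.normal(0.0, 1.0, size=(64, 8)) # same distribution, different draw
tgt_shift = rng.normal(2.0, 1.0, size=(64, 8))# shifted distribution (domain shift)
print(mmd2(src, tgt_same), mmd2(src, tgt_shift))
```

In a DA-GNN training loop, a term like this would be computed on the latent features of a labeled source batch and an unlabeled target batch, and added (with a weighting factor) to the parameter-regression loss, encouraging the network to learn features shared across both domains.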
We show that our domain-adaptive models achieve stronger generalization across datasets than regular GNN models. Our work is a significant step toward building models that are trained on simulations yet robust enough to work on observational data.
Related Work
GNNs have shown great potential for extracting information from large sparse datasets, such as the distribution of galaxies, galaxy clusters, and cosmic large-scale structures [25, 28, 41, 33, 42, 14]. Unfortunately, owing to their complexity, most deep learning models often learn dataset-specific features, which renders them useless when tested on a different dataset (different simulations or astronomical observations). In astronomy, it has been shown that DA techniques applied to different types of CNNs can substantially improve model performance in cross-dataset applications [8, 10, 9, 37, 21, 2]. More recently, it has been shown that DA can be used with other types of deep learning algorithms, such as GNNs [12, 24, 45, 47, 7, 44, 17]. However, DA on GNNs has not yet been used for any astrophysics or cosmology applications.