Authors:
(1) Mohamed A. Abba, Department of Statistics, North Carolina State University;
(2) Brian J. Reich, Department of Statistics, North Carolina State University;
(3) Reetam Majumder, Southeast Climate Adaptation Science Center, North Carolina State University;
(4) Brandon Feng, Department of Statistics, North Carolina State University.
Table of Links
1.1 Methods to handle large spatial datasets
1.2 Review of stochastic gradient methods
2 Matern Gaussian Process Model and its Approximations
3 The SG-MCMC Algorithm and 3.1 SG Langevin Dynamics
3.2 Derivation of gradients and Fisher information for SGRLD
4 Simulation Study and 4.1 Data generation
4.2 Competing methods and metrics
5 Analysis of Global Ocean Temperature Data
6 Discussion, Acknowledgements, and References
Appendix A.1: Computational Details
Appendix A.2: Additional Results
2.1 The Vecchia approximation
For any set of spatial locations, the joint distribution of Y can be written as a product of univariate conditional distributions. The Vecchia approximation (Vecchia, 1988; Stein et al., 2004; Datta et al., 2016; Katzfuss and Guinness, 2021) then replaces each conditional distribution with one that conditions on a subset of at most m of the preceding observations:
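In notation that is an assumption here (an ordering $Y_1, \dots, Y_n$ of the observations and a conditioning set $N(i) \subseteq \{1, \dots, i-1\}$ with at most $m$ elements, neither symbol being fixed by the surrounding text), the exact factorization and its Vecchia approximation take the standard form used in the cited references:

```latex
p(Y \mid \beta, \theta)
  = \prod_{i=1}^{n} p\big(Y_i \mid Y_1, \dots, Y_{i-1}, \beta, \theta\big)
  \approx \prod_{i=1}^{n} p\big(Y_i \mid Y_{N(i)}, \beta, \theta\big),
\qquad |N(i)| \le m .
```

Taking $N(i) = \{1, \dots, i-1\}$ for every $i$ recovers the exact joint density, so the quality of the approximation depends on how the conditioning sets are chosen.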
Let p(β, θ) be the prior distribution on the regression and covariance parameters. Using (5), we can write the posterior, up to a constant that does not depend on the parameters, as:
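As a sketch, writing $N(i)$ for the conditioning set of observation $i$ (a symbol introduced here for illustration, with $|N(i)| \le m$), the posterior implied by the Vecchia approximation and its logarithm have the general form:

```latex
p(\beta, \theta \mid Y) \;\propto\; p(\beta, \theta) \prod_{i=1}^{n} p\big(Y_i \mid Y_{N(i)}, \beta, \theta\big),
\qquad
\log p(\beta, \theta \mid Y)
  = \text{const} + \log p(\beta, \theta)
  + \sum_{i=1}^{n} \log p\big(Y_i \mid Y_{N(i)}, \beta, \theta\big).
```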
Hence the log-likelihood and log-posterior of the parameters {β, θ} can be written as a sum of conditional normal log-densities, each with a conditioning set of at most m observations. The cost of computing the log-posterior in (6) is therefore linear in n and cubic in m, that is, O(nm³).
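A minimal Python sketch of this sum of conditional normal log-densities follows. The exponential covariance (a Matérn with smoothness 1/2), the nugget variance `tau2`, the nearest-neighbour rule for choosing conditioning sets, and all function and argument names are assumptions made for illustration, not the paper's implementation. Each term requires one linear solve with a matrix of size at most m, which is where the linear-in-n, cubic-in-m cost comes from.

```python
import numpy as np


def exp_cov(a, b, sigma2=1.0, rho=0.3):
    """Exponential covariance between two sets of locations (illustrative choice)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / rho)


def vecchia_loglik(y, X, beta, locs, m, sigma2=1.0, rho=0.3, tau2=0.1):
    """Vecchia log-likelihood: a sum of n conditional normal log-densities,
    each conditioning on at most m previously ordered observations."""
    n = len(y)
    r = y - X @ beta  # residuals after removing the regression mean
    total = 0.0
    for i in range(n):
        # Conditioning set: the (at most) m nearest locations ordered before i.
        # Brute-force search for clarity; neighbour sets would normally be precomputed.
        if i == 0:
            nb = np.array([], dtype=int)
        else:
            d = np.linalg.norm(locs[:i] - locs[i], axis=1)
            nb = np.argsort(d)[:m]

        mean_i = 0.0
        var_i = sigma2 + tau2  # marginal variance, including the nugget
        if nb.size > 0:
            C_nn = exp_cov(locs[nb], locs[nb], sigma2, rho) + tau2 * np.eye(nb.size)
            c_in = exp_cov(locs[i:i + 1], locs[nb], sigma2, rho).ravel()
            w = np.linalg.solve(C_nn, c_in)  # O(m^3) solve per observation
            mean_i = w @ r[nb]               # conditional mean of the residual
            var_i = var_i - w @ c_in         # conditional variance

        total += -0.5 * (np.log(2 * np.pi * var_i) + (r[i] - mean_i) ** 2 / var_i)
    return total
```

With `m = n - 1` every observation conditions on all earlier ones, so the function returns the exact multivariate normal log-density; smaller values of `m` trade accuracy for speed.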
Using (8), we can construct an unbiased estimate of the gradient of the Vecchia log-posterior based on a minibatch of the data:
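Because the Vecchia log-posterior is the log-prior plus a sum of n per-observation terms, one standard way to obtain such an estimate is to sum the term gradients over a uniformly drawn minibatch B and rescale by n/|B|. The sketch below illustrates that construction in Python; `term_grad`, `prior_grad`, and the function name are hypothetical placeholders, and the paper's exact estimator may differ in its details.

```python
import numpy as np


def minibatch_grad(term_grad, batch, n, prior_grad):
    """Unbiased estimate of the gradient of a log-posterior of the form
    log-prior + sum of n per-observation terms.

    term_grad(i): gradient of the i-th conditional log-density (hypothetical callable).
    batch: indices drawn uniformly without replacement from range(n).
    prior_grad: gradient of the log-prior at the current parameter value.
    """
    batch = np.asarray(batch)
    g = sum(term_grad(i) for i in batch)      # gradient sum over the minibatch
    return prior_grad + (n / batch.size) * g  # rescale so the expectation is the full sum
```

Each index appears in a uniformly drawn minibatch with probability |B|/n, so the n/|B| factor makes the expectation of the estimate equal to the full gradient.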
This paper is available on arXiv under the CC BY 4.0 DEED license.