Deep Neural Network for Sea Surface Temperature Prediction: Background

by @oceanography

Too Long; Didn't Read

In this paper, researchers enhance SST prediction by transferring physical knowledge from historical observations to numerical models.

Authors:

(1) Yuxin Meng;

(2) Feng Gao;

(3) Eric Rigall;

(4) Ran Dong;

(5) Junyu Dong;

(6) Qian Du.

II. BACKGROUND

A. Generative Adversarial Network


In 2014, Goodfellow et al. [16] put forward a novel framework in which a generative model is trained in an adversarial manner. In their method, a generative model G and a discriminative model D are trained simultaneously. Model G indirectly captures the distribution of the input data through model D and generates similar data, while model D estimates the probability that an input sample came from the training data rather than from model G. The training of G is driven by the probability errors of D. In this adversarial process, G and D guide each other’s learning and gradually strengthen each other, so that both reach outstanding performance.
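
To make the adversarial objective concrete, the following is a minimal sketch of the two losses in the original minimax formulation of [16], computed from the probabilities that D assigns to a batch of real and generated samples. The function name `gan_losses`, its arguments, and the small constant `eps` are illustrative assumptions, not part of the paper.

```python
import math


def gan_losses(d_real, d_fake, eps=1e-12):
    """Return (discriminator_loss, generator_loss) for one batch.

    d_real: probabilities D assigns to samples drawn from the training data.
    d_fake: probabilities D assigns to samples produced by G.
    eps:    small constant (an assumption here) that keeps log() finite.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    # D wants d_real -> 1 and d_fake -> 0, so it minimizes the negative
    # of log D(x) + log(1 - D(G(z))).
    d_loss = -(real_term + fake_term)
    # G wants D to be fooled (d_fake -> 1), so it minimizes log(1 - D(G(z))).
    g_loss = fake_term
    return d_loss, g_loss


# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
confused_d, confused_g = gan_losses([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates the two sets well has a lower loss.
confident_d, confident_g = gan_losses([0.9, 0.9], [0.1, 0.1])
```

With every output at 0.5, the discriminator loss equals 2 ln 2 (about 1.386), which is the value at which G's samples are indistinguishable from the training data. In practice, both models would be neural networks updated by gradient steps on these two quantities in alternation.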


GANs have been applied to physics-related tasks. For example, Yang et al. [17] applied physics-informed GANs to high-dimensional problems and solved stochastic differential equations; Lütjens et al. [18] produced more realistic coastal flood data by using GANs to learn the features in numerical model data; and Zheng et al. [19] inferred unknown spatial data from the underlying physical law learned by GANs. However, these works use a GAN to replace the entire numerical model, which is quite different from our work. In this paper, we adopt a GAN to transfer physical knowledge from the observed data to the numerical model data, in order to correct and improve the physical features in the numerical model. In addition, existing methods only learn a deterministic model, without considering whether the code generated by the encoder is in accordance with the semantic knowledge learned by the GAN.


B. Convolutional Long Short-Term Memory


In 2015, ConvLSTM [20] was proposed for precipitation nowcasting. Its network structure captures local spatial features, as classical convolutional neural networks (CNNs) [21] do, while also modeling sequential relationships, a property inherited from Long Short-Term Memory (LSTM) blocks. Moreover, the authors conducted experiments showing that ConvLSTM outperforms LSTM at modeling spatial-temporal relationships. Apart from weather prediction, ConvLSTM can be applied to various spatial-temporal sequence prediction problems, for example, action recognition [22], [23].
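
The combination of convolution and LSTM gating described above can be illustrated with a single ConvLSTM time step. This is a simplified sketch, not the implementation used in the paper: it assumes single-channel 2D grids, square odd-sized kernels, and zero padding, and it omits the peephole terms of the original ConvLSTM formulation [20]. The names `conv2d_same`, `convlstm_step`, and the layout of `weights` are hypothetical.

```python
import math


def conv2d_same(img, kernel):
    """2D cross-correlation with zero padding; output has the input's size."""
    rows, cols = len(img), len(img[0])
    k = len(kernel) // 2  # kernel is assumed square with an odd side length
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += img[rr][cc] * kernel[dr + k][dc + k]
            out[r][c] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convlstm_step(x, h, c, weights):
    """One ConvLSTM step: convolutions replace the matrix products of an LSTM.

    x, h, c: input, hidden state, and cell state grids of equal size.
    weights: maps each gate name ("i", "f", "o", "g") to a tuple
             (input_kernel, hidden_kernel, bias).
    """
    rows, cols = len(x), len(x[0])
    pre = {}
    for gate, (w_x, w_h, b) in weights.items():
        from_x = conv2d_same(x, w_x)  # local spatial features of the input
        from_h = conv2d_same(h, w_h)  # local spatial features of the past state
        pre[gate] = [[from_x[r][q] + from_h[r][q] + b for q in range(cols)]
                     for r in range(rows)]
    h_next = [[0.0] * cols for _ in range(rows)]
    c_next = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for q in range(cols):
            i = sigmoid(pre["i"][r][q])    # input gate
            f = sigmoid(pre["f"][r][q])    # forget gate
            o = sigmoid(pre["o"][r][q])    # output gate
            g = math.tanh(pre["g"][r][q])  # candidate cell update
            c_next[r][q] = f * c[r][q] + i * g
            h_next[r][q] = o * math.tanh(c_next[r][q])
    return h_next, c_next
```

Because every gate is computed by a convolution, each grid cell's state depends on its spatial neighbours in both the current input and the previous hidden state, which is what lets the model track structures that move across the grid over time. A sequence is processed by calling `convlstm_step` once per time step and passing the returned states forward.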


C. Sea Surface Temperature Prediction


Lins et al. [24] investigated SST in the tropical Atlantic using an SVM. Patil et al. [25] adopted an artificial neural network to predict sea surface temperature; it performs well only for lead times of 1 to 5 days, after which the accuracy declines. Zhang et al. [26] applied LSTM to predict SST. Yang et al. [27] predicted SST by building a fully connected LSTM model. From another perspective, Patil et al. [28] used a wavelet neural network to predict daily SST, while Ouala et al. [29] proposed a patch-level neural network method for SST prediction. However, these methods rely only on data and ignore the physical knowledge behind them. Ham et al. [15] adopted transfer learning to predict and classify ENSO events. In this work, we conduct comparative experiments, and the results show that our method reduces both the short-term errors and the long-term bias.


Fig. 2. Illustration of the proposed SST prediction method. It consists of two stages: prior network training and SST prediction with enhanced data. In the first stage, a prior network is trained to generate physics-enhanced SST. In the second stage, the physics-enhanced SST data are used for SST prediction via ConvLSTM.


D. Data Augmentation


Shorten et al. [30] reviewed recent image data augmentation techniques for deep learning. The purpose of data augmentation is to enhance the representation capability of neural networks so that they learn the distribution of the original data better. In recent years, two kinds of data augmentation techniques have been commonly used: data transformation and resampling. The data transformation approach includes geometric transformation [31], color space transformation [32]–[34], random erasing [35]–[37], adversarial training [38]–[41], and style transfer [42]–[45]. The resampling approach places particular emphasis on composing new instances, for example through image mixup [46]–[48], feature space enhancement [49], [50], and generative adversarial networks (GANs) [16].


Geometric transformations, such as image flipping, cropping, rotation, translation, and noise injection [51], can achieve good performance; the experimental results in [30] showed that random cropping performed well. Color space transformation suffers from large memory consumption and long computing time. Random erasing can improve network robustness to occlusion by using masks. Although adversarial training can also improve robustness, the finite number of natural adversarial samples largely limits network performance in practice. Neural style transfer is effective only for specific tasks, so its practical application is limited. Feature space augmentation makes it possible to interpolate representations in the feature space. GAN-based augmentation techniques have been applied to achieve current state-of-the-art network performance [52]. However, no existing data augmentation method effectively exploits the merits of both the numerical model and deep learning. In this paper, we propose a novel data enhancement technique based on physical knowledge. The proposed technique achieves better performance than GAN-based augmentation.
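
For readers unfamiliar with the conventional techniques surveyed above, the sketch below illustrates four of them on a small 2D grid: flipping and random cropping (geometric transformation), random erasing, and mixup. It covers only these generic baselines, not the physics-based enhancement proposed in this paper. All function names and parameters are illustrative assumptions, and images are represented as plain nested lists.

```python
import random


def horizontal_flip(img):
    """Mirror each row (geometric transformation)."""
    return [row[::-1] for row in img]


def random_crop(img, size, rng):
    """Cut a size x size window at a random position (geometric transformation)."""
    top = rng.randint(0, len(img) - size)
    left = rng.randint(0, len(img[0]) - size)
    return [row[left:left + size] for row in img[top:top + size]]


def random_erase(img, size, rng, fill=0.0):
    """Mask a size x size patch with a constant value to simulate occlusion."""
    top = rng.randint(0, len(img) - size)
    left = rng.randint(0, len(img[0]) - size)
    out = [row[:] for row in img]  # copy so the input stays unchanged
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = fill
    return out


def mixup(img_a, img_b, lam):
    """Blend two images pixel-wise as lam * a + (1 - lam) * b (resampling)."""
    return [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


rng = random.Random(0)  # fixed seed so the example is repeatable
img = [[float(r * 4 + c + 1) for c in range(4)] for r in range(4)]

flipped = horizontal_flip(img)
patch = random_crop(img, 2, rng)
occluded = random_erase(img, 2, rng)
blended = mixup(img, flipped, 0.25)
```

In the usual mixup setting, the weight `lam` is drawn at random for each pair and the labels are blended with the same weight; the fixed value here only keeps the example simple.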


This paper is available on arXiv under a CC 4.0 license.