
Machine-Learning Neural Spatiotemporal Signal Processing with PyTorch Geometric Temporal

by Benedek Rozemberczki, February 19th, 2021

PyTorch Geometric Temporal is an open-source deep learning library for neural spatiotemporal signal processing. It implements dynamic and temporal geometric deep learning, embedding, and spatiotemporal regression methods from a variety of published research papers.

In addition, it comes with an easy-to-use dataset loader, an iterator for dynamic and temporal graphs, and GPU support. A number of benchmark datasets with temporal and dynamic graphs are included, and you can also create your own.

In the following, we will walk through a case study in which PyTorch Geometric Temporal is used to solve a real-world machine learning problem.

Learning from a Discrete Temporal Signal

We are using the Hungarian Chickenpox Cases dataset in this case study. We will train a regressor to predict the weekly cases reported by each county using a recurrent graph convolutional network. First, we will load the dataset and create an appropriate spatiotemporal split.

from torch_geometric_temporal.data.dataset import ChickenpoxDatasetLoader
from torch_geometric_temporal.data.splitter import discrete_train_test_split

# Load the weekly Hungarian chickenpox cases as a temporal graph signal.
loader = ChickenpoxDatasetLoader()
dataset = loader.get_dataset()

# Keep the first 20% of the temporal snapshots for training, the rest for testing.
train_dataset, test_dataset = discrete_train_test_split(dataset, train_ratio=0.2)
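
To see what the iterator yields, we can inspect a single snapshot. This is a quick sanity-check sketch, not part of the original tutorial; the attribute names match those used in the training loop below, and the concrete shapes (20 counties, 4 lagged features per node) are assumptions about the default loader settings.

# Hypothetical sanity check: look at the first training snapshot.
for time, snapshot in enumerate(train_dataset):
    print(snapshot.x.shape)           # node features, e.g. torch.Size([20, 4])
    print(snapshot.edge_index.shape)  # county adjacency, torch.Size([2, num_edges])
    print(snapshot.y.shape)           # targets: weekly cases per county
    break  # one snapshot is enough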

Next, we will define the recurrent graph neural network architecture used for solving the supervised task. The constructor defines a DCRNN layer and a feedforward layer. It is important to note that the final non-linearity is not integrated into the recurrent graph convolutional operation. This design principle is used consistently, and it was taken from PyTorch Geometric.

Because of this, we define the ReLU non-linearity between the recurrent and linear layers manually in the forward pass. The final linear layer is not followed by a non-linearity, as we are solving a regression problem.

import torch
import torch.nn.functional as F
from torch_geometric_temporal.nn.recurrent import DCRNN

class RecurrentGCN(torch.nn.Module):
    def __init__(self, node_features):
        super(RecurrentGCN, self).__init__()
        # Diffusion convolutional recurrent layer: node_features in,
        # 32 hidden channels out, filter size 1.
        self.recurrent = DCRNN(node_features, 32, 1)
        # Feedforward layer mapping hidden states to one prediction per node.
        self.linear = torch.nn.Linear(32, 1)

    def forward(self, x, edge_index, edge_weight):
        h = self.recurrent(x, edge_index, edge_weight)
        h = F.relu(h)  # the non-linearity is applied manually (see above)
        h = self.linear(h)
        return h
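
Before training, a forward pass on random inputs is a cheap way to catch shape mistakes. This smoke test is a hypothetical addition, not part of the original tutorial; the sizes (20 nodes, 4 features) follow the dataset used here, and the chain graph is an arbitrary stand-in.

# Hypothetical smoke test on random data (the real model is created below).
model = RecurrentGCN(node_features=4)
x = torch.rand(20, 4)                                              # 20 nodes, 4 features each
edge_index = torch.stack([torch.arange(19), torch.arange(1, 20)])  # a chain over the nodes
edge_weight = torch.ones(edge_index.size(1))                       # uniform edge weights
out = model(x, edge_index, edge_weight)
print(out.shape)  # expected: torch.Size([20, 1]) -- one prediction per node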

Let us define a model (we have 4 node features) and train it on the training split (the first 20% of the temporal snapshots) for 200 epochs. We backpropagate once the loss from every snapshot has been accumulated. We will use the Adam optimizer with a learning rate of 0.01. The tqdm function is used for measuring the runtime needed for each training epoch.

from tqdm import tqdm

model = RecurrentGCN(node_features=4)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()

for epoch in tqdm(range(200)):
    cost = 0
    for time, snapshot in enumerate(train_dataset):
        y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)
        # Accumulate the squared error over the snapshots.
        cost = cost + torch.mean((y_hat - snapshot.y) ** 2)
    # Average over the snapshots and take one gradient step per epoch.
    cost = cost / (time + 1)
    cost.backward()
    optimizer.step()
    optimizer.zero_grad()
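
Accumulating the loss yields one gradient step per epoch. Since the hidden state is not carried across snapshots in this model, a per-snapshot update schedule is also possible; the variant below is a hypothetical alternative, not part of the original tutorial, trading smoother gradients for more frequent updates.

# Hypothetical variant: one optimizer step per snapshot instead of per epoch.
for epoch in tqdm(range(200)):
    for snapshot in train_dataset:
        y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)
        loss = torch.mean((y_hat - snapshot.y) ** 2)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()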

Using the holdout, we will evaluate the performance of the trained recurrent graph convolutional network and calculate the mean squared error across all of the spatial units and time periods.

model.eval()
cost = 0
for time, snapshot in enumerate(test_dataset):
    y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)
    cost = cost + torch.mean((y_hat - snapshot.y) ** 2)
# Mean squared error averaged over all test snapshots.
cost = cost / (time + 1)
cost = cost.item()
print("MSE: {:.4f}".format(cost))
>>> MSE: 0.6866
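
The GPU support mentioned earlier amounts to standard PyTorch device handling. Here is a minimal sketch, not from the original tutorial, assuming the iterator yields standard PyTorch Geometric Data objects (which implement .to()):

# Minimal GPU sketch -- assumes snapshots behave like PyTorch Geometric Data objects.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
for snapshot in train_dataset:
    snapshot = snapshot.to(device)  # moves x, edge_index, edge_attr and y
    y_hat = model(snapshot.x, snapshot.edge_index, snapshot.edge_attr)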

Previously published at https://pytorch-geometric-temporal.readthedocs.io/en/latest/notes/introduction.html