Exploring the Impact of Riemannian Metrics in Human Action Recognition Tasks Using GyroSpd++

by Hyperbole, December 3rd, 2024

Too Long; Didn't Read

We evaluate GyroSpd++ in human action recognition using three datasets (HDM05, FPHA, NTU60), reporting on performance, convolutional layer design, and optimization. Ablation studies and comparisons with state-of-the-art methods highlight its advantages and challenges.

Table of Links

Abstract and 1. Introduction

2. Preliminaries

3. Proposed Approach

3.1 Notation

3.2 Neural Networks on SPD Manifolds

3.3 MLR in Structure Spaces

3.4 Neural Networks on Grassmann Manifolds

4. Experiments

5. Conclusion and References

A. Notations

B. MLR in Structure Spaces

C. Formulation of MLR from the Perspective of Distances to Hyperplanes

D. Human Action Recognition

E. Node Classification

F. Limitations of our work

G. Some Related Definitions

H. Computation of Canonical Representation

I. Proof of Proposition 3.2

J. Proof of Proposition 3.4

K. Proof of Proposition 3.5

L. Proof of Proposition 3.6

M. Proof of Proposition 3.11

N. Proof of Proposition 3.12

C FORMULATION OF MLR FROM THE PERSPECTIVE OF DISTANCES TO HYPERPLANES

D HUMAN ACTION RECOGNITION

D.1 DATASETS

HDM05 (Muller et al., 2007). It has 2337 sequences of 3D skeleton data classified into 130 classes. Each frame contains the 3D coordinates of 31 body joints. We use all the action classes and follow the experimental protocol in Harandi et al. (2018), in which 2 subjects are used for training and the remaining 3 subjects are used for testing.

FPHA (Garcia-Hernando et al., 2018). It has 1175 sequences of 3D skeleton data classified into 45 classes. Each frame contains the 3D coordinates of 21 hand joints. We follow the experimental protocol in Garcia-Hernando et al. (2018), in which 600 sequences are used for training and 575 sequences are used for testing.

NTU60 (Shahroudy et al., 2016). It has 56880 sequences of 3D skeleton data classified into 60 classes. Each frame contains the 3D coordinates of 25 or 50 body joints. We use the mutual actions and follow the cross-subject experimental protocol in Shahroudy et al. (2016), in which data from 20 subjects are used for training and data from the other 20 subjects are used for testing.

D.2 IMPLEMENTATION DETAILS

D.2.1 SETUP

D.2.2 INPUT DATA

For SPDNet and SPDNetBN, each sequence is represented by a covariance matrix (Huang & Gool, 2017; Brooks et al., 2019). The sizes of the covariance matrices are 93×93, 60×60, and 150×150 for the HDM05, FPHA, and NTU60 datasets, respectively. For SPDNet, we use the same architecture as in Huang & Gool (2017), with three Bimap layers. For SPDNetBN, we use the same architecture as in Brooks et al. (2019), also with three Bimap layers. The sizes of the transformation matrices for the experiments on the HDM05, FPHA, and NTU60 datasets are set to 93×93, 60×60, and 150×150, respectively.
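As an illustration of the covariance representation described above, here is a minimal sketch in dependency-free Python. The function name `sequence_covariance`, the unbiased 1/(T-1) normalization, and the small diagonal ridge `eps` are assumptions of this sketch, not details taken from the paper, and the toy data are random numbers rather than real skeleton sequences. With 31 joints and 3 coordinates per joint, each frame has 93 values, which gives the 93×93 size quoted for HDM05.

```python
import random


def sequence_covariance(frames, eps=1e-6):
    """Covariance matrix of one skeleton sequence.

    frames: list of T frames, each a flat list of d = 3 * num_joints values.
    eps:    ridge added to the diagonal (an assumption of this sketch) so the
            result stays strictly positive definite even when T <= d.
    """
    t, d = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / t for i in range(d)]
    centered = [[f[i] - mean[i] for i in range(d)] for f in frames]

    cov = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            # Unbiased estimate; fill both triangles to keep exact symmetry.
            s = sum(row[i] * row[j] for row in centered) / (t - 1)
            cov[i][j] = cov[j][i] = s
    for i in range(d):
        cov[i][i] += eps
    return cov


# Toy HDM05-like sequence: 40 frames, 31 joints, random 3D coordinates.
random.seed(0)
num_joints = 31
frames = [[random.gauss(0.0, 1.0) for _ in range(3 * num_joints)]
          for _ in range(40)]
cov = sequence_covariance(frames)
print(len(cov), len(cov[0]))  # prints: 93 93
```

In practice a numerical library routine such as `numpy.cov` would replace the explicit loops; they are written out here only to keep the sketch self-contained.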

D.2.3 CONVOLUTIONAL LAYERS

D.2.4 OPTIMIZATION

For parameters that are SPD matrices, we model them on the space of symmetric matrices, and then apply the exponential map at the identity.

Thus, we can optimize all parameters on Euclidean spaces without having to resort to techniques developed on Riemannian manifolds.
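The parameterization above can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the exponential map at the identity is the matrix exponential (which is the case under the Affine-Invariant and Log-Euclidean metrics), and the helper names `symmetrize`, `expm`, and `spd_from_unconstrained` are hypothetical. The matrix exponential is computed with a plain scaling-and-squaring Taylor series to avoid dependencies; a library routine such as `scipy.linalg.expm` would normally be used instead.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def symmetrize(m):
    """Map an unconstrained square matrix to the space of symmetric matrices."""
    n = len(m)
    return [[0.5 * (m[i][j] + m[j][i]) for j in range(n)] for i in range(n)]


def expm(s, terms=20):
    """Matrix exponential via scaling and squaring with a Taylor series."""
    n = len(s)
    norm = max(sum(abs(x) for x in row) for row in s)
    squarings = 0
    while norm > 0.5:          # scale down until the series converges quickly
        norm /= 2.0
        squarings += 1
    a = [[x / 2.0 ** squarings for x in row] for row in s]

    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):  # undo the scaling: exp(S) = exp(S / 2^s)^(2^s)
        result = matmul(result, result)
    return result


def spd_from_unconstrained(raw):
    """Unconstrained (Euclidean) parameter -> SPD matrix."""
    return expm(symmetrize(raw))


# Any real-valued `raw` is a valid parameter, so an ordinary Euclidean
# optimizer can update it directly; the SPD matrix is recomputed on the fly.
raw = [[0.3, -1.2], [0.8, 0.1]]
spd = spd_from_unconstrained(raw)
det = spd[0][0] * spd[1][1] - spd[0][1] * spd[1][0]
print(spd[0][0] > 0 and det > 0)  # 2x2 SPD check (Sylvester's criterion)
```

Because the exponential of a symmetric matrix is always symmetric positive definite, no projection or retraction step is needed after a gradient update on `raw`.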

D.3 TIME COMPLEXITY ANALYSIS

D.4 MORE EXPERIMENTAL RESULTS

D.4.1 ABLATION STUDY

Tab. 4 reports the mean accuracies and standard deviations of GyroSpd++ with respect to different settings of β on the three datasets. GyroSpd++ with the setting β = 0 generally works well on all the datasets. Setting k = 3 improves the accuracy of GyroSpd++ on the NTU60 dataset. We also observe that setting k to a high value, e.g., k = 10, lowers the accuracies of GyroSpd++ on the datasets.

Output dimension of convolutional layers. Tab. 5 presents results and computation times of GyroSpd++ with respect to different settings of the output dimension m of the convolutional layer on the FPHA dataset. Results show that the setting m = 21 clearly outperforms the setting m = 10 in terms of mean accuracy and standard deviation. However, compared to the setting m = 21, the setting m = 30 only increases the training and testing times without improving the mean accuracy of GyroSpd++.

Design of Riemannian metrics for network blocks. Using different Riemannian metrics for the convolutional and MLR layers of GyroSpd++ results in different variants of the same architecture. Results of some of these variants on the FPHA dataset are shown in Tab. 6. Our architecture gives the best mean accuracy, while the architecture with Log-Cholesky geometry for the MLR layer gives the worst.
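For readers unfamiliar with the Log-Cholesky geometry mentioned above, the sketch below computes the standard Log-Cholesky distance between two SPD matrices: with Cholesky factors L and K, it combines the Frobenius distance between their strictly lower-triangular parts with the distance between the logarithms of their diagonals. This is only the textbook distance, shown to make the choice of metric concrete. It is not the MLR layer of GyroSpd++, and the function names are hypothetical.

```python
import math


def cholesky(p):
    """Lower-triangular Cholesky factor L of an SPD matrix p, with p = L L^T."""
    n = len(p)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(p[i][i] - s)
            else:
                l[i][j] = (p[i][j] - s) / l[j][j]
    return l


def log_cholesky_distance(p, q):
    """Log-Cholesky distance between two SPD matrices of the same size."""
    l, k = cholesky(p), cholesky(q)
    n = len(p)
    # Strictly lower-triangular parts are compared in the Euclidean sense.
    off = sum((l[i][j] - k[i][j]) ** 2 for i in range(n) for j in range(i))
    # Diagonal entries are positive and are compared through their logarithms.
    diag = sum((math.log(l[i][i]) - math.log(k[i][i])) ** 2 for i in range(n))
    return math.sqrt(off + diag)


eye = [[1.0, 0.0], [0.0, 1.0]]
p = [[4.0, 2.0], [2.0, 3.0]]
print(log_cholesky_distance(p, p))    # prints: 0.0
print(log_cholesky_distance(eye, p))
```

Other SPD metrics replace this distance with a different formula, which is why swapping the metric of a single block produces a different variant of the same architecture.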

D.4.2 COMPARISON OF GYROSPD++ AGAINST STATE-OF-THE-ART METHODS

Finally, we present a comparison of computation times of SPD neural networks in Tab. 10.

Authors:

(1) Xuan Son Nguyen, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France;

(2) Shuo Yang, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France;

(3) Aymeric Histace, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France.

This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.

