The Limitations of GyroSpd++ and Gr-GCN++ in Human Action Recognition and Graph Embedding Tasks

by Hyperbole, December 3rd, 2024

Too Long; Didn't Read

GyroSpd++ and Gr-GCN++ face limitations in their hybrid methods for SPD and Grassmann networks. The use of different Riemannian metrics and aggregation in tangent spaces may hinder optimal performance, highlighting the need for further refinement and unified approaches across all operations.

Abstract and 1. Introduction

  2. Preliminaries

  3. Proposed Approach

    3.1 Notation

    3.2 Neural Networks on SPD Manifolds

    3.3 MLR in Structure Spaces

    3.4 Neural Networks on Grassmann Manifolds

  4. Experiments

  5. Conclusion and References

A. Notations

B. MLR in Structure Spaces

C. Formulation of MLR from the Perspective of Distances to Hyperplanes

D. Human Action Recognition

E. Node Classification

F. Limitations of our work

G. Some Related Definitions

H. Computation of Canonical Representation

I. Proof of Proposition 3.2

J. Proof of Proposition 3.4

K. Proof of Proposition 3.5

L. Proof of Proposition 3.6

M. Proof of Proposition 3.11

N. Proof of Proposition 3.12

F LIMITATIONS OF OUR WORK

Our SPD network GyroSpd++ relies on different Riemannian metrics across its layers: the convolutional layer is based on Affine-Invariant metrics, while the MLR layer is based on Log-Euclidean metrics. Although our experimental results show that GyroSpd++ performs well on all the datasets compared to state-of-the-art methods, it is not clear whether this design is optimal for the human action recognition task. For building deep SPD architectures, it would be useful to have insights into which Riemannian metric one should use for each network block in order to obtain good performance on a target task.
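To make the difference between the two metric families concrete, the following is a minimal NumPy sketch of the standard Affine-Invariant and Log-Euclidean geodesic distances between SPD matrices. It is an illustration only, not the GyroSpd++ layers themselves; the function names and the example matrices `P` and `Q` are assumptions made for this sketch.

```python
import numpy as np

def spd_logm(P):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(P)
    return (V * np.log(w)) @ V.T

def spd_inv_sqrt(P):
    """Inverse square root P^{-1/2} of an SPD matrix."""
    w, V = np.linalg.eigh(P)
    return (V / np.sqrt(w)) @ V.T

def affine_invariant_distance(P, Q):
    """d_AI(P, Q) = || log(P^{-1/2} Q P^{-1/2}) ||_F."""
    S = spd_inv_sqrt(P)
    M = S @ Q @ S
    M = (M + M.T) / 2  # symmetrize to remove round-off asymmetry
    return np.linalg.norm(spd_logm(M), "fro")

def log_euclidean_distance(P, Q):
    """d_LE(P, Q) = || log(P) - log(Q) ||_F."""
    return np.linalg.norm(spd_logm(P) - spd_logm(Q), "fro")

# Hypothetical example matrices (they do not commute).
P = np.array([[2.0, 1.0], [1.0, 2.0]])
Q = np.array([[1.0, 0.0], [0.0, 3.0]])

d_ai = affine_invariant_distance(P, Q)
d_le = log_euclidean_distance(P, Q)
print(f"affine-invariant: {d_ai:.4f}, log-Euclidean: {d_le:.4f}")
```

The two distances coincide when `P` and `Q` commute but generally differ otherwise, so consecutive layers built on different metrics measure the same features with different geometries. This is one way to read the open question raised above.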


In our Grassmann network Gr-GCN++, the feature transformation, bias, and nonlinearity operations are performed on Grassmann manifolds, while the aggregation operation is performed in tangent spaces. Previous works (Dai et al., 2021; Chen et al., 2022) on hyperbolic neural networks (HNNs) have shown that this kind of hybrid method limits the modeling ability of networks. Therefore, it is desirable to develop GCNs in which all operations are formalized on Grassmann manifolds.
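As a rough illustration of what aggregation in a tangent space involves, the sketch below maps neighboring subspaces to the tangent space at a base point with the Grassmann logarithmic map, takes a weighted sum there, and maps the result back with the exponential map. It uses the standard orthonormal-basis formulas for these maps and is not the Gr-GCN++ implementation; the choice of base point, the weights, and all function names are assumptions made for this sketch.

```python
import numpy as np

def grassmann_log(X, Y):
    """Tangent vector at span(X) pointing toward span(Y).

    X and Y are n x p matrices with orthonormal columns; X.T @ Y is
    assumed to be invertible (all principal angles below pi/2).
    """
    XtY = X.T @ Y
    M = (Y - X @ XtY) @ np.linalg.inv(XtY)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.arctan(s)) @ Vt

def grassmann_exp(X, H):
    """Follow the geodesic from span(X) along the tangent vector H."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Y = ((X @ Vt.T) * np.cos(s)) @ Vt + (U * np.sin(s)) @ Vt
    Q, _ = np.linalg.qr(Y)  # re-orthonormalize against round-off
    return Q

def tangent_space_aggregate(base, neighbors, weights):
    """Hybrid aggregation: log map, weighted sum in the tangent space, exp map."""
    H = sum(w * grassmann_log(base, Y) for w, Y in zip(weights, neighbors))
    return grassmann_exp(base, H)

# Hypothetical data: random 2-dimensional subspaces of R^5.
rng = np.random.default_rng(0)

def random_subspace(n=5, p=2):
    return np.linalg.qr(rng.standard_normal((n, p)))[0]

X = random_subspace()
neighbors = [random_subspace() for _ in range(3)]
agg = tangent_space_aggregate(X, neighbors, [1 / 3, 1 / 3, 1 / 3])
```

The weighted sum is an ordinary Euclidean operation carried out in the tangent space at `X`, so it approximates the manifold geometry only near the base point. That is the hybrid aspect the paragraph above identifies as a limitation.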


Table 14: Results (mean accuracy ± standard deviation) of Gr-GCN++ and some state-of-the-art methods on the three datasets. The best and second best results in terms of mean accuracy are highlighted in red and blue, respectively.


Authors:

(1) Xuan Son Nguyen, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France;

(2) Shuo Yang, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France;

(3) Aymeric Histace, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France.


This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.