HDM05 (Muller et al., 2007). It has 2337 sequences of 3D skeleton data classified into 130 classes. Each frame contains the 3D coordinates of 31 body joints. We use all the action classes and follow the experimental protocol in Harandi et al. (2018), in which 2 subjects are used for training and the remaining 3 subjects are used for testing.
FPHA (Garcia-Hernando et al., 2018). It has 1175 sequences of 3D skeleton data classified into 45 classes. Each frame contains the 3D coordinates of 21 hand joints. We follow the experimental protocol in Garcia-Hernando et al. (2018), in which 600 sequences are used for training and 575 sequences are used for testing.
NTU60 (Shahroudy et al., 2016). It has 56880 sequences of 3D skeleton data classified into 60 classes. Each frame contains the 3D coordinates of 25 or 50 body joints. We use the mutual actions and follow the cross-subject experimental protocol in Shahroudy et al. (2016), in which data from 20 subjects are used for training and data from the other 20 subjects are used for testing.
D.2.1 SETUP
D.2.2 INPUT DATA
For SPDNet and SPDNetBN, each sequence is represented by a covariance matrix (Huang & Gool, 2017; Brooks et al., 2019). The sizes of the covariance matrices are 93 × 93, 60 × 60, and 150 × 150 for the HDM05, FPHA, and NTU60 datasets, respectively. For SPDNet, we use the same architecture as in Huang & Gool (2017), with three Bimap layers. For SPDNetBN, we use the same architecture as in Brooks et al. (2019), also with three Bimap layers. The sizes of the transformation matrices for the experiments on the HDM05, FPHA, and NTU60 datasets are set to 93 × 93, 60 × 60, and 150 × 150, respectively.
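The covariance representation described above can be illustrated with a minimal NumPy sketch. It is not the authors' preprocessing code: the function names `sequence_covariance` and `bimap`, the random input, the small ridge term `eps`, and the use of an orthogonal weight obtained by QR decomposition are all assumptions made for illustration. The sketch only shows how a sequence with 31 joints yields a 93 × 93 SPD matrix, and how a square Bimap-style transformation W X Wᵀ keeps that size.

```python
import numpy as np

def sequence_covariance(frames, eps=1e-6):
    """Map a skeleton sequence of shape (T, J, 3) to a (3J, 3J) covariance matrix."""
    num_frames = frames.shape[0]
    x = frames.reshape(num_frames, -1)        # flatten joints: (T, 3J)
    x = x - x.mean(axis=0, keepdims=True)     # center each coordinate over time
    cov = x.T @ x / (num_frames - 1)
    # Assumption: a small ridge keeps the matrix strictly positive definite.
    return cov + eps * np.eye(cov.shape[0])

def bimap(x, w):
    """Bilinear mapping W X W^T; with a full-rank W the output stays SPD."""
    return w @ x @ w.T

rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 31, 3))        # hypothetical HDM05-like sequence
cov = sequence_covariance(frames)             # 31 joints * 3 coords = 93

# Hypothetical 93 x 93 orthogonal weight, standing in for a learned parameter.
w, _ = np.linalg.qr(rng.normal(size=(93, 93)))
out = bimap(cov, w)
```

Because `w` is orthogonal here, `out` has the same eigenvalues as `cov`; a learned weight would in general change them.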
D.2.3 CONVOLUTIONAL LAYERS
D.2.4 OPTIMIZATION
For parameters that are SPD matrices, we model them on the space of symmetric matrices, and then apply the exponential map at the identity.
Thus, we can optimize all parameters on Euclidean spaces without having to resort to techniques developed on Riemannian manifolds.
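A minimal sketch of this parameterization follows. It assumes that the exponential map at the identity reduces to the matrix exponential of a symmetric matrix, which holds for the Affine-Invariant and Log-Euclidean metrics; other metrics may require a different map. The helper names `sym` and `expm_sym` and the 4 × 4 size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sym(a):
    """Project an unconstrained square matrix onto the symmetric matrices."""
    return 0.5 * (a + a.T)

def expm_sym(s):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(s)
    # Exponentiating the eigenvalues makes them strictly positive.
    return (vecs * np.exp(vals)) @ vecs.T

rng = np.random.default_rng(0)
theta = rng.normal(size=(4, 4))   # free Euclidean parameter (hypothetical size)
p = expm_sym(sym(theta))          # SPD matrix used wherever the layer needs it
identity = expm_sym(np.zeros((4, 4)))
```

Any unconstrained optimizer can update `theta` directly; the resulting `p` is SPD by construction, so no Riemannian update or retraction is needed.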
D.4.1 ABLATION STUDY
Tab. 4 reports the mean accuracies and standard deviations of GyroSpd++ with respect to different settings of β on the three datasets. GyroSpd++ with the setting β = 0 generally works well on all the datasets. Setting β = 3 improves the accuracy of GyroSpd++ on the NTU60 dataset. We also observe that setting β to a high value, e.g., β = 10, lowers the accuracies of GyroSpd++ on the datasets.
Output dimension of convolutional layers. Tab. 5 presents the results and computation times of GyroSpd++ with respect to different settings of the output dimension m of the convolutional layer on the FPHA dataset. The setting m = 21 clearly outperforms the setting m = 10 in terms of mean accuracy and standard deviation. Compared to m = 21, however, the setting m = 30 only increases the training and testing times without improving the mean accuracy of GyroSpd++.
Design of Riemannian metrics for network blocks. Using different Riemannian metrics for the convolutional and MLR layers of GyroSpd++ results in different variants of the same architecture. Results of some of these variants on the FPHA dataset are shown in Tab. 6. Our architecture gives the best mean accuracy, while the architecture with Log-Cholesky geometry for the MLR layer gives the worst.
D.4.2 COMPARISON OF GYROSPD++ AGAINST STATE-OF-THE-ART METHODS
Finally, we present a comparison of computation times of SPD neural networks in Tab. 10.
Authors:
(1) Xuan Son Nguyen, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France;
(2) Shuo Yang, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France;
(3) Aymeric Histace, ETIS, UMR 8051, CY Cergy Paris University, ENSEA, CNRS, France.
[3] https://github.com/dalab/hyperbolic_nn.
[4] https://github.com/kenziyuliu/MS-G3D.
[5] https://github.com/zhysora/FR-Head.
[6] https://github.com/Chiaraplizz/ST-TR.