Generating Consistent Full-Body Avatars: Stratified Motion Diffusion for Decoupled Kinematics

Table of Links

Related Work

2.1. Motion Reconstruction from Sparse Input

2.2. Human Motion Generation

SAGE: Stratified Avatar Generation and 3.1. Problem Statement and Notation

3.2. Disentangled Motion Representation

3.3. Stratified Motion Diffusion

3.4. Implementation Details

Experiments and Evaluation Metrics

4.1. Dataset and Evaluation Metrics

4.2. Quantitative and Qualitative Results

4.3. Ablation Study

Conclusion and References

Supplementary Material

3.3. Stratified Motion Diffusion

After encoding human motions as latents, we aim to sample from the latent space so that the reconstructed full-body motion matches the sparse observations.

Although disentangling full-body motion into upper and lower parts makes motion representation learning more effective and efficient, it is crucial to account for the correlation between the two body parts during generation. Otherwise, the reconstructed full-body motions would be severely inconsistent. To this end, we propose Stratified Motion Diffusion, which samples the upper-body and lower-body latents in a cascaded manner while explicitly considering this correlation.
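The cascaded sampling described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `toy_denoiser`, `sample_latent` and `stratified_sample` are hypothetical, the denoiser is a fixed stand-in rather than a learned network, the update rule is a simplified interpolation rather than a real diffusion step, and the ordering (upper body first, then lower body conditioned on it) is inferred from the order in which the text names the two parts. The only point it shows is the data flow: the lower-body sampler receives the sampled upper-body latent as extra conditioning.

```python
import random


def toy_denoiser(z_t, t, cond):
    """Hypothetical stand-in for a learned denoiser.

    It predicts the clean latent as the mean of the conditioning vector and
    ignores z_t and t. A real model would be a trained network.
    """
    target = sum(cond) / len(cond)
    return [target for _ in z_t]


def sample_latent(denoiser, cond, dim, steps, rng):
    """Simplified reverse loop (an assumption, not a real DDPM update).

    Starts from Gaussian noise and interpolates toward the denoiser's
    prediction of the clean latent; at the final step the sample equals
    that prediction.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in range(steps, 0, -1):
        z0_hat = denoiser(z, t, cond)
        w = 1.0 / t  # interpolation weight; reaches 1.0 when t == 1
        z = [(1.0 - w) * zi + w * z0i for zi, z0i in zip(z, z0_hat)]
    return z


def stratified_sample(sparse_obs, dim=4, steps=10, seed=0):
    """Cascaded sampling: upper-body latent first, then lower-body latent."""
    rng = random.Random(seed)
    # Stage 1: upper-body latent conditioned on the sparse observations only.
    z_upper = sample_latent(toy_denoiser, sparse_obs, dim, steps, rng)
    # Stage 2: lower-body latent conditioned on the sparse observations
    # AND the upper-body latent, so the two parts stay correlated.
    z_lower = sample_latent(toy_denoiser, sparse_obs + z_upper, dim, steps, rng)
    return z_upper, z_lower


if __name__ == "__main__":
    upper, lower = stratified_sample([1.0, 2.0, 3.0])
    print(len(upper), len(lower))
```

Sampling the two latents independently would drop the `z_upper` term from the second call; keeping it is what lets the second stage take the upper-body result into account.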

Authors:

(1) Han Feng, Wuhan University (equal contribution, authors ordered alphabetically);

(2) Wenchao Ma, Pennsylvania State University (equal contribution, authors ordered alphabetically);

(3) Quankai Gao, University of Southern California;

(4) Xianwei Zheng, Wuhan University;

(5) Nan Xue, Ant Group ([email protected]);

(6) Huijuan Xu, Pennsylvania State University.

This paper is available on arXiv under the CC BY 4.0 DEED license.