HyperHuman Ablation Study: Optimal Expert Branch Design for Improved Image Generation

by Homology Technology FTW

November 24th, 2024

Too Long; Didn't Read

The ablation study shows that the best performance comes from jointly learning image appearance, spatial relationships, and geometry. Replicating fewer layers gives better spatial alignment but leaves each branch with too few parameters, while replicating too many layers hinders feature fusion between different targets.

Academic Research Paper

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Authors:

(1) Xian Liu, Snap Inc., CUHK (work done during an internship at Snap Inc.);

(2) Jian Ren, Snap Inc. (corresponding author: jren@snapchat.com);

(3) Aliaksandr Siarohin, Snap Inc.;

(4) Ivan Skorokhodov, Snap Inc.;

(5) Yanyu Li, Snap Inc.;

(6) Dahua Lin, CUHK;

(7) Xihui Liu, HKU;

(8) Ziwei Liu, NTU;

(9) Sergey Tulyakov, Snap Inc.

Abstract and 1 Introduction

2 Related Work

3 Our Approach and 3.1 Preliminaries and Problem Setting

3.2 Latent Structural Diffusion Model

3.3 Structure-Guided Refiner

4 HumanVerse Dataset

5 Experiments

5.1 Main Results

5.2 Ablation Study

6 Discussion and References

A Appendix and A.1 Additional Quantitative Results

A.2 More Implementation Details and A.3 More Ablation Study Results

A.4 More User Study Details

A.5 Impact of Random Seed and Model Robustness and A.6 Broader Impact and Ethical Consideration

A.7 More Comparison Results and A.8 Additional Qualitative Results

A.9 Licenses

5.2 ABLATION STUDY

Simultaneous Denoise with Expert Branch. We explore whether the latent structural diffusion model helps, and how many layers to replicate in the structural expert branches, by comparing the following settings: 1) Denoise RGB, which only learns to denoise an image. 2) Denoise RGB + Depth, which also predicts depth. 3) Half DownBlock & UpBlock, where we replicate half of the first DownBlock and half of the last UpBlock, each half containing one down/up-sample ResBlock and one AttnBlock. 4) Two DownBlocks & UpBlocks, where we copy the first two DownBlocks and the last two UpBlocks. The results in Tab. 2 (top) show that jointly learning image appearance, spatial relationship, and geometry is beneficial. We also find that while fewer replicated layers give more spatially aligned results, the per-branch parameters are then insufficient to capture the distribution of each modality. In contrast, excessive replicated layers lead to less feature fusion across different targets, so the branches fail to complement each other.
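
To make the replication settings concrete, the following is a minimal sketch of how a backbone could be partitioned into per-target expert branches and a shared trunk. It is not the authors' implementation: the block names (`down_0`, `mid`, `up_3`), the number of blocks, the `BranchLayout` container, and the `build_layout` helper are all hypothetical and only illustrate the idea of copying the first and last n blocks once per denoising target. The "half block" setting needs sub-block granularity and is not modeled here.

```python
from dataclasses import dataclass

# Hypothetical block names for a UNet-style denoiser. The real backbone's
# layer names and block counts are not given in this section.
DOWN_BLOCKS = ["down_0", "down_1", "down_2", "down_3"]
MID_BLOCKS = ["mid"]
UP_BLOCKS = ["up_0", "up_1", "up_2", "up_3"]


@dataclass(frozen=True)
class BranchLayout:
    input_experts: dict   # target -> replicated blocks at the input side
    shared: list          # blocks shared by every target
    output_experts: dict  # target -> replicated blocks at the output side


def build_layout(targets, n_replicated):
    """Replicate the first n DownBlocks and the last n UpBlocks per target."""
    if not 0 <= n_replicated <= min(len(DOWN_BLOCKS), len(UP_BLOCKS)):
        raise ValueError("n_replicated is out of range for this backbone")
    split = len(UP_BLOCKS) - n_replicated
    head, tail = DOWN_BLOCKS[:n_replicated], UP_BLOCKS[split:]
    # Everything between the replicated head and tail stays shared, which is
    # where features from the different targets are fused.
    shared = DOWN_BLOCKS[n_replicated:] + MID_BLOCKS + UP_BLOCKS[:split]
    return BranchLayout(
        input_experts={t: [f"{t}/{b}" for b in head] for t in targets},
        shared=shared,
        output_experts={t: [f"{t}/{b}" for b in tail] for t in targets},
    )


if __name__ == "__main__":
    # Compare replicating one block versus two blocks per side.
    for n in (1, 2):
        layout = build_layout(["rgb", "depth"], n)
        print(f"n_replicated={n}")
        print("  shared trunk:", layout.shared)
        print("  depth experts:",
              layout.input_experts["depth"] + layout.output_experts["depth"])
```

In this toy layout, raising `n_replicated` moves blocks out of the shared trunk and into the per-target branches, which mirrors the trade-off described above: more per-branch capacity, but fewer shared layers in which the targets can exchange features.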



This paper is available on arXiv under the CC BY 4.0 DEED license.

