Hyper-Realistic Human Generation with Latent Structural Diffusion: Licenses

by Homology Technology FTW

November 25th, 2024

Too Long; Didn't Read

HyperHuman introduces a novel method for generating realistic human images using a Latent Structural Diffusion Model, with strong performance across various datasets. It ensures robust results across random seeds, addresses ethical concerns, and follows open dataset licenses like CC-BY and CreativeML Open RAIL++-M.


Authors:

(1) Xian Liu, Snap Inc. and CUHK (work done during an internship at Snap Inc.);

(2) Jian Ren, Snap Inc. (corresponding author: jren@snapchat.com);

(3) Aliaksandr Siarohin, Snap Inc.;

(4) Ivan Skorokhodov, Snap Inc.;

(5) Yanyu Li, Snap Inc.;

(6) Dahua Lin, CUHK;

(7) Xihui Liu, HKU;

(8) Ziwei Liu, NTU;

(9) Sergey Tulyakov, Snap Inc.

Abstract and 1 Introduction

2 Related Work

3 Our Approach and 3.1 Preliminaries and Problem Setting

3.2 Latent Structural Diffusion Model

3.3 Structure-Guided Refiner

4 Human Verse Dataset

5 Experiments

5.1 Main Results

5.2 Ablation Study

6 Discussion and References

A Appendix and A.1 Additional Quantitative Results

A.2 More Implementation Details and A.3 More Ablation Study Results

A.4 More User Study Details

A.5 Impact of Random Seed and Model Robustness and A.6 Broader Impact and Ethical Consideration

A.7 More Comparison Results and A.8 Additional Qualitative Results

A.9 Licenses

A.9 LICENSES

Image Datasets:


• LAION-5B[2] (Schuhmann et al., 2022): Creative Commons CC-BY 4.0 license.


• COYO-700M[3] (Byeon et al., 2022): Creative Commons CC-BY 4.0 license.


• MS-COCO[4] (Lin et al., 2014): Creative Commons Attribution 4.0 License.


Pretrained Models and Off-the-Shelf Annotation Tools:


• diffusers[5] (von Platen et al., 2022): Apache 2.0 License.


• CLIP[6] (Radford et al., 2021): MIT License.


• Stable Diffusion[7] (Rombach et al., 2022): CreativeML Open RAIL++-M License.


• YOLOS-Tiny[8] (Fang et al., 2021): Apache 2.0 License.


• BLIP2[9] (Guo et al., 2023): MIT License.


• MMPose[10] (Contributors, 2020): Apache 2.0 License.


• ViTPose[11] (Xu et al., 2022): Apache 2.0 License.


• Omnidata[12] (Eftekhar et al., 2021): OMNIDATA STARTER DATASET License.


• MiDaS[13] (Ranftl et al., 2022): MIT License.


• clean-fid[14] (Parmar et al., 2022): MIT License.


• SDv2-inpainting[15] (Rombach et al., 2022): CreativeML Open RAIL++-M License.


• SDXL-base-v1.0[16] (Podell et al., 2023): CreativeML Open RAIL++-M License.


• Improved Aesthetic Predictor[17]: Apache 2.0 License.


Figure 6: Additional Comparison Results.

Figure 7: Additional Comparison Results.

Figure 8: Additional Comparison Results.

Figure 9: Additional Comparison Results.

Figure 10: Additional Qualitative Results on Zero-Shot MS-COCO Validation.

Figure 11: Additional Qualitative Results on Zero-Shot MS-COCO Validation.

Figure 12: Additional Qualitative Results on Zero-Shot MS-COCO Validation.


This paper is available on arXiv under the CC BY 4.0 DEED license.


[2]https://laion.ai/blog/laion-5b/

[3]https://github.com/kakaobrain/coyo-dataset

[4]https://cocodataset.org/#home

[5]https://github.com/huggingface/diffusers

[6]https://github.com/openai/CLIP

[7]https://huggingface.co/stabilityai/stable-diffusion-2-base

[8]https://huggingface.co/hustvl/yolos-tiny

[9]https://huggingface.co/Salesforce/blip2-opt-2.7b

[10]https://github.com/open-mmlab/mmpose

[11]https://github.com/ViTAE-Transformer/ViTPose

[12]https://github.com/EPFL-VILAB/omnidata

[13]https://github.com/isl-org/MiDaS

[14]https://github.com/GaParmar/clean-fid

[15]https://huggingface.co/stabilityai/stable-diffusion-2-inpainting

[16]https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0

[17]https://github.com/christophschuhmann/improved-aesthetic-predictor
