
Cascaded Diffusion Models for Try-On

by Backpropagation, October 6th, 2024

Too Long; Didn't Read

The virtual try-on system uses a three-stage diffusion model with Parallel-UNet at its core. The base model generates a 128x128 try-on result, augmented with noise conditioning. This result is upscaled to 256x256 and finally to 1024x1024 using an Efficient-UNet for super-resolution. Noise conditioning is applied at each stage to handle inaccuracies in human parsing and pose estimation, ensuring high-quality outputs.

Authors:

(1) Luyang Zhu, University of Washington and Google Research, and work done while the author was an intern at Google;

(2) Dawei Yang, Google Research;

(3) Tyler Zhu, Google Research;

(4) Fitsum Reda, Google Research;

(5) William Chan, Google Research;

(6) Chitwan Saharia, Google Research;

(7) Mohammad Norouzi, Google Research;

(8) Ira Kemelmacher-Shlizerman, University of Washington and Google Research.

Abstract and 1. Introduction

2. Related Work

3. Method

3.1. Cascaded Diffusion Models for Try-On

3.2. Parallel-UNet

4. Experiments

5. Summary and Future Work and References


Appendix

A. Implementation Details

B. Additional Results

3.1. Cascaded Diffusion Models for Try-On

Our cascaded diffusion models consist of one base diffusion model and two super-resolution (SR) diffusion models.
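
To make the cascade structure concrete, here is a minimal Python sketch that only traces image resolutions through the three stages. It assumes the resolutions quoted in the summary above (128x128, then 256x256, then 1024x1024). The names `Stage`, `CASCADE`, `validate_cascade`, `upscale_factors`, and `trace_resolutions` are hypothetical and chosen for illustration. No diffusion sampling is performed, and this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    """One stage of the cascade (illustrative bookkeeping only)."""
    name: str     # hypothetical label, not taken from the paper
    in_res: int   # input image side length in pixels; 0 means no image input
    out_res: int  # output image side length in pixels


# Assumed resolutions, taken from the article summary: 128 -> 256 -> 1024.
CASCADE: List[Stage] = [
    Stage("base", 0, 128),           # base diffusion model
    Stage("sr_128_to_256", 128, 256),    # first super-resolution model
    Stage("sr_256_to_1024", 256, 1024),  # second super-resolution model
]


def validate_cascade(stages: List[Stage]) -> None:
    """Check that each stage consumes exactly what the previous one produces."""
    for prev, nxt in zip(stages, stages[1:]):
        if prev.out_res != nxt.in_res:
            raise ValueError(
                f"{prev.name} outputs {prev.out_res}px but "
                f"{nxt.name} expects {nxt.in_res}px"
            )


def upscale_factors(stages: List[Stage]) -> List[int]:
    """Upscaling factor of every stage that takes an image as input."""
    return [s.out_res // s.in_res for s in stages if s.in_res > 0]


def trace_resolutions(stages: List[Stage]) -> List[int]:
    """Return the output side length after each stage, in order."""
    validate_cascade(stages)
    return [s.out_res for s in stages]


if __name__ == "__main__":
    print(trace_resolutions(CASCADE))
    print(upscale_factors(CASCADE))
```

Under these assumptions, the first SR stage upscales by a factor of 2 and the second by a factor of 4, so most of the pixel count is produced by the final stage.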



This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.