Revolutionizing Virtual Try-On: Key Findings and Future Directions

by Backpropagation, October 6th, 2024
The presented method for virtual try-on significantly outperforms state-of-the-art techniques in warping to new body shapes and preserving garment details, thanks to the innovative Parallel-UNet architecture. Future work will explore broader applications in image editing and video synthesis.
(1) Luyang Zhu, University of Washington and Google Research (work done while the author was an intern at Google);

(2) Dawei Yang, Google Research;

(3) Tyler Zhu, Google Research;

(4) Fitsum Reda, Google Research;

(5) William Chan, Google Research;

(6) Chitwan Saharia, Google Research;

(7) Mohammad Norouzi, Google Research;

(8) Ira Kemelmacher-Shlizerman, University of Washington and Google Research.

Abstract and 1. Introduction

2. Related Work

3. Method

3.1. Cascaded Diffusion Models for Try-On

3.2. Parallel-UNet

4. Experiments

5. Summary and Future Work and References


A. Implementation Details

B. Additional Results

5. Summary and Future Work

We presented a method that synthesizes a try-on result given an image of a person and an image of a garment. Our results are overwhelmingly better than the state of the art, both in the quality of the warp to new body shapes and poses and in the preservation of garment details. Our novel architecture, Parallel-UNet, in which two UNets are trained in parallel and one UNet sends information to the other via cross-attention, produced these state-of-the-art results. Beyond this progress on the specific application of virtual try-on, we believe the architecture will be impactful for the general case of image editing, which we are excited to explore in the future. Finally, we believe the architecture could also be extended to video, which we also plan to pursue.
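
To make the cross-attention link between the two UNets more concrete, the following is a minimal sketch in plain Python of one stream attending over the other. It is not the paper's implementation. The names `receiver_feats`, `sender_feats`, `w_q`, `w_k`, and `w_v` are hypothetical, as are the feature sizes and the random weights. The sketch also assumes that the receiving stream supplies the queries and the sending stream supplies the keys and values, which is the usual cross-attention arrangement but is not stated in this section.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cross_attention(receiver, sender, w_q, w_k, w_v):
    """Let receiver tokens attend over sender tokens.

    receiver: (n_r, d) features from the stream that receives information
    sender:   (n_s, d) features from the stream that sends information
    Returns (output, weights): output is (n_r, d_v), weights is (n_r, n_s).
    """
    q = matmul(receiver, w_q)  # queries come from the receiving stream (assumption)
    k = matmul(sender, w_k)    # keys come from the sending stream (assumption)
    v = matmul(sender, w_v)    # values come from the sending stream (assumption)
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s * scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or UNet feature maps."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d = 8
receiver_feats = random_matrix(6, d, rng)  # 6 flattened tokens from one UNet
sender_feats = random_matrix(4, d, rng)    # 4 flattened tokens from the other UNet
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))

out, attn = cross_attention(receiver_feats, sender_feats, w_q, w_k, w_v)
print(len(out), len(out[0]))    # one output vector per receiver token
print(len(attn), len(attn[0]))  # one weight per (receiver, sender) token pair
```

Each row of `attn` is a probability distribution over the sender's tokens, so every output token is a weighted mixture of the sender's values. In a real model the projections would be learned and the block would sit inside the UNet layers rather than run on random inputs.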


This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.