Modular Enhancements for Stable Diffusion Architecture

by Synthesizing, October 3rd, 2024

Too Long; Didn't Read

This section outlines modular improvements to the Stable Diffusion architecture, applicable individually or collectively to enhance model performance. The strategies presented extend the capabilities of latent diffusion models and can also be adapted for pixel-space models.


Authors:

(1) Dustin Podell, Stability AI, Applied Research;

(2) Zion English, Stability AI, Applied Research;

(3) Kyle Lacey, Stability AI, Applied Research;

(4) Andreas Blattmann, Stability AI, Applied Research;

(5) Tim Dockhorn, Stability AI, Applied Research;

(6) Jonas Müller, Stability AI, Applied Research;

(7) Joe Penna, Stability AI, Applied Research;

(8) Robin Rombach, Stability AI, Applied Research.

Table of Links

Abstract and 1 Introduction

2 Improving Stable Diffusion

2.1 Architecture & Scale

2.2 Micro-Conditioning

2.3 Multi-Aspect Training

2.4 Improved Autoencoder and 2.5 Putting Everything Together

Appendix

D Comparison to the State of the Art

E Comparison to Midjourney v5.1

F On FID Assessment of Generative Text-Image Foundation Models

G Additional Comparison between Single- and Two-Stage SDXL pipeline

References

2 Improving Stable Diffusion

In this section, we present our improvements to the Stable Diffusion architecture. These improvements are modular and can be used individually or together to extend any model. Although the following strategies are implemented as extensions to latent diffusion models (LDMs) [38], most of them also apply to their pixel-space counterparts.

This paper is available on arXiv under the CC BY 4.0 DEED license.



