---
pipeline_tag: text-to-image
inference: false
license: other
license_name: sai-nc-community
license_link: https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md
---

# SDXL-Turbo Model Card

<!-- Provide a quick summary of what the model is/does. -->
![row01](output_tile.jpg)
SDXL-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.
A real-time demo is available here: http://clipdrop.co/stable-diffusion-turbo

Please note: For commercial use, please refer to https://stability.ai/license.

## Model Details

### Model Description
SDXL-Turbo is a distilled version of [SDXL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), trained for real-time synthesis.
SDXL-Turbo is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the [technical report](https://stability.ai/research/adversarial-diffusion-distillation)), which allows sampling large-scale foundational
image diffusion models in 1 to 4 steps at high image quality.
This approach uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal and combines this with an
adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.

- **Developed by:** Stability AI
- **Funded by:** Stability AI
- **Model type:** Generative text-to-image model
- **Finetuned from model:** [SDXL 1.0 Base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)

### Model Sources

For research purposes, we recommend our `generative-models` GitHub repository (https://github.com/Stability-AI/generative-models),
which implements the most popular diffusion frameworks (both training and inference).

- **Repository:** https://github.com/Stability-AI/generative-models
- **Paper:** https://stability.ai/research/adversarial-diffusion-distillation
- **Demo:** http://clipdrop.co/stable-diffusion-turbo

## Evaluation
![comparison1](image_quality_one_step.png)
![comparison2](prompt_alignment_one_step.png)
The charts above evaluate user preference for SDXL-Turbo over other single- and multi-step models.
SDXL-Turbo evaluated at a single step is preferred by human voters in terms of image quality and prompt following over LCM-XL evaluated at four (or fewer) steps.
In addition, using four steps for SDXL-Turbo further improves performance.
For details on the user study, see the [research paper](https://stability.ai/research/adversarial-diffusion-distillation).

## Uses

### Direct Use

The model is intended for both non-commercial and commercial usage. You can use this model for non-commercial or research purposes under this [license](https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md). Possible research areas and tasks include:

- Research on generative models.
- Research on real-time applications of generative models.
- Research on the impact of real-time generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.

For commercial use, please refer to https://stability.ai/membership.

Excluded uses are described below.

### Diffusers

```
pip install diffusers transformers accelerate --upgrade
```
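
After installing, it can help to confirm which versions ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical, and the package list is simply the three names from the `pip` command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the pip command above.
REQUIRED_PACKAGES = ("diffusers", "transformers", "accelerate")


def installed_versions(packages=REQUIRED_PACKAGES):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

A `None` entry means the package was not found in the active environment, which usually indicates that `pip` installed into a different interpreter.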

- **Text-to-image**:

SDXL-Turbo does not make use of `guidance_scale` or `negative_prompt`, so we disable guidance by setting `guidance_scale=0.0`.
The model works best at an image size of 512x512, but larger image sizes work as well.
A **single step** is enough to generate high-quality images.

```py
from diffusers import AutoPipelineForText2Image
import torch

# Load the fp16 weights and move the pipeline to the GPU.
pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
pipe.to("cuda")

prompt = "A cinematic shot of a baby raccoon wearing an intricate Italian priest robe."

# One sampling step, with guidance disabled.
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
```
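
The model description mentions sampling in 1 to 4 steps, and the evaluation notes that four steps improve results further. The sketch below shows one way to vary the step count while keeping guidance disabled. The helper names `turbo_call_kwargs` and `generate` are hypothetical, the divisible-by-8 check on `height` and `width` is an assumption about the pipeline's input validation, and the fixed seed is only there to make runs with different step counts easier to compare.

```python
def turbo_call_kwargs(num_inference_steps=1, height=512, width=512):
    """Build keyword arguments for an SDXL-Turbo text-to-image call.

    Hypothetical helper: the 1-4 step range and the 512x512 default come from
    this model card; the divisible-by-8 check is an assumption.
    """
    if not 1 <= num_inference_steps <= 4:
        raise ValueError("SDXL-Turbo is described as sampling in 1 to 4 steps")
    if height % 8 or width % 8:
        raise ValueError("height and width must be divisible by 8")
    return {
        "num_inference_steps": num_inference_steps,
        "guidance_scale": 0.0,  # guidance is not used by SDXL-Turbo
        "height": height,
        "width": width,
    }


def generate(prompt, num_inference_steps=4, seed=0):
    """Generate one image; needs a CUDA GPU and downloads the model weights."""
    # Imported lazily so the helper above stays usable without a GPU stack.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    )
    pipe.to("cuda")
    # A seeded generator makes runs with different step counts comparable.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    return pipe(
        prompt=prompt, generator=generator, **turbo_call_kwargs(num_inference_steps)
    ).images[0]
```

Calling `generate(prompt, num_inference_steps=1)` and `generate(prompt, num_inference_steps=4)` with the same seed gives a rough side-by-side of the speed and quality trade-off.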

- **Image-to-image**:

When using SDXL-Turbo for image-to-image generation, make sure that `num_inference_steps` * `strength` is greater than or equal
to 1. The image-to-image pipeline will run for `int(num_inference_steps * strength)` steps, *e.g.* 2 * 0.5 = 1 step in our example
below.

```py
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image
import torch

# Load the fp16 weights and move the pipeline to the GPU.
pipe = AutoPipelineForImage2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
pipe.to("cuda")

# Resize the input to the 512x512 resolution the model prefers.
init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((512, 512))

prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"

# int(2 * 0.5) = 1 denoising step, with guidance disabled.
image = pipe(prompt, image=init_image, num_inference_steps=2, strength=0.5, guidance_scale=0.0).images[0]
```
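
The step rule above is easy to get wrong for other `strength` values, so a small calculation can help. This sketch only reproduces the `int(num_inference_steps * strength)` formula stated above; the helper names are hypothetical, and the accepted range for `strength` (greater than 0, at most 1) is an assumption.

```python
def effective_steps(num_inference_steps, strength):
    """Denoising steps the image-to-image pipeline runs, per the rule above."""
    return int(num_inference_steps * strength)


def min_inference_steps(strength):
    """Smallest num_inference_steps that still yields at least one step."""
    # Assumed valid range for strength; values outside it are rejected here.
    if not 0.0 < strength <= 1.0:
        raise ValueError("strength must be greater than 0 and at most 1")
    steps = 1
    while effective_steps(steps, strength) < 1:
        steps += 1
    return steps


if __name__ == "__main__":
    for strength in (1.0, 0.5, 0.25):
        n = min_inference_steps(strength)
        print(f"strength={strength}: use num_inference_steps >= {n}")
```

For example, `strength=0.5` needs at least 2 steps, which matches the call above, while `strength=0.25` needs at least 4.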

### Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events,
and therefore using the model to generate such content is out-of-scope for the abilities of this model.
The model should not be used in any way that violates Stability AI's [Acceptable Use Policy](https://stability.ai/use-policy).

## Limitations and Bias

### Limitations
- The generated images are of a fixed resolution (512x512 pixels), and the model does not achieve perfect photorealism.
- The model cannot render legible text.
- Faces and people in general may not be generated properly.
- The autoencoding part of the model is lossy.

### Recommendations

The model is intended for both non-commercial and commercial usage.

## How to Get Started with the Model

Check out https://github.com/Stability-AI/generative-models