---
license: other
license_name: tencent-hunyuan-community
license_link: LICENSE
pipeline_tag: text-to-image
library_name: transformers
---

[中文文档](./README_zh_CN.md)

<div align="center">

<img src="./assets/logo.png" alt="HunyuanImage-3.0 Logo" width="600">

# 🎨 HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

</div>


<div align="center">
<img src="./assets/banner.png" alt="HunyuanImage-3.0 Banner" width="800">

</div>

<div align="center">
  <a href=https://hunyuan.tencent.com/image target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
  <a href=https://huggingface.co/tencent/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
  <a href=https://github.com/Tencent-Hunyuan/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
  <a href=https://arxiv.org/pdf/2509.23951 target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
  <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
  <a href=https://docs.qq.com/doc/DUVVadmhCdG9qRXBU target="_blank"><img src=https://img.shields.io/badge/📚-PromptHandBook-blue.svg?logo=book height=22px></a>
</div>


<p align="center">
    👏 Join our <a href="./assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/ehjWMqF5wY">Discord</a> |
💻 <a href="https://hunyuan.tencent.com/modelSquare/home/play?modelId=289&from=/visual">Official website: try our model!</a>&nbsp;&nbsp;
</p>


## 🔥🔥🔥 News

- **January 26, 2026**: 🚀 **[HunyuanImage-3.0-Instruct-Distil](https://huggingface.co/tencent/HunyuanImage-3.0-Instruct-Distil)** - Distilled checkpoint for efficient deployment (8 sampling steps recommended).
- **January 26, 2026**: 🎉 **[HunyuanImage-3.0-Instruct](https://huggingface.co/tencent/HunyuanImage-3.0-Instruct)** - Release of **Instruct (with reasoning)** for intelligent prompt enhancement and **Image-to-Image** generation for creative editing.
- **October 30, 2025**: 🚀 **[HunyuanImage-3.0 vLLM Acceleration](./vllm_infer/README.md)** - Significantly faster inference with vLLM support.
- **September 28, 2025**: 📖 **[HunyuanImage-3.0 Technical Report](https://arxiv.org/pdf/2509.23951)** - Comprehensive technical documentation now available.
- **September 28, 2025**: 🎉 **[HunyuanImage-3.0 Open Source](https://github.com/Tencent-Hunyuan/HunyuanImage-3.0)** - Inference code and model weights publicly available.


## 🧩 Community Contributions

If you build on or use HunyuanImage-3.0 in your projects, please let us know.

## 📑 Open-source Plan

- HunyuanImage-3.0 (Image Generation Model)
  - [x] Inference
  - [x] HunyuanImage-3.0 Checkpoints
  - [x] HunyuanImage-3.0-Instruct Checkpoints (with reasoning)
  - [x] vLLM Support
  - [x] Distilled Checkpoints
  - [x] Image-to-Image Generation
  - [ ] Multi-turn Interaction


## 🗂️ Contents

- [🔥🔥🔥 News](#-news)
- [🧩 Community Contributions](#-community-contributions)
- [📑 Open-source Plan](#-open-source-plan)
- [📖 Introduction](#-introduction)
- [✨ Key Features](#-key-features)
- [🚀 Usage](#-usage)
  - [📦 Environment Setup](#-environment-setup)
    - [📥 Install Dependencies](#-install-dependencies)
  - [HunyuanImage-3.0 (Text-to-image)](#hunyuanimage-30-text-to-image)
    - [🔥 Quick Start with Transformers](#-quick-start-with-transformers)
      - [1️⃣ Download model weights](#1-download-model-weights)
      - [2️⃣ Run with Transformers](#2-run-with-transformers)
    - [🏠 Local Installation & Usage](#-local-installation--usage)
      - [1️⃣ Clone the Repository](#1-clone-the-repository)
      - [2️⃣ Download Model Weights](#2-download-model-weights)
      - [3️⃣ Run the Demo](#3-run-the-demo)
      - [4️⃣ Command Line Arguments](#4-command-line-arguments)
    - [🎨 Interactive Gradio Demo](#-interactive-gradio-demo)
      - [1️⃣ Install Gradio](#1-install-gradio)
      - [2️⃣ Configure Environment](#2-configure-environment)
      - [3️⃣ Launch the Web Interface](#3-launch-the-web-interface)
      - [4️⃣ Access the Interface](#4-access-the-interface)
  - [HunyuanImage-3.0-Instruct](#hunyuanimage-30-instruct-instruction-reasoning-and-image-to-image-generation-including-editing-and-multi-image-fusion)
    - [🔥 Quick Start with Transformers](#-quick-start-with-transformers-1)
      - [1️⃣ Download model weights](#1-download-model-weights-1)
      - [2️⃣ Run with Transformers](#2-run-with-transformers-1)
    - [🏠 Local Installation & Usage](#-local-installation--usage-1)
      - [1️⃣ Clone the Repository](#1-clone-the-repository-1)
      - [2️⃣ Download Model Weights](#2-download-model-weights-1)
      - [3️⃣ Run the Demo](#3-run-the-demo-1)
      - [4️⃣ Command Line Arguments](#4-command-line-arguments-1)
      - [5️⃣ For fewer Sampling Steps](#5-for-fewer-sampling-steps)
- [🧱 Models Cards](#-models-cards)
- [📊 Evaluation](#-evaluation)
  - [Evaluation of HunyuanImage-3.0-Instruct](#evaluation-of-hunyuanimage-30-instruct)
  - [Evaluation of HunyuanImage-3.0 (Text-to-Image)](#evaluation-of-hunyuanimage-30-text-to-image)
- [🖼️ Showcase](#-showcase)
  - [Showcases of HunyuanImage-3.0-Instruct](#showcases-of-hunyuanimage-30-instruct)
- [📚 Citation](#-citation)
- [🙏 Acknowledgements](#-acknowledgements)
- [🌟🚀 Github Star History](#-github-star-history)


---


## 📖 Introduction

**HunyuanImage-3.0** is a groundbreaking native multimodal model that unifies multimodal understanding and generation within an autoregressive framework. Our text-to-image and image-to-image model achieves performance **comparable to or surpassing** leading closed-source models.


<div align="center">
  <img src="./assets/framework.png" alt="HunyuanImage-3.0 Framework" width="90%">
</div>


## ✨ Key Features

* 🧠 **Unified Multimodal Architecture:** Moving beyond the prevalent DiT-based architectures, HunyuanImage-3.0 employs a unified autoregressive framework. This design enables more direct and integrated modeling of the text and image modalities, leading to surprisingly effective and contextually rich image generation.

* 🏆 **The Largest Image Generation MoE Model:** This is the largest open-source image generation Mixture of Experts (MoE) model to date. It features 64 experts and a total of 80 billion parameters, with 13 billion activated per token, significantly enhancing its capacity and performance.

* 🎨 **Superior Image Generation Performance:** Through rigorous dataset curation and advanced reinforcement learning post-training, we've achieved an optimal balance between semantic accuracy and visual excellence. The model demonstrates exceptional prompt adherence while delivering photorealistic imagery with stunning aesthetic quality and fine-grained details.

* 💭 **Intelligent Image Understanding and World-Knowledge Reasoning:** The unified multimodal architecture endows HunyuanImage-3.0 with powerful reasoning capabilities. It understands the user's input image and leverages its extensive world knowledge to interpret user intent, automatically elaborating on sparse prompts with contextually appropriate details to produce superior, more complete visual outputs.


## 🚀 Usage

### 📦 Environment Setup

* 🐍 **Python:** 3.12+ (recommended and tested)
* ⚡ **CUDA:** 12.8

#### 📥 Install Dependencies

```bash
# 1. First install PyTorch (CUDA 12.8 version)
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128

# 2. Install tencentcloud-sdk for Prompt Enhancement (PE); needed only for HunyuanImage-3.0, not for HunyuanImage-3.0-Instruct
pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-sdk-python

# 3. Then install the other dependencies
pip install -r requirements.txt
```

For **up to 3x faster inference**, install these optimizations:

```bash
# FlashInfer for optimized MoE inference. v0.5.0 is tested.
pip install flashinfer-python==0.5.0
```

> 💡**Installation Tips:** It is critical that the CUDA version used by PyTorch matches the system's CUDA version.
> FlashInfer relies on this compatibility when compiling kernels at runtime.
> GCC version >=9 is recommended for compiling FlashAttention and FlashInfer.

> ⚡ **Performance Tips:** These optimizations can significantly speed up your inference!

> 💡**Note:** When FlashInfer is enabled, the first inference may be slower (about 10 minutes) due to kernel compilation. Subsequent inferences on the same machine will be much faster.

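The installation tip about matching CUDA versions can be checked before the first FlashInfer compilation. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, it assumes `nvcc` is on `PATH` and reports the system toolkit, and it compares only the major.minor part of each version. `torch.version.cuda` is the CUDA version PyTorch was built against.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def same_major_minor(a: str, b: str) -> bool:
    """Compare two CUDA version strings on their major.minor part only."""
    return a.split(".")[:2] == b.split(".")[:2]


def check_cuda_versions() -> None:
    """Print the PyTorch and system CUDA versions and whether they agree."""
    import torch  # imported lazily so the helpers above work without PyTorch

    torch_cuda = torch.version.cuda  # None on CPU-only builds
    nvcc = shutil.which("nvcc")
    if torch_cuda is None or nvcc is None:
        print(f"Cannot compare: torch CUDA={torch_cuda}, nvcc found={nvcc is not None}")
        return
    output = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    system_cuda = parse_nvcc_release(output)
    ok = system_cuda is not None and same_major_minor(torch_cuda, system_cuda)
    print(f"torch CUDA={torch_cuda}, system CUDA={system_cuda}, match={ok}")
```

Call `check_cuda_versions()` in the environment where the model will run; a mismatch is a hint to reinstall PyTorch from the matching wheel index rather than proof that compilation will fail.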

### HunyuanImage-3.0 (Text-to-image)


#### 🔥 Quick Start with Transformers

##### 1️⃣ Download model weights

```bash
# Download from HuggingFace into a renamed local directory.
# Note: the directory name should not contain dots, which may cause issues when loading with Transformers.
hf download tencent/HunyuanImage-3.0 --local-dir ./HunyuanImage-3
```

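The same download can be scripted from Python. This is a hedged sketch rather than an official helper: `has_dot_in_dir_name` and `download_weights` are hypothetical names, and the dot check simply enforces the naming rule stated above before calling `huggingface_hub.snapshot_download`.

```python
from pathlib import Path


def has_dot_in_dir_name(local_dir: str) -> bool:
    """Return True if the final path component contains a dot."""
    return "." in Path(local_dir).name


def download_weights(repo_id: str, local_dir: str) -> str:
    """Download a model snapshot into a local directory whose name has no dots."""
    if has_dot_in_dir_name(local_dir):
        raise ValueError(f"Directory name must not contain dots: {local_dir!r}")
    # Imported lazily so the check above can be used without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example (downloads the full checkpoint, so it is left commented out):
# download_weights("tencent/HunyuanImage-3.0", "./HunyuanImage-3")
```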

##### 2️⃣ Run with Transformers

```python
from transformers import AutoModelForCausalLM

# Load the model
model_id = "./HunyuanImage-3"
# Currently the model cannot be loaded directly with the HF model_id
# `tencent/HunyuanImage-3.0` because of the dot in the name.

kwargs = dict(
    attn_implementation="sdpa",  # Use "flash_attention_2" if FlashAttention is installed
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
    moe_impl="eager",  # Use "flashinfer" if FlashInfer is installed
)

model = AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
model.load_tokenizer(model_id)

# Generate the image
prompt = "A brown and white dog is running on the grass"
image = model.generate_image(prompt=prompt, stream=True)
image.save("image.png")
```

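The two comments in the loading code ("use ... if ... is installed") can be turned into an automatic choice. The sketch below is an illustration under assumptions: `is_installed` and `build_load_kwargs` are hypothetical helpers, and it assumes the optional packages are importable as `flash_attn` (FlashAttention) and `flashinfer` (FlashInfer). It only checks that the modules can be found, not that they work on your GPU.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def build_load_kwargs() -> dict:
    """Build from_pretrained kwargs, preferring the optimized backends when present."""
    return dict(
        attn_implementation="flash_attention_2" if is_installed("flash_attn") else "sdpa",
        trust_remote_code=True,
        torch_dtype="auto",
        device_map="auto",
        moe_impl="flashinfer" if is_installed("flashinfer") else "eager",
    )


# Usage with the loading code above:
# model = AutoModelForCausalLM.from_pretrained("./HunyuanImage-3", **build_load_kwargs())
```

Falling back to `sdpa` and `eager` keeps the script runnable on machines without the optional packages, at the cost of slower inference.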

#### 🏠 Local Installation & Usage


##### 1️⃣ Clone the Repository

```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanImage-3.0.git
cd HunyuanImage-3.0/
```

##### 2️⃣ Download Model Weights

```bash
# Download from HuggingFace
hf download tencent/HunyuanImage-3.0 --local-dir ./HunyuanImage-3
```

##### 3️⃣ Run the Demo

The pretrained checkpoint does not automatically rewrite or enhance input prompts. For best results, we currently recommend using DeepSeek to rewrite prompts. You can apply for an API key on [Tencent Cloud](https://cloud.tencent.com/document/product/1772/115963#.E5.BF.AB.E9.80.9F.E6.8E.A5.E5.85.A5).

```bash
# Without PE
export MODEL_PATH="./HunyuanImage-3"
python3 run_image_gen.py \
    --model-id "$MODEL_PATH" \
    --verbose 1 \
    --prompt "A brown and white dog is running on the grass" \
    --bot-task image \
    --image-size "1024x1024" \
    --save ./image.png \
    --moe-impl flashinfer

# With PE
export DEEPSEEK_KEY_ID="your_deepseek_key_id"
export DEEPSEEK_KEY_SECRET="your_deepseek_key_secret"
export MODEL_PATH="./HunyuanImage-3"
python3 run_image_gen.py \
    --model-id "$MODEL_PATH" \
    --verbose 1 \
    --prompt "A brown and white dog is running on the grass" \
    --bot-task image \
    --image-size "1024x1024" \
    --save ./image.png \
    --moe-impl flashinfer \
    --rewrite 1
```


##### 4️⃣ Command Line Arguments

| Arguments            | Description                                                                                      | Recommended  |
| -------------------- | ------------------------------------------------------------------------------------------------ | ------------ |
| `--prompt`           | Input prompt                                                                                     | (Required)   |
| `--model-id`         | Model path                                                                                       | (Required)   |
| `--attn-impl`        | Attention implementation. Either `sdpa` or `flash_attention_2`.                                  | `sdpa`       |
| `--moe-impl`         | MoE implementation. Either `eager` or `flashinfer`.                                              | `flashinfer` |
| `--seed`             | Random seed for image generation                                                                 | `None`       |
| `--diff-infer-steps` | Number of diffusion inference steps                                                              | `50`         |
| `--image-size`       | Image resolution: `auto`, an explicit size such as `1280x768`, or an aspect ratio such as `16:9` | `auto`       |
| `--save`             | Image save path                                                                                  | `image.png`  |
| `--verbose`          | Verbose level. 0: no log; 1: log inference information.                                          | `0`          |
| `--rewrite`          | Whether to enable prompt rewriting                                                               | `1`          |

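The three forms accepted by `--image-size` in the table above can be told apart with a small parser. This is a hypothetical sketch for validating input before launching a run, not the parser used by `run_image_gen.py`. The README does not say which of the two numbers in `1280x768` is the width, so the sketch returns them in the order given.

```python
import re
from typing import Tuple, Union

ParsedSize = Union[Tuple[str], Tuple[str, int, int]]


def parse_image_size(spec: str) -> ParsedSize:
    """Classify an --image-size value as auto, an explicit size, or an aspect ratio."""
    spec = spec.strip().lower()
    if spec == "auto":
        return ("auto",)
    size = re.fullmatch(r"(\d+)x(\d+)", spec)
    if size:
        return ("size", int(size.group(1)), int(size.group(2)))
    ratio = re.fullmatch(r"(\d+):(\d+)", spec)
    if ratio:
        return ("ratio", int(ratio.group(1)), int(ratio.group(2)))
    raise ValueError(f"Unrecognized image size: {spec!r}")


print(parse_image_size("1280x768"))
print(parse_image_size("16:9"))
```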

#### 🎨 Interactive Gradio Demo


Launch an interactive web interface for easy text-to-image generation.

##### 1️⃣ Install Gradio

```bash
# Quote the requirement so the shell does not treat ">=" as a redirection
pip install "gradio>=4.21.0"
```

##### 2️⃣ Configure Environment

```bash
# Set your model path
export MODEL_ID="path/to/your/model"

# Optional: Configure GPU usage (default: 0,1,2,3)
export GPUS="0,1,2,3"

# Optional: Configure host and port (default: 0.0.0.0:443)
export HOST="0.0.0.0"
export PORT="443"
```

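To show how these variables and their documented defaults fit together, here is a small sketch of reading them from Python. It is an assumption about how a launcher could consume them, not the actual logic in `run_app.sh`; `AppConfig` and `load_app_config` are hypothetical names. Note that on many Linux systems binding to a port below 1024, such as 443, requires elevated privileges, so a higher `PORT` may be more convenient for local testing.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass
class AppConfig:
    model_id: str
    gpus: List[int]
    host: str
    port: int


def load_app_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Read MODEL_ID, GPUS, HOST and PORT, applying the defaults listed above."""
    model_id = env.get("MODEL_ID")
    if not model_id:
        raise ValueError("MODEL_ID must be set to the model path")
    gpus = [int(g) for g in env.get("GPUS", "0,1,2,3").split(",") if g.strip()]
    host = env.get("HOST", "0.0.0.0")
    port = int(env.get("PORT", "443"))
    return AppConfig(model_id=model_id, gpus=gpus, host=host, port=port)


print(load_app_config({"MODEL_ID": "./HunyuanImage-3", "GPUS": "0,1"}))
```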

##### 3️⃣ Launch the Web Interface


**Basic Launch:**

```bash
sh run_app.sh
```

**With Performance Optimizations:**

```bash
# Use both optimizations for maximum performance
sh run_app.sh --moe-impl flashinfer --attn-impl flash_attention_2
```

##### 4️⃣ Access the Interface

> 🌐 **Web Interface:** Open your browser and navigate to `http://localhost:443` (or your configured port)


<details>

<summary> Latest Version (Image-to-image & Text-image-to-image) </summary>

### HunyuanImage-3.0-Instruct (Instruction reasoning and Image-to-image generation, including editing and multi-image fusion)

#### 🔥 Quick Start with Transformers

##### 1️⃣ Download model weights

```bash
# Download from HuggingFace into a renamed local directory.
# Note: the directory name should not contain dots, which may cause issues when loading with Transformers.
hf download tencent/HunyuanImage-3.0-Instruct --local-dir ./HunyuanImage-3-Instruct
```


##### 2️⃣ Run with Transformers

```python
from transformers import AutoModelForCausalLM

# Load the model
model_id = "./HunyuanImage-3-Instruct"
# Currently the model cannot be loaded directly with the HF model_id
# `tencent/HunyuanImage-3.0-Instruct` because of the dot in the name.

kwargs = dict(
    attn_implementation="sdpa",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
    moe_impl="eager",  # Use "flashinfer" if FlashInfer is installed
    moe_drop_tokens=True,
)

model = AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
model.load_tokenizer(model_id)

# Image-to-Image generation (TI2I)
# The prompt is in Chinese. In English: "Based on the logo in image 1 and the
# material of the fridge magnet in image 2, make a new fridge magnet."
prompt = "基于图一的logo，参考图二中冰箱贴的材质，制作一个新的冰箱贴"

input_img1 = "./assets/demo_instruct_imgs/input_1_0.png"
input_img2 = "./assets/demo_instruct_imgs/input_1_1.png"
imgs_input = [input_img1, input_img2]

cot_text, samples = model.generate_image(
    prompt=prompt,
    image=imgs_input,
    seed=42,
    image_size="auto",
    use_system_prompt="en_unified",
    bot_task="think_recaption",  # Use "think_recaption" for reasoning and enhancement
    infer_align_image_size=True,  # Align output image size to input image size
    diff_infer_steps=50,
    verbose=2
)

# Save the generated image
samples[0].save("image_edit.png")
```

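The call above returns the reasoning text together with a list of images, but only the first image is saved. The sketch below keeps everything from one run. It is illustrative only: `save_outputs` is a hypothetical helper, and it assumes `cot_text` is a string and each entry of `samples` exposes a PIL-style `.save(path)` method, as `samples[0].save(...)` in the example suggests.

```python
from pathlib import Path
from typing import Iterable, List


def save_outputs(cot_text: str, samples: Iterable, out_dir: str = "outputs") -> List[Path]:
    """Write the reasoning text and every generated image into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Keep the chain-of-thought next to the images so a run can be inspected later.
    (out / "cot.txt").write_text(cot_text or "", encoding="utf-8")
    saved = []
    for index, image in enumerate(samples):
        path = out / f"sample_{index}.png"
        image.save(path)
        saved.append(path)
    return saved


# Usage after the generation call above:
# save_outputs(cot_text, samples, out_dir="outputs/fridge_magnet")
```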

#### 🏠 Local Installation & Usage


##### 1️⃣ Clone the Repository

```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanImage-3.0.git
cd HunyuanImage-3.0/
```

##### 2️⃣ Download Model Weights

```bash
# Download from HuggingFace
hf download tencent/HunyuanImage-3.0-Instruct --local-dir ./HunyuanImage-3-Instruct
```

##### 3️⃣ Run the Demo

More demos are available in `run_demo_instruct.sh`.

```bash
export MODEL_PATH="./HunyuanImage-3-Instruct"
bash run_demo_instruct.sh
```


##### 4️⃣ Command Line Arguments

| Arguments                  | Description                                                                                                                              | Recommended       |
| -------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | ----------------- |
| `--prompt`                 | Input prompt                                                                                                                             | (Required)        |
| `--image`                  | Input image(s). For multiple images, use comma-separated paths (e.g., 'img1.png,img2.png')                                               | (Required)        |
| `--model-id`               | Model path                                                                                                                               | (Required)        |
| `--attn-impl`              | Attention implementation. Currently only `sdpa` is supported                                                                             | `sdpa`            |
| `--moe-impl`               | MoE implementation. Either `eager` or `flashinfer`                                                                                       | `flashinfer`      |
| `--seed`                   | Random seed for image generation. Use `None` for a random seed                                                                           | `None`            |
| `--diff-infer-steps`       | Number of inference steps                                                                                                                | `50`              |
| `--image-size`             | Image resolution: `auto`, an explicit size such as `1280x768`, or an aspect ratio such as `16:9`                                         | `auto`            |
| `--use-system-prompt`      | System prompt type. Options: `None`, `dynamic`, `en_vanilla`, `en_recaption`, `en_think_recaption`, `en_unified`, `custom`               | `en_unified`      |
| `--system-prompt`          | Custom system prompt. Used when `--use-system-prompt` is `custom`                                                                        | `None`            |
| `--bot-task`               | Task type. `image` for direct generation; `auto` for text; `recaption` for re-write->image; `think_recaption` for think->re-write->image | `think_recaption` |
| `--save`                   | Image save path                                                                                                                          | `image.png`       |
| `--verbose`                | Verbose level                                                                                                                            | `2`               |
| `--reproduce`              | Whether to reproduce the results                                                                                                         | `True`            |
| `--infer-align-image-size` | Whether to align the target image size to the source image size                                                                          | `True`            |
| `--max_new_tokens`         | Maximum number of new tokens to generate                                                                                                 | `2048`            |
| `--use-taylor-cache`       | Use Taylor Cache when sampling                                                                                                           | `False`           |

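The `--image` value in the table above is a comma-separated list of paths. The following sketch shows one way to split and validate such a value before launching a run. It is hypothetical and not the repository's parser: the limit of three images is taken from the "up to 3 inputs" note in the Multi-Image Fusion description, and whether the CLI itself enforces that limit is an assumption.

```python
from typing import List

# Assumed limit, based on the "up to 3 inputs" note for Multi-Image Fusion.
MAX_INPUT_IMAGES = 3


def parse_image_arg(value: str, max_images: int = MAX_INPUT_IMAGES) -> List[str]:
    """Split a comma-separated --image value into a list of trimmed paths."""
    paths = [part.strip() for part in value.split(",") if part.strip()]
    if not paths:
        raise ValueError("At least one image path is required")
    if len(paths) > max_images:
        raise ValueError(f"Got {len(paths)} images, expected at most {max_images}")
    return paths


print(parse_image_arg("img1.png, img2.png"))
```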

##### 5️⃣ For fewer Sampling Steps


We recommend using the model [HunyuanImage-3.0-Instruct-Distil](https://huggingface.co/tencent/HunyuanImage-3.0-Instruct-Distil) with `--diff-infer-steps 8`, while keeping all other recommended parameter values **unchanged**.

```bash
# Download HunyuanImage-3.0-Instruct-Distil from HuggingFace
hf download tencent/HunyuanImage-3.0-Instruct-Distil --local-dir ./HunyuanImage-3-Instruct-Distil

# Run the demo with 8 sampling steps
export MODEL_PATH="./HunyuanImage-3-Instruct-Distil"
bash run_demo_instruct_Distil.sh
```

</details>


## 🧱 Models Cards


## 📊 Evaluation


### Evaluation of HunyuanImage-3.0-Instruct

* 👥 **GSB (Human Evaluation)**
We adopted the GSB (Good/Same/Bad) evaluation method, which is commonly used to assess the relative performance of two models from an overall image perception perspective. In total, we used 1,000+ single- and multi-image editing cases, generating an equal number of image samples for all compared models in a single run. For a fair comparison, we ran inference only once for each prompt, avoiding any cherry-picking of results. When comparing with the baseline methods, we kept the default settings for all selected models. The evaluation was performed by more than 100 professional evaluators.

<p align="center">
  <img src="./assets/gsb_instruct.png" width=60% alt="Human Evaluation with Other Models">
</p>


### Evaluation of HunyuanImage-3.0 (Text-to-Image)

* 🤖 **SSAE (Machine Evaluation)**
SSAE (Structured Semantic Alignment Evaluation) is an evaluation metric for image-text alignment based on advanced multimodal large language models (MLLMs). We extracted 3500 key points across 12 categories, then used MLLMs to automatically score the generated images by comparing their visual content against these key points. Mean Image Accuracy is the image-wise average score across all key points, while Global Accuracy is the average score computed directly across all key points.

<p align="center">
  <img src="./assets/ssae_side_by_side_comparison.png" width=98% alt="SSAE Comparison with Other Models">
</p>

<p align="center">
  <img src="./assets/ssae_side_by_side_heatmap.png" width=98% alt="SSAE Heatmap Comparison with Other Models">
</p>


* 👥 **GSB (Human Evaluation)**

We adopted the GSB (Good/Same/Bad) evaluation method, which is commonly used to assess the relative performance of two models from an overall image perception perspective. In total, we used 1,000 text prompts, generating an equal number of image samples for all compared models in a single run. For a fair comparison, we ran inference only once for each prompt, avoiding any cherry-picking of results. When comparing with the baseline methods, we kept the default settings for all selected models. The evaluation was performed by more than 100 professional evaluators.

<p align="center">
  <img src="./assets/gsb.png" width=98% alt="Human Evaluation with Other Models">
</p>

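The difference between the two SSAE aggregates described in this section is easiest to see on a toy example. The sketch below is an illustration under assumptions, not the evaluation code: it assumes each image has a list of key-point scores between 0 and 1, and the function names are hypothetical. Mean Image Accuracy averages within each image first, so every image counts equally; Global Accuracy pools all key points, so images with more key points weigh more.

```python
from typing import Sequence


def mean_image_accuracy(scores_per_image: Sequence[Sequence[float]]) -> float:
    """Average each image's key-point scores, then average across images."""
    per_image = [sum(scores) / len(scores) for scores in scores_per_image if scores]
    return sum(per_image) / len(per_image)


def global_accuracy(scores_per_image: Sequence[Sequence[float]]) -> float:
    """Average over all key points pooled across images."""
    pooled = [score for scores in scores_per_image for score in scores]
    return sum(pooled) / len(pooled)


# Toy data: one image with four key points (three satisfied), one with a single missed key point.
toy_scores = [[1, 1, 1, 0], [0]]
print(mean_image_accuracy(toy_scores))  # (0.75 + 0.0) / 2 = 0.375
print(global_accuracy(toy_scores))      # 3 / 5 = 0.6
```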

## 🖼️ Showcase


Our model can follow complex instructions to generate high‑quality, creative images.

<div align="center">
  <img src="./assets/banner_all.jpg" width=100% alt="HunyuanImage 3.0 Demo">
</div>

For text-to-image showcases in HunyuanImage-3.0, click the following links:

- [HunyuanImage-3.0](./Hunyuan-Image3.md)


### Showcases of HunyuanImage-3.0-Instruct

HunyuanImage-3.0-Instruct demonstrates powerful capabilities in intelligent image generation and editing. The following showcases highlight its core features:

* 🧠 **Intelligent Visual Understanding and Reasoning (CoT Think)**: The model performs structured thinking to analyze the user's input image and prompt. It expands the user's intent and editing tasks into structured, comprehensive instructions, breaking down complex prompts and editing tasks into detailed visual components including subject, composition, lighting, color palette, and style. This leads to better image generation and editing performance.

* ✏️ **Prompt Self-Rewrite**: Automatically enhances sparse or vague prompts into professional-grade, detail-rich descriptions that capture the user's intent more accurately.

* 🎨 **Text-to-Image (T2I)**: Generates high-quality images from text prompts with exceptional prompt adherence and photorealistic quality.

* 🖼️ **Image-to-Image (TI2I)**: Supports creative image editing, including adding elements, removing objects, modifying styles, and seamless background replacement while preserving key visual elements.

* 🔀 **Multi-Image Fusion**: Intelligently combines multiple reference images (up to 3 inputs) to create coherent composite images that integrate visual elements from different sources.


**Showcase 1: Detailed Thought and Reasoning Process**

<div align="center">
  <img src="./assets/pg_instruct_imgs/cot_ti2i.gif" alt="HunyuanImage-3.0-Instruct Showcase 1" width="90%">
</div>

**Showcase 2: Creative T2I Generation with Complex Scene Understanding**

> Prompt: 3D 毛绒质感拟人化马，暖棕浅棕肌理，穿藏蓝西装、白衬衫，戴深棕手套；疲惫带期待，坐于电脑前，旁置印 "HAPPY AGAIN" 的马克杯。橙红渐变背景，配超大号藏蓝粗体 "马上下班"，叠加米黄 "Happy New Year" 并标 "(2026)"。橙红为主，藏蓝米黄撞色，毛绒温暖柔和。
>
> English translation: A 3D plush-textured anthropomorphic horse with warm-brown and light-brown fur texture, wearing a navy suit and white shirt, with dark-brown gloves; tired yet hopeful, seated at a computer, next to a mug printed with "HAPPY AGAIN". Orange-red gradient background with oversized bold navy text "马上下班" ("off work soon"), overlaid with cream-yellow "Happy New Year" and marked "(2026)". Orange-red dominates, with contrasting navy and cream-yellow; plush, warm, and soft.

<div align="center">
  <img src="./assets/pg_instruct_imgs/image0.png" alt="HunyuanImage-3.0-Instruct Showcase 2" width="75%">
</div>

**Showcase 3: Precise Image Editing with Element Preservation**

<div align="center">
  <img src="./assets/pg_instruct_imgs/image1.png" alt="HunyuanImage-3.0-Instruct Showcase 3" width="85%">
</div>

**Showcase 4: Style Transformation with Thematic Enhancement**

<div align="center">
  <img src="./assets/pg_instruct_imgs/image2.png" alt="HunyuanImage-3.0-Instruct Showcase 4" width="85%">
</div>


**Showcase 5: Advanced Style Transfer and Product Mockup Generation**

<div align="center">
  <img src="./assets/pg_instruct_imgs/image3.png" alt="HunyuanImage-3.0-Instruct Showcase 5" width="85%">
</div>


**Showcase 6: Multi-Image Fusion and Creative Composition**

<div align="center">
  <img src="./assets/pg_instruct_imgs/image4.png" alt="HunyuanImage-3.0-Instruct Showcase 6" width="85%">
</div>


## 📚 Citation


If you find HunyuanImage-3.0 useful in your research, please cite our work:

```bibtex
@article{cao2025hunyuanimage,
  title={HunyuanImage 3.0 Technical Report},
  author={Cao, Siyu and Chen, Hangting and Chen, Peng and Cheng, Yiji and Cui, Yutao and Deng, Xinchi and Dong, Ying and Gong, Kipper and Gu, Tianpeng and Gu, Xiusen and others},
  journal={arXiv preprint arXiv:2509.23951},
  year={2025}
}
```


## 🙏 Acknowledgements


We extend our heartfelt gratitude to the following open-source projects and communities for their invaluable contributions:

* 🤗 [Transformers](https://github.com/huggingface/transformers) - State-of-the-art NLP library
* 🎨 [Diffusers](https://github.com/huggingface/diffusers) - Diffusion models library
* 🌐 [HuggingFace](https://huggingface.co/) - AI model hub and community
* ⚡ [FlashAttention](https://github.com/Dao-AILab/flash-attention) - Memory-efficient attention
* 🚀 [FlashInfer](https://github.com/flashinfer-ai/flashinfer) - Optimized inference engine


## 🌟🚀 Github Star History


[![GitHub stars](https://img.shields.io/github/stars/Tencent-Hunyuan/HunyuanImage-3.0?style=social)](https://github.com/Tencent-Hunyuan/HunyuanImage-3.0)
[![GitHub forks](https://img.shields.io/github/forks/Tencent-Hunyuan/HunyuanImage-3.0?style=social)](https://github.com/Tencent-Hunyuan/HunyuanImage-3.0)


[![Star History Chart](https://api.star-history.com/svg?repos=Tencent-Hunyuan/HunyuanImage-3.0&type=Date)](https://www.star-history.com/#Tencent-Hunyuan/HunyuanImage-3.0&Date)