Impressive results! Alibaba open-sourced Qwen-Image yesterday, a model for generating graphic posters

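Model Introduction

We are pleased to introduce Qwen-Image, a multimodal image foundation model built on a 20B-parameter MMDiT architecture. It achieves major advances in complex text rendering and precise image editing. Experiments show strong general capabilities in both image generation and image editing tasks, with especially good results in rendering Chinese text.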

Quick Start

Make sure transformers>=4.51.3 is installed (this version supports the Qwen2.5-VL architecture).
Install the latest version of diffusers:

    pip install git+https://github.com/huggingface/diffusers
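Before loading the model, it can help to confirm that the installed transformers version meets the 4.51.3 minimum stated above. The following is a minimal sketch using only the Python standard library; the helper names (version_tuple, meets_minimum, check_package) are illustrative and not part of transformers or diffusers. The comparison looks only at the leading numeric components, so a pre-release such as 4.51.3rc1 is treated as equal to 4.51.3, which is a simplification.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Return the leading numeric components, e.g. '4.52.0.dev0' -> (4, 52, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, required: str) -> bool:
    """Compare versions numerically, component by component."""
    return version_tuple(installed) >= version_tuple(required)


def check_package(name: str, required: str) -> str:
    """Describe whether the installed package satisfies the minimum version."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name}: not installed (need >= {required})"
    if meets_minimum(installed, required):
        return f"{name} {installed}: ok"
    return f"{name} {installed}: too old, need >= {required}"


if __name__ == "__main__":
    # 4.51.3 is the minimum given in the text above.
    print(check_package("transformers", "4.51.3"))
```

Comparing integer tuples rather than raw strings avoids the pitfall where "4.9.0" sorts after "4.51.3" alphabetically.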

The following code example shows how to generate an image from a text prompt:

    from diffusers import DiffusionPipeline
    import torch

    model_name = "Qwen/Qwen-Image"

    # Initialize the generation pipeline
    if torch.cuda.is_available():
        torch_dtype = torch.bfloat16
        device = "cuda"
    else:
        torch_dtype = torch.float32
        device = "cpu"

    pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
    pipe = pipe.to(device)

    positive_magic = {
        "en": "Ultra HD, 4K, cinematic composition.",  # Prompt enhancement for English
        "zh": "超清，4K，电影级构图"  # Prompt enhancement for Chinese
    }

    # Example prompt for image generation
    prompt = '''咖啡店门口放置着黑板招牌，上面写着"Qwen咖啡 😊 每杯2美元"，旁边霓虹灯显示"通义千问"。招牌下方张贴着中国美女海报，海报底部标注"π≈3.1415926-53589793-23846264-33832795-02384197"。'''

    negative_prompt = " "  # If you have no negative prompt, keeping a single space is recommended

    # Several aspect ratios are supported
    aspect_ratios = {
        "1:1": (1328, 1328),
        "16:9": (1664, 928),
        "9:16": (928, 1664),
        "4:3": (1472, 1104),
        "3:4": (1104, 1472),
        "3:2": (1584, 1056),
        "2:3": (1056, 1584),
    }

    width, height = aspect_ratios["16:9"]

    image = pipe(
        prompt=prompt + positive_magic["zh"],
        negative_prompt=negative_prompt,
        width=width,
        height=height,
        num_inference_steps=50,
        true_cfg_scale=4.0,
        # Create the generator on the selected device so the CPU fallback also works
        generator=torch.Generator(device=device).manual_seed(42)
    ).images[0]

    image.save("示例图片.png")
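The example above picks its resolution by looking up a fixed key in aspect_ratios. When starting from an arbitrary target size, one possible approach is to choose the listed ratio closest to it. The sketch below assumes that only the seven resolutions from the example are of interest; the closest_aspect_ratio helper is hypothetical and not part of diffusers or Qwen-Image.

```python
import math

# Resolutions copied from the example above.
ASPECT_RATIOS = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1104),
    "3:4": (1104, 1472),
    "3:2": (1584, 1056),
    "2:3": (1056, 1584),
}


def closest_aspect_ratio(target_width: int, target_height: int) -> str:
    """Return the key in ASPECT_RATIOS whose width/height ratio is nearest.

    Distances are measured on a log scale so that landscape and portrait
    deviations are treated symmetrically.
    """
    if target_width <= 0 or target_height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(target_width / target_height)
    return min(
        ASPECT_RATIOS,
        key=lambda name: abs(
            math.log(ASPECT_RATIOS[name][0] / ASPECT_RATIOS[name][1]) - target
        ),
    )


if __name__ == "__main__":
    key = closest_aspect_ratio(1920, 1080)
    width, height = ASPECT_RATIOS[key]
    print(key, width, height)
```

The resulting width and height can then be passed to the pipeline call in place of the hard-coded "16:9" lookup.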