MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Computing Research Repository (CoRR), 2024

Google

Abstract
The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed. In this paper, we propose MobileDiffusion, a highly efficient text-to-image diffusion model obtained through extensive optimizations in both architecture and sampling techniques. We conduct a comprehensive examination of model architecture design to reduce redundancy, enhance computational efficiency, and minimize the model's parameter count, while preserving image generation quality. Additionally, we employ distillation and diffusion-GAN finetuning techniques on MobileDiffusion to achieve 8-step and 1-step inference, respectively. Empirical studies, conducted both quantitatively and qualitatively, demonstrate the effectiveness of our proposed techniques. MobileDiffusion achieves a remarkable sub-second inference speed for generating a $512\times512$ image on mobile devices, establishing a new state of the art.
Key words
Texture Synthesis, Transfer Learning, Image Inpainting, Image Synthesis

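Key points: This paper proposes MobileDiffusion, a diffusion model optimized in both architecture and sampling techniques to enable instant text-to-image generation on mobile devices. It achieves sub-second generation speed, setting a new state of the art.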

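Method: The authors comprehensively examine the model architecture to reduce redundancy, improve computational efficiency, and minimize the parameter count while preserving image generation quality. They also apply distillation and diffusion-GAN finetuning techniques.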

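Experiments: Through qualitative and quantitative empirical studies (the datasets used are not specified), MobileDiffusion achieves a remarkable sub-second inference speed for generating a $512\times512$ image on mobile devices.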