arXiv Open Access 2025

CoD: A Diffusion Foundation Model for Image Compression

Zhaoyang Jia, Zihan Zheng, Naifu Xue, Jiahao Li, Bin Li, +4 others

Abstract

Existing diffusion codecs typically build on text-to-image diffusion foundation models such as Stable Diffusion. However, text conditioning is suboptimal from a compression perspective, which limits the potential of downstream diffusion codecs, particularly at ultra-low bitrates. To address this, we introduce CoD, the first Compression-oriented Diffusion foundation model, trained from scratch to enable end-to-end optimization of both compression and generation. CoD is not a fixed codec but a general foundation model designed for various diffusion-based codecs. It offers several advantages. High compression efficiency: replacing Stable Diffusion with CoD in downstream codecs such as DiffC achieves SOTA results, especially at ultra-low bitrates (e.g., 0.0039 bpp). Low-cost and reproducible training: roughly 300× faster training than Stable Diffusion (~20 vs. ~6,250 A100 GPU days) on entirely open image-only datasets. New insights: for example, we find that pixel-space diffusion can achieve VTM-level PSNR with high perceptual quality and can outperform GAN-based codecs using fewer parameters. We hope CoD lays the foundation for future diffusion codec research. Code is released at https://github.com/microsoft/GenCodec/tree/main/CoD.
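
To give a sense of scale for the figures quoted in the abstract, the following minimal sketch converts a bitrate in bits per pixel (bpp) into a compressed size and computes the training-cost ratio. The 1024×1024 image resolution and the helper names `compressed_size_bytes` and `speedup` are assumptions chosen for illustration; they do not come from the paper.

```python
def compressed_size_bytes(width: int, height: int, bpp: float) -> float:
    """Compressed size in bytes of a width x height image coded at `bpp` bits per pixel."""
    return width * height * bpp / 8


def speedup(baseline_gpu_days: float, new_gpu_days: float) -> float:
    """Ratio of baseline training cost to the new training cost."""
    return baseline_gpu_days / new_gpu_days


# Assumed 1024x1024 image (illustrative only) at the 0.0039 bpp quoted in the abstract.
size = compressed_size_bytes(1024, 1024, 0.0039)

# Approximate A100 GPU-day figures quoted in the abstract: ~6,250 vs. ~20.
ratio = speedup(6250, 20)

print(f"compressed size: {size:.1f} bytes")
print(f"training cost ratio: {ratio:.1f}x")
```

Under these assumptions, a one-megapixel image at 0.0039 bpp occupies roughly 511 bytes, and the quoted GPU-day figures give a ratio of 312.5, consistent with the rounded "300×" in the abstract.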

Topics & Keywords

cs.CV

Authors (9)

Zhaoyang Jia

Zihan Zheng

Naifu Xue

Jiahao Li

Bin Li

Zongyu Guo

Xiaoyi Zhang

Houqiang Li

Yan Lu

Citation Format

Jia, Z., Zheng, Z., Xue, N., Li, J., Li, B., Guo, Z. et al. (2025). CoD: A Diffusion Foundation Model for Image Compression. https://arxiv.org/abs/2511.18706

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access