DiffVC-OSD: One-Step Diffusion-based Perceptual Neural Video Compression Framework

Ma, Wenzhuo; Chen, Zhenzhong

Computer Science > Computer Vision and Pattern Recognition

arXiv:2508.07682 (cs)

[Submitted on 11 Aug 2025]

Title: DiffVC-OSD: One-Step Diffusion-based Perceptual Neural Video Compression Framework

Authors: Wenzhuo Ma, Zhenzhong Chen

Abstract: In this work, we first propose DiffVC-OSD, a One-Step Diffusion-based Perceptual Neural Video Compression framework. Unlike conventional multi-step diffusion-based methods, DiffVC-OSD feeds the reconstructed latent representation directly into a One-Step Diffusion Model, enhancing perceptual quality through a single diffusion step guided by both the temporal context and the latent itself. To better leverage temporal dependencies, we design a Temporal Context Adapter that encodes conditional inputs into multi-level features, offering more fine-grained guidance for the Denoising Unet. Additionally, we employ an End-to-End Finetuning strategy to improve overall compression performance. Extensive experiments demonstrate that DiffVC-OSD achieves state-of-the-art perceptual compression performance and offers about 20$\times$ faster decoding and an 86.92\% bitrate reduction compared to the corresponding multi-step diffusion-based variant.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2508.07682 [cs.CV]
	(or arXiv:2508.07682v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2508.07682

Submission history

From: Wenzhuo Ma
[v1] Mon, 11 Aug 2025 06:59:23 UTC (535 KB)
