Contact-Aware Amodal Completion for Human-Object Interaction via Multi-Regional Inpainting

Chi, Seunggeun; Sachdeva, Enna; Huang, Pin-Hao; Lee, Kwonjoon

Computer Science > Computer Vision and Pattern Recognition

arXiv:2508.00427 (cs)

[Submitted on 1 Aug 2025]

Title: Contact-Aware Amodal Completion for Human-Object Interaction via Multi-Regional Inpainting

Authors: Seunggeun Chi, Enna Sachdeva, Pin-Hao Huang, Kwonjoon Lee

Abstract: Amodal completion, the process of inferring the full appearance of objects despite partial occlusion, is crucial for understanding complex human-object interactions (HOI) in computer vision and robotics. Existing methods, such as those built on pre-trained diffusion models, often struggle to generate plausible completions in dynamic scenarios because they have a limited understanding of HOI. To address this, we propose an approach that combines physical prior knowledge with a multi-regional inpainting technique designed for HOI. By incorporating physical constraints from human topology and contact information, we define two distinct regions: the primary region, where occluded object parts are most likely to lie, and the secondary region, where occlusion is less probable. Our multi-regional inpainting method applies customized denoising strategies to each of these regions within a diffusion model, which improves the accuracy and realism of the generated completions in both shape and visual detail. Our experimental results show that our approach significantly outperforms existing methods in HOI scenarios, moving machine perception closer to a human-like understanding of dynamic environments. We also show that our pipeline is robust even without ground-truth contact annotations, which broadens its applicability to tasks such as 3D reconstruction and novel view/pose synthesis.
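The region-wise idea in the abstract can be illustrated with a toy sketch. The abstract does not say how the regions are computed or which denoising strategies are used, so everything below is an assumption made for illustration: the rule that splits the occluder mask by a contact mask, the mask-blended reverse loop, the `release_t` step at which the secondary region becomes editable, and all function names (`split_regions`, `editable_mask`, `multi_region_inpaint`) are hypothetical and are not the authors' method. The `denoise_fn` argument stands in for a real diffusion model.

```python
import numpy as np

def split_regions(human_mask, contact_mask):
    """Hypothetical rule: occluder pixels flagged as contact form the primary
    region; the remaining occluder pixels form the secondary region."""
    primary = human_mask & contact_mask
    secondary = human_mask & ~contact_mask
    return primary, secondary

def editable_mask(primary, secondary, t, release_t):
    """Pixels the sampler may overwrite at step t (assumed schedule):
    primary always, secondary only once t has dropped to release_t."""
    if t <= release_t:
        return primary | secondary
    return primary.copy()

def multi_region_inpaint(image, primary, secondary, denoise_fn,
                         steps=10, release_t=3, seed=0):
    """Toy mask-blended reverse loop with a different schedule per region."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(image.shape)      # start from pure noise
    for t in range(steps, 0, -1):
        x = denoise_fn(x, t)                  # stand-in for a model step
        noise_level = (t - 1) / steps         # reaches 0 at the final step
        known = image + noise_level * rng.standard_normal(image.shape)
        edit = editable_mask(primary, secondary, t, release_t)
        # Keep the sample inside editable regions, the (noised) input elsewhere.
        x = np.where(edit, x, known)
    return x

if __name__ == "__main__":
    image = np.ones((4, 4))
    human = np.zeros((4, 4), dtype=bool)
    human[1:3, 1:3] = True                    # occluding person
    contact = np.zeros((4, 4), dtype=bool)
    contact[1, 1:3] = True                    # assumed contact pixels
    primary, secondary = split_regions(human, contact)
    out = multi_region_inpaint(image, primary, secondary,
                               lambda x, t: 0.5 * x)
    print(out.round(2))
```

In this sketch, pixels outside the occluder are pinned to the input at the final step, the primary region is generated from the first step, and the secondary region stays tied to the noised input until late in sampling. That is one simple way to treat two regions differently; the paper's actual strategies may differ.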

Comments:	ICCV 2025 (Highlight)
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2508.00427 [cs.CV]
	(or arXiv:2508.00427v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2508.00427

Submission history

From: Seunggeun Chi
[v1] Fri, 1 Aug 2025 08:33:45 UTC (36,661 KB)
