3DRot: 3D Rotation Augmentation for RGB-Based 3D Tasks

Yang, Shitian; Li, Deyu; Jiang, Xiaoke; Zhang, Lei

arXiv:2508.01423v2 (cs)

[Submitted on 2 Aug 2025 (v1), last revised 5 Aug 2025 (this version, v2)]

Title: 3DRot: 3D Rotation Augmentation for RGB-Based 3D Tasks

Authors: Shitian Yang, Deyu Li, Xiaoke Jiang, Lei Zhang

Abstract: RGB-based 3D tasks, e.g., 3D detection, depth estimation, and 3D keypoint estimation, still suffer from scarce, expensive annotations and a thin augmentation toolbox, since most image transforms, including resize and rotation, disrupt geometric consistency. In this paper, we introduce 3DRot, a plug-and-play augmentation that rotates and mirrors images about the camera's optical center while synchronously updating RGB images, camera intrinsics, object poses, and 3D annotations to preserve projective geometry, achieving geometry-consistent rotations and reflections without relying on any scene depth. We validate 3DRot on a classical 3D task, monocular 3D detection. On the SUN RGB-D dataset, 3DRot raises $IoU_{3D}$ from 43.21 to 44.51, cuts rotation error (ROT) from 22.91$^\circ$ to 20.93$^\circ$, and boosts $mAP_{0.5}$ from 35.70 to 38.11. For comparison, Cube R-CNN, which has a similar mechanism and test dataset, adds 3 other datasets to SUN RGB-D for monocular 3D estimation and thereby increases $IoU_{3D}$ from 36.2 to 37.8 and $mAP_{0.5}$ from 34.7 to 35.4. Because it operates purely through camera-space transforms, 3DRot is readily transferable to other 3D tasks.
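The depth-free property described in the abstract can be illustrated with the standard pure-rotation homography $H = K R K^{-1}$: when a camera rotates about its optical center, pixels move by a homography that does not depend on scene depth. The sketch below is not the authors' implementation; the function names (`rotation_y`, `rotate_sample`, `warp_points`, `project`) are hypothetical, the intrinsics `K` are kept fixed for simplicity, and mirroring is omitted.

```python
import numpy as np

def rotation_y(angle_rad):
    """Rotation matrix about the camera's y axis (illustrative choice of axis)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(K, X):
    """Project camera-frame 3D points X of shape (N, 3) to pixels of shape (N, 2)."""
    x = (K @ X.T).T
    return x[:, :2] / x[:, 2:3]

def warp_points(H, uv):
    """Apply a 3x3 homography H to pixel coordinates uv of shape (N, 2)."""
    uv1 = np.hstack([uv, np.ones((len(uv), 1))])
    out = (H @ uv1.T).T
    return out[:, :2] / out[:, 2:3]

def rotate_sample(K, R_aug, centers, obj_rotations):
    """Assumed annotation update for a virtual camera rotation R_aug.

    Returns the image-warping homography, the rotated 3D box centers (N, 3),
    and the rotated object orientation matrices (N, 3, 3).
    """
    H = K @ R_aug @ np.linalg.inv(K)        # depth-independent pixel mapping
    new_centers = (R_aug @ centers.T).T     # rotate 3D positions in the camera frame
    new_rotations = R_aug @ obj_rotations   # broadcasts over the N object poses
    return H, new_centers, new_rotations
```

Under these assumptions, warping the original projections with `H` gives the same pixels as projecting the rotated 3D annotations, which is the consistency the augmentation relies on. In practice the image itself would be resampled with the same `H` using any perspective-warp routine.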

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:2508.01423 [cs.CV]
	(or arXiv:2508.01423v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2508.01423

Submission history

From: Shitian Yang
[v1] Sat, 2 Aug 2025 16:08:16 UTC (1,027 KB)
[v2] Tue, 5 Aug 2025 11:38:20 UTC (1,031 KB)
