Set Pivot Learning: Redefining Generalized Segmentation with Vision Foundation Models

Li, Xinhui; He, Xinyu; Hu, Qiming; Guo, Xiaojie

Computer Science > Computer Vision and Pattern Recognition

arXiv:2508.01582v1 (cs)

[Submitted on 3 Aug 2025]

Authors: Xinhui Li, Xinyu He, Qiming Hu, Xiaojie Guo

Abstract: In this paper, we introduce, for the first time, the concept of Set Pivot Learning, a paradigm shift that redefines domain generalization (DG) based on Vision Foundation Models (VFMs). Traditional DG assumes that the target domain is inaccessible during training, but the emergence of VFMs, which are trained on vast and diverse datasets, renders this assumption unclear and obsolete. To address this challenge, we propose Set Pivot Learning (SPL), a new definition of the domain migration task based on VFMs that is better suited to current research and application requirements. Unlike conventional DG methods, SPL prioritizes adaptive refinement over rigid domain transfer, ensuring continuous alignment with evolving real-world conditions. Specifically, SPL features two key attributes: (i) dynamic adaptation, which transitions from static domain alignment to flexible, task-driven feature optimization, enabling models to evolve with downstream scenarios; and (ii) VFM-centric tuning, which leverages pretrained knowledge as a pivot to hone task-specific representations while preserving cross-domain robustness. Building on SPL, we propose a Dynamic Prompt Fine-Tuning method, which combines a Dynamic Class-aware Prompter with a Prompt-guided Feature Focuser, to elevate VFM performance in targeted scenarios. Extensive experiments on benchmark datasets show the effectiveness of our method, highlighting its superiority over state-of-the-art methods, particularly in generalized segmentation.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2508.01582 [cs.CV]
	(or arXiv:2508.01582v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2508.01582

Submission history

From: Xinhui Li
[v1] Sun, 3 Aug 2025 04:20:35 UTC (8,169 KB)
