Privileged Self-Access Matters for Introspection in AI

Song, Siyuan; Lederman, Harvey; Hu, Jennifer; Mahowald, Kyle

arXiv:2508.14802v1 (cs)

[Submitted on 20 Aug 2025]

Abstract: Whether AI models can introspect is an increasingly important practical question. But there is no consensus on how introspection is to be defined. Beginning from a recently proposed "lightweight" definition, we argue instead for a thicker one. According to our proposal, introspection in AI is any process which yields information about internal states through a process more reliable than one with equal or lower computational cost available to a third party. Using experiments where LLMs reason about their internal temperature parameters, we show they can appear to have lightweight introspection while failing to meaningfully introspect per our proposed definition.

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2508.14802 [cs.AI]
	(or arXiv:2508.14802v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2508.14802

Submission history

From: Siyuan Song
[v1] Wed, 20 Aug 2025 15:52:34 UTC (144 KB)
