The Ethics Engine: A Modular Pipeline for Accessible Psychometric Assessment of Large Language Models

Van Clief, Jake; Kyritsopoulos, Constantine

arXiv:2510.11742 (cs)

[Submitted on 11 Oct 2025]

Authors: Jake Van Clief, Constantine Kyritsopoulos

Abstract: As Large Language Models (LLMs) increasingly mediate human communication and decision-making, understanding their value expression becomes critical for research across disciplines. This work presents the Ethics Engine, a modular Python pipeline that transforms psychometric assessment of LLMs from a technically complex endeavor into an accessible research tool. The pipeline demonstrates how thoughtful infrastructure design can expand participation in AI research, enabling investigators across cognitive science, political psychology, education, and other fields to study value expression in language models. Recent adoption by University of Edinburgh researchers studying authoritarianism, with over 10,000 AI responses processed across multiple models and contexts, validates its research utility. We argue that such tools fundamentally change the landscape of AI research by lowering technical barriers while maintaining scientific rigor. As LLMs increasingly serve as cognitive infrastructure, their embedded values shape millions of daily interactions. Without systematic measurement of these value expressions, we deploy systems whose moral influence remains uncharted. The Ethics Engine enables the rigorous assessment necessary for informed governance of these influential technologies.

Comments:	18 pages, 2 figures. Code available at https://github.com/RinDig/GPTmetrics
Subjects:	Computers and Society (cs.CY)
MSC classes:	68T50, 91E99
ACM classes:	K.4.1; I.2.7; J.4
Cite as:	arXiv:2510.11742 [cs.CY]
	(or arXiv:2510.11742v1 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2510.11742

Submission history

From: Jake Van Clief
[v1] Sat, 11 Oct 2025 00:09:51 UTC (524 KB)
