RecUserSim: A Realistic and Diverse User Simulator for Evaluating Conversational Recommender Systems

Chen, Luyu; Dai, Quanyu; Zhang, Zeyu; Feng, Xueyang; Zhang, Mingyu; Tang, Pengcheng; Chen, Xu; Zhu, Yue; Dong, Zhenhua

doi:10.1145/3701716.3715258


arXiv:2507.22897 (cs)

[Submitted on 25 June 2025]

Title: RecUserSim: A Realistic and Diverse User Simulator for Evaluating Conversational Recommender Systems

Authors: Luyu Chen, Quanyu Dai, Zeyu Zhang, Xueyang Feng, Mingyu Zhang, Pengcheng Tang, Xu Chen, Yue Zhu, Zhenhua Dong

Abstract: Conversational recommender systems (CRS) enhance user experience through multi-turn interactions, yet evaluating CRS remains challenging. User simulators can provide comprehensive evaluations through interactions with CRS, but building realistic and diverse simulators is difficult. While recent work leverages large language models (LLMs) to simulate user interactions, it still falls short in emulating individual real users across diverse scenarios and lacks explicit rating mechanisms for quantitative evaluation. To address these gaps, we propose RecUserSim, an LLM agent-based user simulator with enhanced simulation realism and diversity while providing explicit scores. RecUserSim features several key modules: a profile module for defining realistic and diverse user personas, a memory module for tracking interaction history and discovering unknown preferences, and a core action module inspired by Bounded Rationality theory that enables nuanced decision-making while generating more fine-grained actions and personalized responses. To further enhance output control, a refinement module is designed to fine-tune final responses. Experiments demonstrate that RecUserSim generates diverse, controllable outputs and produces realistic, high-quality dialogues, even with smaller base LLMs. The ratings generated by RecUserSim show high consistency across different base LLMs, highlighting its effectiveness for CRS evaluation.

Comments:	Accepted by TheWebConf'25 Industry Track
Subjects:	Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2507.22897 [cs.HC]
	(or arXiv:2507.22897v1 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2507.22897
Related DOI:	https://doi.org/10.1145/3701716.3715258

Submission history

From: Luyu Chen
[v1] Wed, 25 Jun 2025 08:42:46 UTC (4,301 KB)
