Cohort-Aware Agents for Individualized Lung Cancer Risk Prediction Using a Retrieval-Augmented Model Selection Framework

Qu, Chongyu; Luna, Allen J.; Li, Thomas Z.; Zhu, Junchao; Guo, Junlin; Xiong, Juming; Sandler, Kim L.; Landman, Bennett A.; Huo, Yuankai

Computer Science > Machine Learning

arXiv:2508.14940 (cs)

[Submitted on 20 Aug 2025 (v1), last revised 26 Aug 2025 (this version, v2)]

Authors: Chongyu Qu, Allen J. Luna, Thomas Z. Li, Junchao Zhu, Junlin Guo, Juming Xiong, Kim L. Sandler, Bennett A. Landman, Yuankai Huo

Abstract: Accurate lung cancer risk prediction remains challenging because patient populations and clinical settings vary substantially: no single model performs best for all cohorts. To address this, we propose a personalized lung cancer risk prediction agent that dynamically selects the most appropriate model for each patient by combining cohort-specific knowledge with modern retrieval and reasoning techniques. Given a patient's CT scan and structured metadata (demographic, clinical, and nodule-level features), the agent first performs cohort retrieval, using FAISS-based similarity search across nine diverse real-world cohorts to identify the most relevant patient population in a multi-institutional database. Second, a Large Language Model (LLM) is prompted with the retrieved cohort and its associated performance metrics to recommend the optimal prediction algorithm from a pool of eight representative models, including classical linear risk models (e.g., Mayo, Brock), temporally aware models (e.g., TD-VIT, DLSTM), and multi-modal computer vision-based approaches (e.g., Liao, Sybil, DLS, DLI). This two-stage agent pipeline, retrieval via FAISS followed by reasoning via an LLM, enables dynamic, cohort-aware risk prediction personalized to each patient's profile. Building on this architecture, the agent supports flexible, cohort-driven model selection across diverse clinical populations, offering a practical path toward individualized risk assessment in real-world lung cancer screening.
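The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a brute-force L2 nearest-centroid search stands in for the FAISS similarity search, a deterministic stub stands in for the LLM call, and the cohort names, feature vectors, and AUC values are hypothetical placeholders rather than data or results from the paper. Only the model names (Mayo, Brock, Sybil) come from the abstract.

```python
import math

# Hypothetical cohort centroids over three scaled features
# (age, pack-years, nodule size). Names and values are illustrative only.
COHORT_CENTROIDS = {
    "cohort_A": [0.62, 0.70, 0.20],
    "cohort_B": [0.55, 0.10, 0.45],
    "cohort_C": [0.80, 0.40, 0.75],
}

# Placeholder per-cohort AUCs. These are NOT results reported in the paper.
COHORT_METRICS = {
    "cohort_A": {"Mayo": 0.70, "Brock": 0.75, "Sybil": 0.85},
    "cohort_B": {"Mayo": 0.80, "Brock": 0.78, "Sybil": 0.72},
    "cohort_C": {"Mayo": 0.65, "Brock": 0.68, "Sybil": 0.66},
}


def retrieve_cohort(patient, centroids):
    """Stage 1 (retrieval): return the cohort with the nearest centroid.

    Brute-force L2 search, used here in place of a FAISS index.
    """
    return min(centroids, key=lambda name: math.dist(patient, centroids[name]))


def build_prompt(patient, cohort, metrics):
    """Assemble the text a real LLM would receive in stage 2."""
    metric_lines = "\n".join(
        f"- {model}: AUC={auc:.2f}" for model, auc in sorted(metrics.items())
    )
    return (
        f"Patient features: {patient}\n"
        f"Most similar cohort: {cohort}\n"
        f"Model performance on this cohort:\n{metric_lines}\n"
        "Recommend the single most suitable risk model."
    )


def stub_llm(prompt, metrics):
    """Stage 2 (reasoning) stand-in: ignores the prompt text and simply
    picks the model with the highest placeholder AUC."""
    return max(metrics, key=metrics.get)


def select_model(patient):
    """Run retrieval, then the (stubbed) reasoning step."""
    cohort = retrieve_cohort(patient, COHORT_CENTROIDS)
    metrics = COHORT_METRICS[cohort]
    prompt = build_prompt(patient, cohort, metrics)
    return cohort, stub_llm(prompt, metrics), prompt


if __name__ == "__main__":
    cohort, model, prompt = select_model([0.60, 0.65, 0.25])
    print(prompt)
    print("Selected:", cohort, model)
```

In a fuller version, `retrieve_cohort` would query a FAISS index built over patient embeddings, and `stub_llm` would be replaced by an actual LLM request whose free-text answer is parsed into a model name.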

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2508.14940 [cs.LG]
	(or arXiv:2508.14940v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2508.14940

Submission history

From: Chongyu Qu
[v1] Wed, 20 Aug 2025 02:59:39 UTC (1,289 KB)
[v2] Tue, 26 Aug 2025 17:59:53 UTC (1,289 KB)
