SHREC and PHEONA: Using Large Language Models to Advance Next-Generation Computational Phenotyping

Pungitore, Sarah; Yadav, Shashank; Douglas, Molly; Mosier, Jarrod; Subbian, Vignesh

arXiv:2506.16359v1 (q-bio)

[Submitted on 19 Jun 2025 (this version), latest version 17 Jul 2025 (v3)]

Title: SHREC and PHEONA: Using Large Language Models to Advance Next-Generation Computational Phenotyping

Authors: Sarah Pungitore, Shashank Yadav, Molly Douglas, Jarrod Mosier, Vignesh Subbian

Abstract: Objective: Computational phenotyping is a central informatics activity, and the resulting cohorts support a wide variety of applications. However, it is time-intensive because of manual data review, limited automation, and difficulties in adapting algorithms across sources. Since large language models (LLMs) have demonstrated promising capabilities for text classification, comprehension, and generation, we posit they will perform well at repetitive manual review tasks traditionally performed by human experts. To support next-generation computational phenotyping methods, we developed SHREC, a framework for comprehensive integration of LLMs into end-to-end phenotyping pipelines. Materials and Methods: We applied and tested the ability of three lightweight LLMs (Gemma2 with 27 billion parameters, Mistral Small with 24 billion parameters, and Phi-4 with 14 billion parameters) to classify concepts and phenotype patients using previously developed phenotypes for acute respiratory failure (ARF) respiratory support therapies. Results: All models performed well on concept classification, with the best model (Mistral) achieving an AUROC of 0.896 across all relevant concepts. For phenotyping, models demonstrated near-perfect specificity for all phenotypes, and the top-performing model (Mistral) reached an average AUROC of 0.853 for single-therapy phenotypes, although performance was lower on multi-therapy phenotypes. Discussion: Several advantages of LLMs support their application to computational phenotyping, such as their ability to adapt to new tasks with prompt engineering alone and their ability to incorporate raw electronic health record (EHR) data. Future steps to advance next-generation phenotyping methods include determining optimal strategies for integrating biomedical data, exploring how LLMs reason, and advancing generative model methods. Conclusion: Current lightweight LLMs can feasibly assist researchers with resource-intensive phenotyping tasks such as manual data review.

Comments:	Submitted to the Journal of the American Medical Informatics Association
Subjects:	Quantitative Methods (q-bio.QM)
Cite as:	arXiv:2506.16359 [q-bio.QM]
	(or arXiv:2506.16359v1 [q-bio.QM] for this version)
	https://doi.org/10.48550/arXiv.2506.16359

Submission history

From: Sarah Pungitore
[v1] Thu, 19 Jun 2025 14:35:23 UTC (252 KB)
[v2] Sat, 5 Jul 2025 19:01:20 UTC (204 KB)
[v3] Thu, 17 Jul 2025 00:41:59 UTC (270 KB)
