Large Scale Retrieval for the LinkedIn Feed using Causal Language Models

Ramanujam, Sudarshan Srinivasa; Alonso, Antonio; Kataria, Saurabh; Dangi, Siddharth; Gupta, Akhilesh; Tiwana, Birjodh Singh; Somaiya, Manas; Simon, Luke; Byrne, David; Ha, Sojeong; Zhou, Sen; Akterskii, Andrei; Liu, Zhanglong; Sriram, Samira; Xiong, Crescent; Pei, Zhoutao; Shao, Angela; Li, Alex; Xiao, Annie; Kolb, Caitlin; Kistler, Thomas; Moore, Zach; Firooz, Hamed

arXiv:2510.14223 (cs)

[Submitted on 16 Oct 2025]

Abstract: In large-scale recommendation systems like the LinkedIn Feed, the retrieval stage is critical for narrowing hundreds of millions of potential candidates to a manageable subset for ranking. LinkedIn's Feed serves suggested content from outside the member's network, based on the member's topical interests. For this, 2000 candidates are retrieved from a pool of hundreds of millions of candidates, within a latency budget of a few milliseconds and at an inbound load of several thousand queries per second. This paper presents a novel retrieval approach that fine-tunes a large causal language model (Meta's LLaMA 3) as a dual encoder to generate high-quality embeddings for both users (members) and content (items), using only textual input. We describe the end-to-end pipeline, including prompt design for embedding generation, techniques for fine-tuning at LinkedIn's scale, and infrastructure for low-latency, cost-effective online serving. We share our findings on how quantizing numerical features in the prompt enables the information to be properly encoded in the embedding, facilitating greater alignment between the retrieval and ranking layers. The system was evaluated using offline metrics and an online A/B test, which showed substantial improvements in member engagement. We observed significant gains among newer members, who often lack strong network connections, indicating that high-quality suggested content aids retention. This work demonstrates how generative language models can be effectively adapted for real-time, high-throughput retrieval in industrial applications.
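The two mechanisms the abstract names, writing a quantized numeric feature into a text prompt and retrieving the top candidates by embedding similarity, can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and not taken from the paper: the bucket edges, the prompt wording, the function names, and the toy two-dimensional vectors that stand in for embeddings a fine-tuned model would produce. The exhaustive scan is also only a stand-in; a pool of hundreds of millions of items would presumably need an approximate nearest-neighbor index.

```python
import bisect
import heapq
import math

# Hypothetical bucket edges for a count-style feature (assumed, not from the paper).
BUCKET_EDGES = [1, 10, 100, 1000, 10000]


def quantize(value, edges=BUCKET_EDGES):
    """Map a raw number to a coarse bucket index in 0..len(edges)."""
    return bisect.bisect_right(edges, value)


def item_prompt(text, views):
    """Build a text-only prompt with the numeric feature written as a bucket."""
    bucket = quantize(views)
    return f"Post: {text}\nPopularity bucket: {bucket} of {len(BUCKET_EDGES)}"


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def retrieve(member_vec, item_vecs, k):
    """Return the ids of the k items with the highest cosine similarity."""
    query = l2_normalize(member_vec)
    scored = (
        (sum(a * b for a, b in zip(query, l2_normalize(vec))), item_id)
        for item_id, vec in item_vecs.items()
    )
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    print(item_prompt("Notes on dual encoders", views=57))
    toy_items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(retrieve([1.0, 0.0], toy_items, k=2))
```

In a dual-encoder setup the member and item embeddings are computed independently, so item embeddings can be precomputed offline and only the member side needs to be encoded at request time; the sketch reflects that by taking ready-made vectors as input.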

Comments:	9 pages, 4 figures
Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2510.14223 [cs.IR]
	(or arXiv:2510.14223v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2510.14223

Submission history

From: Sudarshan Srinivasa Ramanujam
[v1] Thu, 16 Oct 2025 02:01:33 UTC (141 KB)
