THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures

Kanani, Alish; Pfromm, Lukas; Sharma, Harsh; Doppa, Janardhan Rao; Pande, Partha Pratim; Ogras, Umit Y.

arXiv:2508.10691v1 (cs)

[Submitted on 14 Aug 2025]

Abstract: Chiplet-based integration enables large-scale systems that combine diverse technologies, offering higher yield, lower cost, and scalability, which makes them well suited to AI workloads. Processing-in-Memory (PIM) has emerged as a promising solution for AI inference, leveraging technologies such as ReRAM, SRAM, and FeFET, each with its own advantages and trade-offs. A heterogeneous chiplet-based PIM architecture can harness the complementary strengths of these technologies to achieve higher performance and energy efficiency. However, scheduling AI workloads across such a heterogeneous system is challenging because of competing performance objectives, dynamic workload characteristics, and power and thermal constraints. To address this need, we propose THERMOS, a thermally-aware, multi-objective scheduling framework for AI workloads on heterogeneous multi-chiplet PIM architectures. THERMOS trains a single multi-objective reinforcement learning (MORL) policy that can achieve Pareto-optimal execution time, energy, or a balanced objective at runtime, depending on the target preferences. Comprehensive evaluations show that THERMOS achieves up to 89% faster average execution time and 57% lower average energy consumption than baseline AI workload scheduling algorithms, with only 0.14% runtime and 0.022% energy overhead.

Comments:	Paper accepted at ESWEEK 2025 (CODES+ISSS)
Subjects:	Hardware Architecture (cs.AR)
Cite as:	arXiv:2508.10691 [cs.AR]
	(or arXiv:2508.10691v1 [cs.AR] for this version)
	https://doi.org/10.48550/arXiv.2508.10691

Submission history

From: Alish Kanani
[v1] Thu, 14 Aug 2025 14:35:54 UTC (6,783 KB)
