A dynamic parallel method for performance optimization on hybrid CPUs
November 29, 2024
Authors: Luo Yu, Liu Yucheng, Shen Haihao
cs.AI
Abstract
The AIPC concept is gaining popularity, and a growing number of hybrid CPUs will
run AI models on client devices. However, current AI inference frameworks
overlook the imbalanced hardware capabilities of hybrid CPUs, leading
to low inference performance. To address this issue, we introduce a
dynamic parallel method for hybrid CPUs, which significantly increases LLM
inference performance by balancing the workload for each core of a hybrid CPU
before the parallel work starts. This method has enabled Neural Speed to
achieve more than 90% (on average) of memory bandwidth on two hybrid Intel
CPUs.
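
As a rough illustration of the idea of balancing the workload for each core before the parallel work starts, the sketch below splits a range of work items into contiguous slices proportional to a per-core performance ratio, so that faster cores receive larger slices. The function name `split_by_ratio`, the ratio values, and the fast-core/slow-core mix are hypothetical; they are not taken from the paper or from Neural Speed, and how the ratios would be measured or updated is outside this sketch.

```python
def split_by_ratio(total_items, ratios):
    """Split range(total_items) into contiguous slices proportional to ratios.

    ratios: one positive number per core (hypothetical relative throughput).
    Returns a list of (start, end) half-open intervals, one per core.
    """
    if total_items < 0 or not ratios or any(r <= 0 for r in ratios):
        raise ValueError("need total_items >= 0 and positive ratios")
    total_ratio = sum(ratios)
    bounds = []
    start = 0
    cumulative = 0.0
    for i, r in enumerate(ratios):
        cumulative += r
        if i == len(ratios) - 1:
            # The last core takes the remainder so every item is assigned.
            end = total_items
        else:
            end = round(total_items * cumulative / total_ratio)
        end = max(end, start)  # keep slices non-overlapping and ordered
        bounds.append((start, end))
        start = end
    return bounds


# Hypothetical example: two fast cores assumed to be 2x as fast as four slow cores.
example = split_by_ratio(1024, [2.0, 2.0, 1.0, 1.0, 1.0, 1.0])
```

With these assumed ratios, each fast core receives 256 of the 1024 items and each slow core receives 128, so all cores would finish at roughly the same time if the ratios were accurate.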