A dynamic parallel method for performance optimization on hybrid CPUs

November 29, 2024
Authors: Luo Yu, Liu Yucheng, Shen Haihao
cs.AI

Abstract

The AI PC (AIPC) concept is gaining popularity, and a growing number of hybrid CPUs will run AI models on client devices. However, current AI inference frameworks overlook the imbalanced hardware capabilities of the cores in a hybrid CPU, which leads to low inference performance. To address this issue, we introduce a dynamic parallel method for hybrid CPUs that significantly improves LLM inference performance by balancing the workload assigned to each core before the parallel work starts. With this method, Neural Speed achieves more than 90% (on average) memory bandwidth utilization on two hybrid Intel CPUs.
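
The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or the Neural Speed API. It assumes that each core has a measured relative speed and that a parallel job is split into integer work units in proportion to those speeds before the job starts. The names `split_work` and `update_speed`, the smoothing factor `alpha`, and the moving-average update are all hypothetical choices made for illustration.

```python
from typing import List


def split_work(total_units: int, core_speeds: List[float]) -> List[int]:
    """Split total_units across cores in proportion to core_speeds.

    Hypothetical helper: core_speeds holds one positive relative-throughput
    value per core (for example, higher for performance cores).
    """
    if total_units < 0:
        raise ValueError("total_units must be non-negative")
    if not core_speeds or any(s <= 0 for s in core_speeds):
        raise ValueError("core_speeds must be non-empty and positive")

    total_speed = sum(core_speeds)
    # Ideal (fractional) share of work for each core.
    ideal = [total_units * s / total_speed for s in core_speeds]
    # Round down first, then hand out what is left over.
    shares = [int(x) for x in ideal]
    leftover = total_units - sum(shares)
    # Cores with the largest fractional remainder get one extra unit each.
    order = sorted(
        range(len(core_speeds)),
        key=lambda i: ideal[i] - shares[i],
        reverse=True,
    )
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def update_speed(old_speed: float, units_done: int, elapsed_s: float,
                 alpha: float = 0.5) -> float:
    """Blend the previous speed estimate with the newly observed throughput.

    Assumption: an exponential moving average is one simple way to make the
    split "dynamic"; the source does not state which update rule is used.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    observed = units_done / elapsed_s
    return (1 - alpha) * old_speed + alpha * observed


if __name__ == "__main__":
    # Two faster cores and two slower cores (illustrative ratios only).
    speeds = [2.0, 2.0, 1.0, 1.0]
    print(split_work(100, speeds))
```

In this sketch, faster cores receive proportionally more work units, so all cores would finish at roughly the same time if the speed estimates are accurate. Re-estimating the speeds after each parallel job lets the split follow changes in core behavior at run time.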
