LibMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models

Abstract

Mixture of Experts (MoE) models play an important role in the development of more efficient and effective large language models (LLMs). Due to the enormous resource requirements, studying large-scale MoE algorithms remains inaccessible to many researchers. This work develops LibMoE, a comprehensive and modular framework to streamline the research, training, and evaluation of MoE algorithms. Built upon three core principles, (i) modular design, (ii) efficient training, and (iii) comprehensive evaluation, LibMoE makes MoE in LLMs more accessible to a wide range of researchers by standardizing the training and evaluation pipelines. Using LibMoE, we extensively benchmarked five state-of-the-art MoE algorithms over three different LLMs and 11 datasets under the zero-shot setting. The results show that, despite their unique characteristics, all MoE algorithms perform roughly similarly when averaged across a wide range of tasks. With its modular design and extensive evaluation, we believe LibMoE will be invaluable for researchers to make meaningful progress towards the next generation of MoE and LLMs. Project page: https://fsoft-aic.github.io/fsoft-LibMoE.github.io.
