SlimLM: An Efficient Small Language Model for On-Device Document Assistance

November 15, 2024
Authors: Thang M. Pham, Phat T. Nguyen, Seunghyun Yoon, Viet Dac Lai, Franck Dernoncourt, Trung Bui
cs.AI

Abstract

While small language models (SLMs) show promise for mobile deployment, their real-world performance and applications on smartphones remain underexplored. We present SlimLM, a series of SLMs optimized for document assistance tasks on mobile devices. Through extensive experiments on a Samsung Galaxy S24, we identify the optimal trade-offs between model size (ranging from 125M to 7B parameters), context length, and inference time for efficient on-device processing. SlimLM is pre-trained on SlimPajama-627B and fine-tuned on DocAssist, our constructed dataset for summarization, question answering, and suggestion tasks. Our smallest model demonstrates efficient performance on the S24, while larger variants offer enhanced capabilities within mobile constraints. We evaluate SlimLM against existing SLMs, showing comparable or superior performance and offering a benchmark for future research in on-device language models. We also provide an Android application, offering practical insights into SLM deployment. Our findings illuminate the capabilities of running advanced language models on high-end smartphones, potentially reducing server costs and enhancing privacy through on-device processing.
