SlimLM: An Efficient Small Language Model for On-Device Document Assistance

November 15, 2024
Authors: Thang M. Pham, Phat T. Nguyen, Seunghyun Yoon, Viet Dac Lai, Franck Dernoncourt, Trung Bui
cs.AI

Abstract

While small language models (SLMs) show promise for mobile deployment, their real-world performance and applications on smartphones remain underexplored. We present SlimLM, a series of SLMs optimized for document assistance tasks on mobile devices. Through extensive experiments on a Samsung Galaxy S24, we identify the optimal trade-offs between model size (ranging from 125M to 7B parameters), context length, and inference time for efficient on-device processing. SlimLM is pre-trained on SlimPajama-627B and fine-tuned on DocAssist, a dataset we constructed for summarization, question answering, and suggestion tasks. Our smallest model runs efficiently on the S24, while larger variants offer enhanced capabilities within mobile constraints. We evaluate SlimLM against existing SLMs, showing comparable or superior performance and offering a benchmark for future research on on-device language models. We also provide an Android application that offers practical insights into SLM deployment. Our findings illuminate the feasibility of running advanced language models on high-end smartphones, potentially reducing server costs and enhancing privacy through on-device processing.
