Xmodel-1.5: A 1B-scale Multilingual LLM

November 15, 2024
Authors: Wang Qun, Liu Yang, Lin Qingquan, Jiang Ling
cs.AI

Abstract

We introduce Xmodel-1.5, a novel 1-billion-parameter multilingual large language model pretrained on approximately 2 trillion tokens. The model performs strongly across several languages, with particularly notable results in Thai, Arabic, and French, while remaining effective in Chinese and English. In addition, we contribute to the research community by releasing a Thai evaluation dataset comprising hundreds of questions annotated by students from Chulalongkorn University's School of Integrated Innovation. While the results are promising, we acknowledge that there is still room for improvement. We hope this work advances ongoing efforts in multilingual AI research and promotes better cross-linguistic understanding across a range of natural language processing tasks. Our models and code are publicly available on GitHub at https://github.com/XiaoduoAILab/XmodelLM.
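
Since the abstract points to a public release, here is a minimal sketch of how such a checkpoint might be loaded and prompted with Hugging Face transformers. It assumes the weights are published in a transformers-compatible format; the repository id below is taken from the GitHub URL and is an assumption, not a confirmed model id, so consult the linked repository for the actual loading instructions.

```python
# Minimal sketch: loading a 1B multilingual causal LM with transformers.
# NOTE: the model id is hypothetical -- check the project's GitHub page
# (https://github.com/XiaoduoAILab/XmodelLM) for the real weights and
# any custom loading code the authors provide.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "XiaoduoAILab/XmodelLM"  # assumed id, see note above

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,      # a 1B model fits comfortably in fp16
    trust_remote_code=True,         # in case the release ships custom model code
)

# Multilingual prompting: the abstract highlights Thai, Arabic, and French
# alongside Chinese and English.
prompt = "Bonjour, pouvez-vous vous présenter ?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```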
