大语言模型(LLM)全栈安全综合调研:数据、训练与部署
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
April 22, 2025
作者: Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Junyuan Mao, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Yi Ding, Donghai Hong, Jiaming Ji, Xinfeng Li, Yifan Jiang, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Yanwei Yue, Wenke Huang, Guancheng Wan, Tianlin Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Tianwei Zhang, Xingjun Ma, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Yuval Elovici, Bhavya Kailkhura, Bo Li, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, Xiaofeng Wang, Shuicheng Yan, Dacheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu
cs.AI
摘要
大型语言模型(LLMs)的显著成功为学术界和工业界实现通用人工智能开辟了一条充满希望的道路,这得益于其在各类应用中的卓越表现。随着LLMs在研究和商业领域日益突出,其安全性和可靠性问题已成为研究人员、企业乃至各国日益关注的焦点。目前,关于LLM安全的现有综述主要聚焦于模型生命周期的特定阶段,如部署阶段或微调阶段,缺乏对LLM整个“生命链”的全面理解。为填补这一空白,本文首次引入“全栈安全”概念,系统性地考量LLM从训练、部署到最终商业化的全过程安全问题。与现有的LLM安全综述相比,我们的工作展现出几大独特优势:(一)全面视角。我们将完整的LLM生命周期定义为涵盖数据准备、预训练、后训练、部署及最终商业化。据我们所知,这是首个覆盖LLM全生命周期的安全综述。(二)广泛的文献支持。我们的研究基于对800多篇论文的详尽回顾,确保了在更全面理解框架下对安全问题的系统梳理与覆盖。(三)独到见解。通过系统的文献分析,我们为每一章节构建了可靠的路线图与视角。我们的工作识别了包括数据生成安全、对齐技术、模型编辑及基于LLM的代理系统安全在内的多个有前景的研究方向,为未来该领域的研究者提供了宝贵的指导。
English
The remarkable success of Large Language Models (LLMs) has illuminated a
promising pathway toward achieving Artificial General Intelligence for both
academic and industrial communities, owing to their unprecedented performance
across various applications. As LLMs continue to gain prominence in both
research and commercial domains, their security and safety implications have
become a growing concern, not only for researchers and corporations but also
for every nation. Currently, existing surveys on LLM safety primarily focus on
specific stages of the LLM lifecycle, e.g., deployment phase or fine-tuning
phase, lacking a comprehensive understanding of the entire "lifechain" of LLMs.
To address this gap, this paper introduces, for the first time, the concept of
"full-stack" safety to systematically consider safety issues throughout the
entire process of LLM training, deployment, and eventual commercialization.
Compared to the off-the-shelf LLM safety surveys, our work demonstrates several
distinctive advantages: (I) Comprehensive Perspective. We define the complete
LLM lifecycle as encompassing data preparation, pre-training, post-training,
deployment and final commercialization. To our knowledge, this represents the
first safety survey to encompass the entire lifecycle of LLMs. (II) Extensive
Literature Support. Our research is grounded in an exhaustive review of over
800+ papers, ensuring comprehensive coverage and systematic organization of
security issues within a more holistic understanding. (III) Unique Insights.
Through systematic literature analysis, we have developed reliable roadmaps and
perspectives for each chapter. Our work identifies promising research
directions, including safety in data generation, alignment techniques, model
editing, and LLM-based agent systems. These insights provide valuable guidance
for researchers pursuing future work in this field.Summary
AI-Generated Summary