A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
April 22, 2025
Authors: Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Junyuan Mao, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Yi Ding, Donghai Hong, Jiaming Ji, Xinfeng Li, Yifan Jiang, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Yanwei Yue, Wenke Huang, Guancheng Wan, Tianlin Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Tianwei Zhang, Xingjun Ma, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Yuval Elovici, Bhavya Kailkhura, Bo Li, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, Xiaofeng Wang, Shuicheng Yan, Dacheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu
cs.AI
Abstract
The remarkable success of Large Language Models (LLMs) has illuminated a
promising pathway toward achieving Artificial General Intelligence for both
academic and industrial communities, owing to their unprecedented performance
across various applications. As LLMs continue to gain prominence in both
research and commercial domains, their security and safety implications have
become a growing concern, not only for researchers and corporations but also
for every nation. Existing surveys on LLM safety focus primarily on specific
stages of the LLM lifecycle, e.g., the deployment phase or the fine-tuning
phase, and lack a comprehensive understanding of the entire "lifechain" of LLMs.
To address this gap, this paper introduces, for the first time, the concept of
"full-stack" safety to systematically consider safety issues throughout the
entire process of LLM training, deployment, and eventual commercialization.
Compared to existing LLM safety surveys, our work offers several
distinctive advantages: (I) Comprehensive Perspective. We define the complete
LLM lifecycle as encompassing data preparation, pre-training, post-training,
deployment and final commercialization. To our knowledge, this represents the
first safety survey to encompass the entire lifecycle of LLMs. (II) Extensive
Literature Support. Our research is grounded in an exhaustive review of more
than 800 papers, ensuring comprehensive coverage and systematic organization of
security issues within a more holistic understanding. (III) Unique Insights.
Through systematic literature analysis, we have developed reliable roadmaps and
perspectives for each chapter. Our work identifies promising research
directions, including safety in data generation, alignment techniques, model
editing, and LLM-based agent systems. These insights provide valuable guidance
for researchers pursuing future work in this field.