Lifelong Sequential Knowledge Editing without Model Degradation

Abstract

Prior work on parameter-modifying knowledge editing has shown that large-scale sequential editing leads to significant model degradation. In this paper, we study the reasons behind this degradation and scale sequential knowledge editing to 10,000 edits while maintaining the downstream performance of the original model. We first show that locate-then-edit knowledge editing methods overfit on the edited facts. We also show that continuous knowledge editing with these methods leads to disproportionate growth in the norm of the edited matrix. We then provide a crucial insight into the inner workings of locate-then-edit methods: norm growth is a hidden trick employed by these methods that gives greater importance to the output activations produced by the edited layers. With this "importance hacking", the edited layers make a much larger contribution to the model's output. To mitigate these issues, we present ENCORE (Early stopping and Norm-Constrained Robust knowledge Editing). ENCORE controls overfitting and the disproportionate norm growth, enabling long-term sequential editing of up to 10,000 edits without loss of downstream performance. ENCORE is also 61% faster than MEMIT and 64% faster than AlphaEdit on Llama3-8B.
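The norm-growth effect described in the abstract can be illustrated with a toy experiment. The sketch below is not ENCORE, MEMIT, or AlphaEdit. It assumes a single linear layer, random key and value vectors, and a plain ridge-penalized least-squares edit, where the hypothetical parameter `lam` weights a Frobenius-norm penalty on each update. The function names, dimensions, and target scale are all illustrative choices, not taken from the paper.

```python
import numpy as np


def ridge_edit(W, K, V, lam):
    """Apply one toy edit to W.

    Solves min_dW ||(W + dW) K - V||_F^2 + lam * ||dW||_F^2, whose
    closed form is dW = R K^T (K K^T + lam I)^{-1} with R = V - W K.
    This is a generic ridge update, used here only as a stand-in.
    """
    R = V - W @ K                          # residual the edit must explain
    A = K @ K.T + lam * np.eye(K.shape[0])  # symmetric positive definite
    dW = np.linalg.solve(A, K @ R.T).T      # equals R K^T A^{-1} since A is symmetric
    return W + dW


def norm_growth_after_edits(lam, n_edits=200, d_in=32, d_out=32, seed=0):
    """Return ||W_final||_F / ||W_initial||_F after n_edits sequential edits."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)
    initial_norm = np.linalg.norm(W)        # Frobenius norm for 2-D input
    for _ in range(n_edits):
        K = rng.normal(size=(d_in, 1))         # one random "key" per edit
        V = 3.0 * rng.normal(size=(d_out, 1))  # random "value" target (assumed scale)
        W = ridge_edit(W, K, V, lam)
    return np.linalg.norm(W) / initial_norm


if __name__ == "__main__":
    # A tiny penalty approximates an unconstrained exact-fit edit.
    print("nearly unconstrained:", norm_growth_after_edits(lam=1e-6))
    print("norm-penalized:      ", norm_growth_after_edits(lam=100.0))
```

In this toy setting, the nearly unconstrained run should end with a noticeably larger norm ratio than the penalized run. Because each edit fits its target less exactly as `lam` grows, the penalty trades edit accuracy for a better-behaved weight matrix, which is the tension the abstract points to.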
