Large Language Model Evaluation via Matrix Nuclear-Norm

Abstract

As large language models (LLMs) continue to evolve, efficient evaluation metrics are vital for assessing their ability to compress information and reduce redundancy. Traditional metrics such as Matrix Entropy offer valuable insights, but they are computationally intensive for large-scale models because they rely on Singular Value Decomposition (SVD), which has \( O(n^3) \) time complexity. To mitigate this issue, we introduce the Matrix Nuclear-Norm, which not only serves as a metric to quantify the data compression proficiency of LLMs but also provides a convex approximation of matrix rank that captures both predictive discriminability and diversity. By employing the \( L_{1,2} \)-norm to further approximate the nuclear norm, we can effectively assess a model's information compression capabilities. This approach reduces the time complexity to \( O(n^2) \) and eliminates the need for SVD computation. Consequently, the Matrix Nuclear-Norm is 8 to 24 times faster than Matrix Entropy on the CEREBRAS-GPT models as their size increases from 111M to 6.7B parameters. This performance gap becomes more pronounced with larger models, as validated in tests with other models such as Pythia. Additionally, evaluations on benchmarks and model responses confirm that the proposed Matrix Nuclear-Norm is a reliable, scalable, and efficient tool for assessing LLM performance, striking a balance between accuracy and computational efficiency. The code is available at https://github.com/MLGroupJLU/MatrixNuclearNorm.
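To make the SVD-free idea concrete, here is a minimal sketch of an \( L_{1,2} \)-style proxy for the nuclear norm. The abstract does not give the exact formula, so this sketch assumes one common form: take the L2 norm of each column and sum the largest ones. The function name `l12_nuclear_norm_proxy` and the `top_k` parameter are hypothetical and are not taken from the authors' repository.

```python
import math


def l12_nuclear_norm_proxy(matrix, top_k=None):
    """Sum the largest column-wise L2 norms of a matrix given as a list of rows.

    Assumption: this is an illustrative L_{1,2}-style proxy, not the
    authors' exact definition. No SVD is computed.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Assumed default: keep as many columns as the matrix could have
    # nonzero singular values, i.e. min(rows, cols).
    if top_k is None:
        top_k = min(n_rows, n_cols)

    # One pass over all entries: O(rows * cols), i.e. O(n^2) when square.
    col_norms = [
        math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_rows)))
        for j in range(n_cols)
    ]

    # Keep only the top_k largest column norms and add them up.
    col_norms.sort(reverse=True)
    return sum(col_norms[:top_k])


# For a diagonal matrix the columns are orthogonal, so the proxy
# coincides with the true nuclear norm (3 + 4).
diag = [[3.0, 0.0], [0.0, 4.0]]
print(l12_nuclear_norm_proxy(diag))  # 7.0
```

The dominant cost is the single pass over the matrix entries, which is where the quadratic complexity comes from; for general matrices the result is only an approximation of the nuclear norm.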
