MLP-KAN: Unifying Deep Representation and Function Learning
October 3, 2024
Authors: Yunhong He, Yifeng Xie, Zhengqing Yuan, Lichao Sun
cs.AI
Abstract
Recent advancements in both representation learning and function learning
have demonstrated substantial promise across diverse domains of artificial
intelligence. However, the effective integration of these paradigms poses a
significant challenge, particularly in cases where users must manually decide
whether to apply a representation learning or function learning model based on
dataset characteristics. To address this issue, we introduce MLP-KAN, a unified
method designed to eliminate the need for manual model selection. By
integrating Multi-Layer Perceptrons (MLPs) for representation learning and
Kolmogorov-Arnold Networks (KANs) for function learning within a
Mixture-of-Experts (MoE) architecture, MLP-KAN dynamically adapts to the
specific characteristics of the task at hand, ensuring optimal performance.
Embedded within a transformer-based framework, our work achieves remarkable
results on four widely used datasets across diverse domains. Extensive
experimental evaluation demonstrates its superior versatility, delivering
competitive performance across both deep representation and function learning
tasks. These findings highlight the potential of MLP-KAN to simplify the model
selection process, offering a comprehensive, adaptable solution across various
domains. Our code and weights are available at
https://github.com/DLYuanGod/MLP-KAN.
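To make the architecture described in the abstract concrete, below is a minimal, hypothetical sketch of the core idea: MLP experts (representation learning) and KAN-style experts (function learning) combined by a learned Mixture-of-Experts router inside a transformer-style token pipeline. All class names and hyperparameters are illustrative assumptions, not the authors' implementation (see the linked repository for that); in particular, the KAN expert substitutes a small sine basis for the B-spline parameterization of real KANs, and a soft (dense) router stands in for whatever gating the paper uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLPExpert(nn.Module):
    """Standard MLP expert, intended for representation learning."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.net(x)


class KANExpert(nn.Module):
    """Simplified KAN-style expert: each output coordinate is a sum of
    learnable univariate functions of the inputs. Real KANs use B-spline
    bases; a small sine basis is substituted here for brevity."""
    def __init__(self, dim, n_basis=8):
        super().__init__()
        self.register_buffer(
            "freqs", torch.arange(1, n_basis + 1, dtype=torch.float32))
        self.coef = nn.Parameter(torch.randn(dim, dim, n_basis) * 0.1)

    def forward(self, x):
        # Basis features per input coordinate: (..., dim_in, n_basis).
        phi = torch.sin(x.unsqueeze(-1) * self.freqs)
        # Sum the learned univariate functions over the input dimensions.
        return torch.einsum("...ib,oib->...o", phi, self.coef)


class MLPKANLayer(nn.Module):
    """Soft mixture of MLP and KAN experts with a token-wise learned router."""
    def __init__(self, dim, n_mlp=2, n_kan=2, hidden=128):
        super().__init__()
        self.experts = nn.ModuleList(
            [MLPExpert(dim, hidden) for _ in range(n_mlp)]
            + [KANExpert(dim) for _ in range(n_kan)])
        self.router = nn.Linear(dim, len(self.experts))

    def forward(self, x):
        weights = F.softmax(self.router(x), dim=-1)            # (..., E)
        outs = torch.stack([e(x) for e in self.experts], -1)   # (..., dim, E)
        return (outs * weights.unsqueeze(-2)).sum(-1)


# Example: a token sequence as it might appear inside a transformer block.
layer = MLPKANLayer(dim=64)
x = torch.randn(4, 16, 64)   # (batch, tokens, dim)
print(layer(x).shape)        # torch.Size([4, 16, 64])
```

Because the router weights are input-dependent, each token can lean on MLP experts or KAN experts as its statistics demand, which is the mechanism by which MLP-KAN avoids a manual choice between representation and function learning models.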