Neural Metamorphosis
October 10, 2024
Authors: Xingyi Yang, Xinchao Wang
cs.AI
Abstract
This paper introduces a new learning paradigm termed Neural Metamorphosis
(NeuMeta), which aims to build self-morphable neural networks. Rather than
crafting separate models for different architectures or sizes, NeuMeta directly
learns the continuous weight manifold of neural networks. Once trained, we can
sample weights for any-sized network directly from the manifold, even for
previously unseen configurations, without retraining. To achieve this ambitious
goal, NeuMeta trains neural implicit functions as hypernetworks. They accept
coordinates within the model space as input, and generate corresponding weight
values on the manifold. In other words, the implicit function is learned such
that the predicted weights perform well across various model sizes. In
training these models, we observe that the final performance closely relates
to the smoothness of the learned manifold. To enhance this smoothness,
we employ two strategies. First, we permute weight matrices to achieve
intra-model smoothness by solving the Shortest Hamiltonian Path problem.
Second, we add noise to the input coordinates when training the implicit
function, ensuring that models of various sizes show consistent outputs. As such,
NeuMeta shows promising results in synthesizing parameters for various network
configurations. Our extensive tests in image classification, semantic
segmentation, and image generation reveal that NeuMeta sustains full-size
performance even at a 75% compression rate.
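The abstract describes the core mechanism as a neural implicit function that maps coordinates in model space to weight values, with noise added to the coordinates during training so that differently sized networks yield consistent weights. Below is a minimal sketch of that idea, not the authors' implementation; the names `WeightINR` and `sample_weight_matrix`, the coordinate parameterization, and the network architecture are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class WeightINR(nn.Module):
    """Hypothetical implicit function: maps a normalized coordinate
    (layer_id, row, col) in [0, 1]^3 to a scalar weight value."""
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.net(coords).squeeze(-1)

def sample_weight_matrix(inr: WeightINR, layer_id: float,
                         rows: int, cols: int,
                         noise_std: float = 0.0) -> torch.Tensor:
    """Query the implicit function on a grid to materialize a rows x cols
    weight matrix. Because coordinates are normalized, the same function can
    be queried at any resolution, i.e., for any layer width. Adding small
    coordinate noise during training is one way to encourage nearby sizes to
    produce consistent weights, as the abstract's smoothness strategy suggests."""
    r = torch.linspace(0, 1, rows)
    c = torch.linspace(0, 1, cols)
    grid_r, grid_c = torch.meshgrid(r, c, indexing="ij")
    coords = torch.stack(
        [torch.full_like(grid_r, layer_id), grid_r, grid_c], dim=-1
    ).reshape(-1, 3)
    if noise_std > 0:
        coords = coords + noise_std * torch.randn_like(coords)
    return inr(coords).reshape(rows, cols)

# Example: materialize weights for two different widths from the same function.
inr = WeightINR()
w_full = sample_weight_matrix(inr, layer_id=0.0, rows=512, cols=256)
w_small = sample_weight_matrix(inr, layer_id=0.0, rows=128, cols=64)
```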
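The abstract also mentions permuting weight matrices, via a Shortest Hamiltonian Path formulation, to make the weight manifold smooth within a model. The toy sketch below only illustrates the intuition with a greedy nearest-neighbor heuristic that orders rows so adjacent rows are similar; the paper itself states only that the problem is a Shortest Hamiltonian Path, so the solver and the function name `greedy_row_order` are placeholders, not the authors' method.

```python
import torch

def greedy_row_order(weight: torch.Tensor) -> torch.Tensor:
    """Return a row permutation in which consecutive rows have small L2
    distance (a greedy approximation to a shortest Hamiltonian path
    through the rows of the weight matrix)."""
    n = weight.shape[0]
    dist = torch.cdist(weight, weight)      # pairwise row distances
    visited = torch.zeros(n, dtype=torch.bool)
    order = [0]
    visited[0] = True
    for _ in range(n - 1):
        last = order[-1]
        d = dist[last].clone()
        d[visited] = float("inf")           # exclude rows already placed
        nxt = int(torch.argmin(d))
        order.append(nxt)
        visited[nxt] = True
    return torch.tensor(order)

w = torch.randn(64, 32)
perm = greedy_row_order(w)
w_smooth = w[perm]                           # rows reordered for smoothness
```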