Graph Generative Pre-trained Transformer
January 2, 2025
Authors: Xiaohui Chen, Yinkai Wang, Jiaxing He, Yuanqi Du, Soha Hassoun, Xiaolin Xu, Li-Ping Liu
cs.AI
Abstract
Graph generation is a critical task in numerous domains, including molecular
design and social network analysis, due to its ability to model complex
relationships and structured data. While most modern graph generative models
utilize adjacency matrix representations, this work revisits an alternative
approach that represents graphs as sequences of node sets and edge sets. We
advocate for this approach due to its efficient encoding of graphs and propose
a novel representation. Based on this representation, we introduce the Graph
Generative Pre-trained Transformer (G2PT), an auto-regressive model that learns
graph structures via next-token prediction. To further exploit G2PT's
capabilities as a general-purpose foundation model, we explore fine-tuning
strategies for two downstream applications: goal-oriented generation and graph
property prediction. We conduct extensive experiments across multiple datasets.
Results indicate that G2PT achieves superior generative performance on both
generic graph and molecule datasets. Furthermore, G2PT exhibits strong
adaptability and versatility in downstream tasks from molecular design to
property prediction.
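The abstract describes two ingredients: a sequence representation that serializes a graph as its node set followed by its edge set, and an auto-regressive Transformer trained on those sequences with next-token prediction. The sketch below illustrates one plausible serialization on a toy graph; the token vocabulary, ordering, and the helper name graph_to_sequence are illustrative assumptions, not the paper's actual tokenization scheme.

```python
# Minimal sketch (not the authors' code) of a node-set/edge-set sequence
# representation as described in the abstract. All token names here are
# illustrative assumptions.

from typing import List, Tuple

def graph_to_sequence(nodes: List[int], edges: List[Tuple[int, int]]) -> List[str]:
    """Serialize a graph into a flat token sequence: nodes first, then edges."""
    tokens = ["<bos>"]
    for v in nodes:                 # the node set
        tokens.append(f"n{v}")
    tokens.append("<sep>")          # boundary between node set and edge set
    for u, v in edges:              # each edge as a pair of endpoint tokens
        tokens += [f"n{u}", f"n{v}"]
    tokens.append("<eos>")
    return tokens

# A 4-cycle: 4 nodes and 4 edges.
seq = graph_to_sequence([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)])
print(seq)
# ['<bos>', 'n0', 'n1', 'n2', 'n3', '<sep>', 'n0', 'n1', 'n1', 'n2',
#  'n2', 'n3', 'n3', 'n0', '<eos>']

# Next-token prediction pairs, exactly as in language modeling: at each
# position the model predicts token t+1 from tokens 0..t.
inputs, targets = seq[:-1], seq[1:]
```

Because the edge list only enumerates edges that exist, this encoding scales with the number of edges rather than the quadratic size of an adjacency matrix, which is the efficiency argument the abstract makes for sparse graphs.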