Graph Generative Pre-trained Transformer
January 2, 2025
Authors: Xiaohui Chen, Yinkai Wang, Jiaxing He, Yuanqi Du, Soha Hassoun, Xiaolin Xu, Li-Ping Liu
cs.AI
Abstract
Graph generation is a critical task in numerous domains, including molecular design and social network analysis, due to its ability to model complex relationships and structured data. While most modern graph generative models utilize adjacency matrix representations, this work revisits an alternative approach that represents graphs as sequences of node and edge sets. We advocate for this approach because it encodes graphs efficiently, and we propose a novel sequence representation. Based on this representation, we introduce the Graph Generative Pre-trained Transformer (G2PT), an auto-regressive model that learns graph structures via next-token prediction. To further exploit G2PT's capabilities as a general-purpose foundation model, we explore fine-tuning strategies for two downstream applications: goal-oriented generation and graph property prediction. We conduct extensive experiments across multiple datasets. Results indicate that G2PT achieves superior generative performance on both generic graph and molecule datasets. Furthermore, G2PT exhibits strong adaptability and versatility in downstream tasks ranging from molecular design to property prediction.
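The abstract does not spell out G2PT's exact tokenization, but the core idea it describes (flattening a graph into a sequence of node tokens followed by edge tokens, then training a decoder-only transformer with next-token prediction) can be illustrated with a short sketch. The serialization layout, special tokens, vocabulary, and model sizes below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the official G2PT code): serialize a graph as
# [BOS, node-type tokens, SEP, (src, dst, edge-type) triples, EOS],
# then train a small decoder-only transformer via next-token prediction.
# Special tokens, vocabulary layout, and model sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

BOS, SEP, EOS = 0, 1, 2  # assumed special-token ids

def graph_to_sequence(node_tokens, edges):
    """Flatten a graph into one token sequence: nodes first, then edges."""
    seq = [BOS] + list(node_tokens) + [SEP]
    for src, dst, etype in edges:
        seq += [src, dst, etype]  # each edge becomes three tokens
    return torch.tensor(seq + [EOS])

class TinyCausalTransformer(nn.Module):
    """Decoder-only transformer: predicts token t+1 from tokens <= t."""

    def __init__(self, vocab_size, d_model=64, n_heads=4, n_layers=2, max_len=256):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, x):                      # x: (batch, T) token ids
        T = x.size(1)
        h = self.tok(x) + self.pos(torch.arange(T, device=x.device))
        causal = torch.triu(                   # forbid attention to future tokens
            torch.full((T, T), float("-inf"), device=x.device), diagonal=1
        )
        return self.head(self.blocks(h, mask=causal))

# Toy usage: a triangle with three type-5 nodes; node-index tokens (10-12)
# and the edge-type token (7) share one assumed vocabulary with type tokens.
seq = graph_to_sequence([5, 5, 5], [(10, 11, 7), (11, 12, 7), (10, 12, 7)])
model = TinyCausalTransformer(vocab_size=32)
logits = model(seq[:-1].unsqueeze(0))          # inputs: all tokens but the last
loss = F.cross_entropy(logits.reshape(-1, 32), seq[1:].reshape(-1))
loss.backward()                                # standard next-token objective
```

Under this framing, the fine-tuning applications the abstract mentions would reuse the same sequence interface: goal-oriented generation steers sampling from the trained model, while property prediction reads out a representation of the full sequence, though the paper's specific fine-tuning strategies are not detailed in the abstract.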