LLaMo: Large Language Model-based Molecular Graph Assistant

Abstract

Large Language Models (LLMs) have demonstrated remarkable generalization and instruction-following capabilities through instruction tuning. Advances in LLMs and instruction tuning have led to the development of Large Vision-Language Models (LVLMs). However, the capabilities of LLMs and instruction tuning remain less explored in the molecular domain. We therefore propose LLaMo: Large Language Model-based Molecular graph assistant, an end-to-end trained large molecular graph-language model. To bridge the gap between the language and graph modalities, we present a multi-level graph projector that transforms graph representations into graph tokens by abstracting the output representations of each GNN layer, together with motif representations, using a cross-attention mechanism. We also introduce machine-generated molecular graph instruction data to instruction-tune the large molecular graph-language model for general-purpose molecule and language understanding. Our extensive experiments demonstrate that LLaMo achieves the best performance on diverse tasks, such as molecular description generation, property prediction, and IUPAC name prediction. The code of LLaMo is available at https://github.com/mlvlab/LLaMo.
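
As a rough illustration of the multi-level graph projector idea, the sketch below pools node representations from every GNN layer, together with motif representations, into a fixed number of graph tokens via cross-attention. The abstract does not specify the projector's internals, so the query vectors (standing in for learnable parameters), the single attention head, and the omission of learned query/key/value projections are all assumptions made for illustration. Every function name here is hypothetical and is not taken from the LLaMo codebase, and random matrices stand in for real GNN outputs.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention: each query attends over all keys."""
    dim = len(queries[0])
    tokens = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # The weighted sum of the value vectors gives one output token.
        token = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        tokens.append(token)
    return tokens


def multi_level_project(layer_node_feats, motif_feats, queries):
    """Pool node features from every GNN layer plus motif features into graph tokens."""
    # Assumption: all levels share one feature dimension and one attention memory.
    memory = [row for layer in layer_node_feats for row in layer] + list(motif_feats)
    return cross_attention(queries, memory, memory)


def random_matrix(rows, cols, rng):
    """Placeholder features; a real model would use encoder outputs instead."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 8
# Stand-ins for encoder outputs: 3 GNN layers over 5 atoms, plus 2 motifs.
layer_node_feats = [random_matrix(5, dim, rng) for _ in range(3)]
motif_feats = random_matrix(2, dim, rng)
# Hypothetical learnable queries; 4 queries yield 4 graph tokens.
queries = random_matrix(4, dim, rng)

graph_tokens = multi_level_project(layer_node_feats, motif_feats, queries)
print(len(graph_tokens), len(graph_tokens[0]))  # prints: 4 8
```

In this sketch the number of graph tokens is set by the number of queries rather than by the number of atoms, so molecules of different sizes map to the same token count before being passed to a language model.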

