MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra

February 22, 2025
Authors: Liang Wang, Shaozhen Liu, Yu Rong, Deli Zhao, Qiang Liu, Shu Wu, Liang Wang
cs.AI

Abstract

Establishing the relationship between 3D structures and the energy states of molecular systems has proven to be a promising approach for learning 3D molecular representations. However, existing methods are limited to modeling the molecular energy states from classical mechanics. This limitation results in a significant oversight of quantum mechanical effects, such as quantized (discrete) energy level structures, which offer a more accurate estimation of molecular energy and can be experimentally measured through energy spectra. In this paper, we propose to utilize the energy spectra to enhance the pre-training of 3D molecular representations (MolSpectra), thereby infusing the knowledge of quantum mechanics into the molecular representations. Specifically, we propose SpecFormer, a multi-spectrum encoder for encoding molecular spectra via masked patch reconstruction. By further aligning outputs from the 3D encoder and spectrum encoder using a contrastive objective, we enhance the 3D encoder's understanding of molecules. Evaluations on public benchmarks reveal that our pre-trained representations surpass existing methods in predicting molecular properties and modeling dynamics.
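
The two training signals named in the abstract, masked patch reconstruction over spectra and contrastive alignment between the 3D and spectrum encoders, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`patchify`, `random_mask`, `masked_reconstruction_loss`, `info_nce`), the patch length, stride, mask ratio, and temperature are all hypothetical choices for illustration, the encoders themselves are not modeled, and InfoNCE is assumed here as one common form of contrastive objective.

```python
import numpy as np

def patchify(spectrum, patch_len, stride):
    """Split a 1D spectrum into (possibly overlapping) fixed-length patches."""
    starts = range(0, len(spectrum) - patch_len + 1, stride)
    return np.stack([spectrum[s:s + patch_len] for s in starts])

def random_mask(num_patches, mask_ratio, rng):
    """Boolean mask; True marks patches that would be hidden from the encoder."""
    num_masked = max(1, int(round(mask_ratio * num_patches)))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask

def masked_reconstruction_loss(patches, predicted, mask):
    """Mean squared error computed on the masked patches only."""
    return float(np.mean((predicted[mask] - patches[mask]) ** 2))

def info_nce(z_3d, z_spec, temperature=0.1):
    """Symmetric InfoNCE loss (an assumed choice of contrastive objective).

    Row i of z_3d and row i of z_spec are embeddings of the same molecule
    (the positive pair); every other row in the batch acts as a negative.
    """
    a = z_3d / np.linalg.norm(z_3d, axis=1, keepdims=True)
    b = z_spec / np.linalg.norm(z_spec, axis=1, keepdims=True)
    logits = a @ b.T / temperature

    def cross_entropy_on_diagonal(scores):
        scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return float(0.5 * (cross_entropy_on_diagonal(logits)
                        + cross_entropy_on_diagonal(logits.T)))

# Toy usage with random data standing in for real spectra and encoder outputs.
rng = np.random.default_rng(0)
spectrum = rng.random(64)                       # one synthetic spectrum
patches = patchify(spectrum, patch_len=8, stride=4)
mask = random_mask(len(patches), mask_ratio=0.3, rng=rng)
predicted = patches + 0.01 * rng.normal(size=patches.shape)  # stand-in decoder output
recon = masked_reconstruction_loss(patches, predicted, mask)

z_3d = rng.normal(size=(8, 16))                 # stand-in 3D encoder outputs
z_spec = z_3d + 0.05 * rng.normal(size=(8, 16)) # stand-in spectrum encoder outputs
align = info_nce(z_3d, z_spec)
total = recon + align                           # equal weighting is an assumption
```

In this sketch the reconstruction term trains only the spectrum side, while the alignment term is what would pass spectral information to the 3D encoder; how the two terms are weighted in practice is not stated in the abstract.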

