Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models
October 16, 2024
Authors: Luis Barroso-Luque, Muhammed Shuaibi, Xiang Fu, Brandon M. Wood, Misko Dzamba, Meng Gao, Ammar Rizvi, C. Lawrence Zitnick, Zachary W. Ulissi
cs.AI
Abstract
The ability to discover new materials with desirable properties is critical
for numerous applications, from helping mitigate climate change to advancing
next-generation computing hardware. AI has the potential to accelerate
materials discovery and design by exploring chemical space more effectively
than other computational methods or trial and error. While
substantial progress has been made on AI for materials data, benchmarks, and
models, a barrier that has emerged is the lack of publicly available training
data and open pre-trained models. To address this, we present a Meta FAIR
release of the Open Materials 2024 (OMat24) large-scale open dataset and an
accompanying set of pre-trained models. OMat24 contains over 110 million
density functional theory (DFT) calculations focused on structural and
compositional diversity. Our EquiformerV2 models achieve state-of-the-art
performance on the Matbench Discovery leaderboard and are capable of predicting
ground-state stability and formation energies to an F1 score above 0.9 and an
accuracy of 20 meV/atom, respectively. We explore the impact of model size,
auxiliary denoising objectives, and fine-tuning on performance across a range
of datasets including OMat24, MPtraj, and Alexandria. The open release of the
OMat24 dataset and models enables the research community to build upon our
efforts and drive further advancements in AI-assisted materials science.
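
To make the two headline metrics concrete, the following is a minimal sketch of how an F1 score for stability classification and a mean absolute error in meV/atom could be computed. It is not the authors' evaluation code: the function names, the toy numbers, and the convention that a material counts as stable when its energy above the convex hull is at most 0 eV/atom are all assumptions made for illustration.

```python
from typing import Sequence


def stability_f1(true_e_hull: Sequence[float],
                 pred_e_hull: Sequence[float],
                 threshold: float = 0.0) -> float:
    """F1 score for the binary 'stable' label.

    Assumption: a structure is labeled stable when its energy above
    the convex hull (eV/atom) is <= threshold.
    """
    tp = fp = fn = 0
    for t, p in zip(true_e_hull, pred_e_hull):
        true_stable = t <= threshold
        pred_stable = p <= threshold
        if true_stable and pred_stable:
            tp += 1          # correctly predicted stable
        elif pred_stable:
            fp += 1          # predicted stable, actually unstable
        elif true_stable:
            fn += 1          # missed a stable structure
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mae_mev_per_atom(true_ev: Sequence[float],
                     pred_ev: Sequence[float]) -> float:
    """Mean absolute error, converted from eV/atom to meV/atom."""
    total = sum(abs(t - p) for t, p in zip(true_ev, pred_ev))
    return 1000.0 * total / len(true_ev)


# Hypothetical toy values (eV/atom), not taken from OMat24.
true_vals = [-0.05, 0.02, -0.01, 0.10, 0.00]
pred_vals = [-0.03, -0.01, -0.02, 0.08, 0.03]

f1 = stability_f1(true_vals, pred_vals)
mae = mae_mev_per_atom(true_vals, pred_vals)
print(f"F1 = {f1:.3f}, MAE = {mae:.1f} meV/atom")
```

On this toy data two stable structures are identified correctly, one unstable structure is predicted stable, and one stable structure is missed, so F1 is 2/3; the abstract's reported values (F1 above 0.9, 20 meV/atom) come from the full benchmark, not from anything shown here.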