Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry
November 20, 2024
Authors: Yoel Zimmermann, Adib Bazgir, Zartashia Afzal, Fariha Agbere, Qianxiang Ai, Nawaf Alampara, Alexander Al-Feghali, Mehrad Ansari, Dmytro Antypov, Amro Aswad, Jiaru Bai, Viktoriia Baibakova, Devi Dutta Biswajeet, Erik Bitzek, Joshua D. Bocarsly, Anna Borisova, Andres M Bran, L. Catherine Brinson, Marcel Moran Calderon, Alessandro Canalicchio, Victor Chen, Yuan Chiang, Defne Circi, Benjamin Charmes, Vikrant Chaudhary, Zizhang Chen, Min-Hsueh Chiu, Judith Clymo, Kedar Dabhadkar, Nathan Daelman, Archit Datar, Matthew L. Evans, Maryam Ghazizade Fard, Giuseppe Fisicaro, Abhijeet Sadashiv Gangan, Janine George, Jose D. Cojal Gonzalez, Michael Götte, Ankur K. Gupta, Hassan Harb, Pengyu Hong, Abdelrahman Ibrahim, Ahmed Ilyas, Alishba Imran, Kevin Ishimwe, Ramsey Issa, Kevin Maik Jablonka, Colin Jones, Tyler R. Josephson, Greg Juhasz, Sarthak Kapoor, Rongda Kang, Ghazal Khalighinejad, Sartaaj Khan, Sascha Klawohn, Suneel Kuman, Alvin Noe Ladines, Sarom Leang, Magdalena Lederbauer, Sheng-Lun Mark Liao, Hao Liu, Xuefeng Liu, Stanley Lo, Sandeep Madireddy, Piyush Ranjan Maharana, Shagun Maheshwari, Soroush Mahjoubi, José A. Márquez, Rob Mills, Trupti Mohanty, Bernadette Mohr, Seyed Mohamad Moosavi, Alexander Moßhammer, Amirhossein D. Naghdi, Aakash Naik, Oleksandr Narykov, Hampus Näsström, Xuan Vu Nguyen, Xinyi Ni, Dana O'Connor, Teslim Olayiwola, Federico Ottomano, Aleyna Beste Ozhan, Sebastian Pagel, Chiku Parida, Jaehee Park, Vraj Patel, Elena Patyukova, Martin Hoffmann Petersen, Luis Pinto, José M. Pizarro, Dieter Plessers, Tapashree Pradhan, Utkarsh Pratiush, Charishma Puli, Andrew Qin, Mahyar Rajabi, Francesco Ricci, Elliot Risch, Martiño Ríos-García, Aritra Roy, Tehseen Rug, Hasan M Sayeed, Markus Scheidgen, Mara Schilling-Wilhelmi, Marcel Schloz, Fabian Schöppach, Julia Schumann, Philippe Schwaller, Marcus Schwarting, Samiha Sharlin, Kevin Shen, Jiale Shi, Pradip Si, Jennifer D'Souza, Taylor Sparks, Suraj Sudhakar, Leopold Talirz, Dandan Tang, Olga Taran, Carla Terboven, Mark Tropin, Anastasiia Tsymbal, Katharina Ueltzen, Pablo Andres Unzueta, Archit Vasan, Tirtha Vinchurkar, Trung Vo, Gabriel Vogel, Christoph Völker, Jan Weinreich, Faradawn Yang, Mohd Zaki, Chi Zhang, Sylvester Zhang, Weijie Zhang, Ruijie Zhu, Shang Zhu, Jan Janssen, Ian Foster, Ben Blaiszik
cs.AI
Abstract
Here, we present the outcomes from the second Large Language Model (LLM)
Hackathon for Applications in Materials Science and Chemistry, which engaged
participants across global hybrid locations, resulting in 34 team submissions.
The submissions spanned seven key application areas and demonstrated the
diverse utility of LLMs for applications in (1) molecular and material property
prediction; (2) molecular and material design; (3) automation and novel
interfaces; (4) scientific communication and education; (5) research data
management and automation; (6) hypothesis generation and evaluation; and (7)
knowledge extraction and reasoning from scientific literature. Each team
submission is presented in a summary table with links to the code and as brief
papers in the appendix. Beyond team results, we discuss the hackathon event and
its hybrid format, which included physical hubs in Toronto, Montreal, San
Francisco, Berlin, Lausanne, and Tokyo, alongside a global online hub to enable
local and virtual collaboration. Overall, the event highlighted significant
improvements in LLM capabilities since the previous year's hackathon,
suggesting continued expansion of LLMs for applications in materials science
and chemistry research. These outcomes demonstrate the dual utility of LLMs as
both multipurpose models for diverse machine learning tasks and platforms for
rapid prototyping of custom applications in scientific research.