Hallucinations Can Improve Large Language Models in Drug Discovery
January 23, 2025
Authors: Shuzhou Yuan, Michael Färber
cs.AI
Abstract
Researchers have raised concerns about hallucinations in Large Language Models
(LLMs), yet the potential of hallucinations in areas where creativity is vital,
such as drug discovery, merits exploration. In this paper, we propose the
hypothesis that hallucinations can improve LLMs in drug discovery. To verify
this hypothesis, we use LLMs to describe the SMILES string of molecules in
natural language and then incorporate these descriptions as part of the prompt
to address specific tasks in drug discovery. Evaluated on seven LLMs and five
classification tasks, our findings confirm the hypothesis: LLMs can achieve
better performance with text containing hallucinations. Notably, Llama-3.1-8B
achieves an 18.35% gain in ROC-AUC compared to the baseline without
hallucination. Furthermore, hallucinations generated by GPT-4o provide the most
consistent improvements across models. Additionally, we conduct empirical
analyses and a case study to investigate key factors affecting performance and
the underlying reasons. Our research sheds light on the potential use of
hallucinations for LLMs and offers new perspectives for future research
leveraging LLMs in drug discovery.
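
The pipeline the abstract describes (have an LLM describe a SMILES string, place that description in the prompt, score a classification task by ROC-AUC) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the function names `build_prompt`, `describe_stub`, and `roc_auc`, and the example question are all hypothetical, and `describe_stub` merely stands in for a call to a describing LLM.

```python
from typing import Sequence


def describe_stub(smiles: str) -> str:
    """Hypothetical stand-in for the LLM that describes a molecule in natural language.

    In the setting the abstract describes, this text may contain hallucinations.
    """
    return f"A molecule written as {smiles}."


def build_prompt(smiles: str, description: str, task_question: str) -> str:
    """Combine the SMILES string, its description, and the task question.

    The template below is an assumption made for illustration only.
    """
    return (
        f"SMILES: {smiles}\n"
        f"Description: {description}\n"
        f"Question: {task_question}\n"
        "Answer with Yes or No."
    )


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: ethanol and a placeholder task question.
prompt = build_prompt(
    "CCO",
    describe_stub("CCO"),
    "Does this molecule have the target property?",
)
```

Comparing `roc_auc` for prompts built with and without the `Description` line would mirror the with-hallucination versus baseline comparison the abstract reports; the pairwise formulation is chosen here only because it needs no external libraries.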