PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion

December 23, 2024
Authors: Sophia Tang, Yinuo Zhang, Pranam Chatterjee
cs.AI

Abstract

Peptide therapeutics, a major class of medicines, have achieved remarkable success across diseases such as diabetes and cancer, with landmark examples including GLP-1 receptor agonists, which have revolutionized the treatment of type 2 diabetes and obesity. Despite this success, designing peptides that satisfy multiple conflicting objectives, such as target binding affinity, solubility, and membrane permeability, remains a major challenge. Classical drug development and structure-based design are ineffective for such tasks, as they fail to optimize global functional properties critical for therapeutic efficacy. Existing generative frameworks are largely limited to continuous spaces, unconditioned outputs, or single-objective guidance, making them unsuitable for discrete sequence optimization across multiple properties. To address this, we present PepTune, a multi-objective discrete diffusion model for the simultaneous generation and optimization of therapeutic peptide SMILES. Built on the Masked Diffusion Language Model (MDLM) framework, PepTune ensures valid peptide structures through state-dependent masking schedules and penalty-based objectives. To guide the diffusion process, we propose a Monte Carlo Tree Search (MCTS)-based strategy that balances exploration and exploitation to iteratively refine Pareto-optimal sequences. MCTS integrates classifier-based rewards with search-tree expansion, overcoming the gradient estimation challenges and data sparsity inherent to discrete spaces. Using PepTune, we generate diverse, chemically modified peptides optimized for multiple therapeutic properties, including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling characteristics, on various disease-relevant targets. Overall, our results demonstrate that MCTS-guided discrete diffusion is a powerful and modular approach for multi-objective sequence design in discrete state spaces.
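
As a rough illustration of what "Pareto-optimal" means here, the sketch below filters a set of candidates down to those that no other candidate beats on every objective at once. It is not PepTune's implementation: the function names `dominates` and `pareto_front`, the three-objective score tuples, and the numbers are all hypothetical, and higher scores are assumed to be better for every objective.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b on every objective
    and strictly better on at least one (higher is assumed better)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    front = []
    for i, s in enumerate(scores):
        dominated = any(
            dominates(t, s) for j, t in enumerate(scores) if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (binding affinity, solubility, permeability) scores
# for three candidate peptides; the values are made up.
candidates = [
    (0.9, 0.2, 0.5),  # strong binder, poorly soluble
    (0.6, 0.8, 0.4),  # weaker binder, very soluble
    (0.5, 0.1, 0.3),  # worse than the first candidate on every objective
]
front = pareto_front(candidates)
```

The first two candidates trade binding affinity against solubility, so neither dominates the other and both stay on the front; the third is dominated by the first and is dropped. In a search-based setting, a filter of this kind could decide which candidate sequences are retained between iterations.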

December 26, 2024