PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion
December 23, 2024
Authors: Sophia Tang, Yinuo Zhang, Pranam Chatterjee
cs.AI
Abstract
Peptide therapeutics, a major class of medicines, have achieved remarkable
success across diseases such as diabetes and cancer, with landmark examples
such as GLP-1 receptor agonists revolutionizing the treatment of type-2
diabetes and obesity. Despite their success, designing peptides that satisfy
multiple conflicting objectives, such as target binding affinity, solubility,
and membrane permeability, remains a major challenge. Classical drug
development and structure-based design are ineffective for such tasks, as they
fail to optimize global functional properties critical for therapeutic
efficacy. Existing generative frameworks are largely limited to continuous
spaces, unconditioned outputs, or single-objective guidance, making them
unsuitable for discrete sequence optimization across multiple properties. To
address this, we present PepTune, a multi-objective discrete diffusion model
for the simultaneous generation and optimization of therapeutic peptide SMILES.
Built on the Masked Discrete Language Model (MDLM) framework, PepTune ensures
valid peptide structures with state-dependent masking schedules and
penalty-based objectives. To guide the diffusion process, we propose a Monte
Carlo Tree Search (MCTS)-based strategy that balances exploration and
exploitation to iteratively refine Pareto-optimal sequences. MCTS integrates
classifier-based rewards with search-tree expansion, overcoming gradient
estimation challenges and data sparsity inherent to discrete spaces. Using
PepTune, we generate diverse, chemically modified peptides optimized for
multiple therapeutic properties, including target binding affinity, membrane
permeability, solubility, hemolysis, and non-fouling characteristics on various
disease-relevant targets. In total, our results demonstrate that MCTS-guided
discrete diffusion is a powerful and modular approach for multi-objective
sequence design in discrete state spaces.
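To make the notion of "Pareto-optimal sequences" concrete, the following is a minimal sketch of non-dominated filtering over multi-objective scores. It is not PepTune's implementation: the function names `dominates` and `pareto_front`, the three objectives, and the score values are all hypothetical, and every objective is assumed to be oriented so that higher is better.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical per-candidate scores (illustrative values only):
# (binding affinity, solubility, membrane permeability)
candidates = [
    (0.9, 0.2, 0.5),
    (0.6, 0.8, 0.4),
    (0.5, 0.7, 0.3),  # dominated by the second candidate
    (0.9, 0.2, 0.4),  # dominated by the first candidate
]

front = pareto_front(candidates)
print(front)  # [0, 1]
```

The first two candidates trade binding affinity against solubility, so neither dominates the other and both stay on the front. A search procedure such as the MCTS strategy described in the abstract would maintain a set like this while exploring new sequences.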