PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion
December 23, 2024
Authors: Sophia Tang, Yinuo Zhang, Pranam Chatterjee
cs.AI
Abstract
Peptide therapeutics, a major class of medicines, have achieved remarkable
success across diseases such as diabetes and cancer; GLP-1 receptor agonists, a
landmark example, have revolutionized the treatment of type 2
diabetes and obesity. Despite their success, designing peptides that satisfy
multiple conflicting objectives, such as target binding affinity, solubility,
and membrane permeability, remains a major challenge. Classical drug
development and structure-based design are ineffective for such tasks, as they
fail to optimize global functional properties critical for therapeutic
efficacy. Existing generative frameworks are largely limited to continuous
spaces, unconditioned outputs, or single-objective guidance, making them
unsuitable for discrete sequence optimization across multiple properties. To
address this, we present PepTune, a multi-objective discrete diffusion model
for the simultaneous generation and optimization of therapeutic peptide SMILES.
Built on the Masked Discrete Language Model (MDLM) framework, PepTune ensures
valid peptide structures with state-dependent masking schedules and
penalty-based objectives. To guide the diffusion process, we propose a Monte
Carlo Tree Search (MCTS)-based strategy that balances exploration and
exploitation to iteratively refine Pareto-optimal sequences. MCTS integrates
classifier-based rewards with search-tree expansion, overcoming gradient
estimation challenges and data sparsity inherent to discrete spaces. Using
PepTune, we generate diverse, chemically modified peptides optimized for
multiple therapeutic properties, including target binding affinity, membrane
permeability, solubility, hemolysis, and non-fouling characteristics on various
disease-relevant targets. Overall, our results demonstrate that MCTS-guided
discrete diffusion is a powerful and modular approach for multi-objective
sequence design in discrete state spaces.
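
To make the notion of Pareto-optimal sequences concrete, the following is a minimal sketch of non-dominated filtering over multi-objective scores. It is not PepTune's implementation: the function names `dominates` and `pareto_front`, the score values, and the convention that higher is better on every objective are all assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (assumes higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores per candidate: (binding affinity, solubility, permeability).
candidates = [
    (0.9, 0.2, 0.5),
    (0.7, 0.6, 0.6),
    (0.6, 0.5, 0.5),  # dominated by the second candidate
    (0.3, 0.9, 0.4),
]

front = pareto_front(candidates)
print(front)  # indices of the non-dominated candidates
```

A guided search of the kind the abstract describes would presumably maintain such a non-dominated set while it explores, keeping a candidate only if no other candidate is at least as good on every property.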