A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following
January 14, 2025
Authors: Yin Fang, Xinle Deng, Kangwei Liu, Ningyu Zhang, Jingyang Qian, Penghui Yang, Xiaohui Fan, Huajun Chen
cs.AI
Abstract
Large language models excel at interpreting complex natural language
instructions, enabling them to perform a wide range of tasks. In the life
sciences, single-cell RNA sequencing (scRNA-seq) data serves as the "language
of cellular biology", capturing intricate gene expression patterns at the
single-cell level. However, interacting with this "language" through
conventional tools is often inefficient and unintuitive, posing challenges for
researchers. To address these limitations, we present InstructCell, a
multi-modal AI copilot that leverages natural language as a medium for more
direct and flexible single-cell analysis. We construct a comprehensive
multi-modal instruction dataset that pairs text-based instructions with
scRNA-seq profiles from diverse tissues and species. Building on this, we
develop a multi-modal cell language architecture capable of simultaneously
interpreting and processing both modalities. InstructCell empowers researchers
to accomplish critical tasks, such as cell type annotation, conditional
pseudo-cell generation, and drug sensitivity prediction, using straightforward
natural language commands. Extensive evaluations demonstrate that InstructCell
consistently meets or exceeds the performance of existing single-cell
foundation models, while adapting to diverse experimental conditions. More
importantly, InstructCell provides an accessible and intuitive tool for
exploring complex single-cell data, lowering technical barriers and enabling
deeper biological insights.