CineBrain: A Large-Scale Multi-Modal Brain Dataset During Naturalistic Audiovisual Narrative Processing

Abstract

In this paper, we introduce CineBrain, the first large-scale dataset featuring simultaneous electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings during dynamic audiovisual stimulation. To exploit the complementary strengths of EEG's high temporal resolution and fMRI's deep-brain spatial coverage, CineBrain provides approximately six hours of narrative-driven content from the popular television series The Big Bang Theory for each of six participants. Building upon this unique dataset, we propose CineSync, an innovative multimodal decoding framework that integrates a Multi-Modal Fusion Encoder with a diffusion-based Neural Latent Decoder. Our approach effectively fuses EEG and fMRI signals, significantly improving the reconstruction quality of complex audiovisual stimuli. To facilitate rigorous evaluation, we introduce Cine-Benchmark, a comprehensive evaluation protocol that assesses reconstructions across semantic and perceptual dimensions. Experimental results demonstrate that CineSync achieves state-of-the-art video reconstruction performance and highlight our initial success in combining fMRI and EEG for reconstructing both video and audio stimuli. Project Page: https://jianxgao.github.io/CineBrain.
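Because EEG is sampled far more densely in time than fMRI, working with simultaneous recordings usually starts by pairing each fMRI volume with the EEG samples recorded during it. The sketch below illustrates that bookkeeping only. The function name, the repetition time, and the EEG sampling rate are hypothetical values chosen for illustration; they are not taken from the paper, and this is not the CineSync preprocessing pipeline.

```python
def eeg_windows_per_volume(n_volumes, tr_seconds, eeg_rate_hz):
    """Return one (start, stop) EEG sample range per fMRI volume.

    Assumes both recordings start at the same instant and that each
    fMRI volume spans exactly one repetition time (TR).
    """
    windows = []
    for volume in range(n_volumes):
        # Convert the volume's start and end time to EEG sample indices.
        start = round(volume * tr_seconds * eeg_rate_hz)
        stop = round((volume + 1) * tr_seconds * eeg_rate_hz)
        windows.append((start, stop))
    return windows


# Hypothetical acquisition parameters, for illustration only.
windows = eeg_windows_per_volume(n_volumes=3, tr_seconds=2.0, eeg_rate_hz=250)
for volume, (start, stop) in enumerate(windows):
    print(f"fMRI volume {volume}: EEG samples [{start}, {stop})")
```

Under these assumed parameters each fMRI volume is paired with 500 EEG samples, which gives a concrete sense of how much finer the EEG time axis is than the fMRI one.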

