

fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction

September 17, 2024
Authors: Jianxiong Gao, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng, Yanwei Fu
cs.AI

Abstract

Reconstructing 3D visuals from functional Magnetic Resonance Imaging (fMRI) data, introduced as Recon3DMind in our conference work, is of significant interest to both cognitive neuroscience and computer vision. To advance this task, we present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4768 3D objects. The dataset comprises two components: fMRI-Shape, previously introduced and accessible at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Shape, and fMRI-Objaverse, proposed in this paper and available at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Objaverse. fMRI-Objaverse includes data from 5 subjects, 4 of whom are also part of the Core set in fMRI-Shape, with each subject viewing 3142 3D objects across 117 categories, all accompanied by text captions. This significantly enhances the diversity and potential applications of the dataset. Additionally, we propose MinD-3D, a novel framework designed to decode 3D visual information from fMRI signals. The framework first extracts and aggregates features from fMRI data using a neuro-fusion encoder, then employs a feature-bridge diffusion model to generate visual features, and finally reconstructs the 3D object using a generative transformer decoder. We establish new benchmarks by designing metrics at both semantic and structural levels to evaluate model performance. Furthermore, we assess our model's effectiveness in an Out-of-Distribution setting and analyze the attribution of the extracted features and the visual ROIs in fMRI signals. Our experiments demonstrate that MinD-3D not only reconstructs 3D objects with high semantic and spatial accuracy but also deepens our understanding of how the human brain processes 3D visual information. Project page at: https://jianxgao.github.io/MinD-3D.
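Since both dataset components are hosted as standard Hugging Face dataset repositories, they can be fetched with the generic `huggingface_hub` snapshot API. A minimal sketch, assuming only the repo IDs given above (the abstract does not specify the file layout inside each repo, so inspect the downloaded directories):

```python
from huggingface_hub import snapshot_download

# Fetch the two components of fMRI-3D as raw repo snapshots; the returned
# paths point to local copies whose internal layout should be inspected.
shape_dir = snapshot_download(repo_id="Fudan-fMRI/fMRI-Shape", repo_type="dataset")
objaverse_dir = snapshot_download(repo_id="Fudan-fMRI/fMRI-Objaverse", repo_type="dataset")
```

The abstract also outlines MinD-3D as a three-stage pipeline: a neuro-fusion encoder, a feature-bridge diffusion model, and a generative transformer decoder. The sketch below only illustrates that data flow; all module names, tensor shapes, and hyperparameters are hypothetical and are not the authors' implementation (see the project page for that):

```python
import torch
import torch.nn as nn

class NeuroFusionEncoder(nn.Module):
    """Stage 1 (hypothetical): extract and aggregate features from multi-frame fMRI."""
    def __init__(self, n_voxels: int, d_model: int = 768, n_layers: int = 4):
        super().__init__()
        self.proj = nn.Linear(n_voxels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, fmri_frames: torch.Tensor) -> torch.Tensor:
        # fmri_frames: (batch, n_frames, n_voxels) -> aggregated brain features
        return self.fuse(self.proj(fmri_frames))

def reconstruct_3d(fmri_frames, encoder, bridge_diffusion, transformer_decoder):
    """End-to-end data flow of the three stages named in the abstract.

    `bridge_diffusion` and `transformer_decoder` are opaque callables here:
    stage 2 maps brain features to visual features via a diffusion model,
    and stage 3 decodes visual features into a 3D object representation.
    """
    brain_feats = encoder(fmri_frames)            # stage 1: fMRI -> brain features
    visual_feats = bridge_diffusion(brain_feats)  # stage 2: brain -> visual features
    return transformer_decoder(visual_feats)      # stage 3: features -> 3D object
```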

