CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM
November 7, 2024
Authors: Jingwei Xu, Chenyu Wang, Zibo Zhao, Wen Liu, Yi Ma, Shenghua Gao
cs.AI
Abstract
This paper aims to design a unified Computer-Aided Design (CAD) generation
system that can easily generate CAD models based on the user's inputs in the
form of textual description, images, point clouds, or even a combination of
them. Toward this goal, we introduce CAD-MLLM, the first system capable of
generating parametric CAD models conditioned on multimodal input.
Specifically, within the CAD-MLLM framework, we leverage the command sequences
of CAD models and then employ advanced large language models (LLMs) to align
the feature spaces of these diverse multimodal data and the CAD models'
vectorized representations. To facilitate model training, we design a
comprehensive data construction and annotation pipeline that equips each CAD
model with corresponding multimodal data. Our resulting dataset, named
Omni-CAD, is the first multimodal CAD dataset that contains a textual
description, multi-view images, points, and a command sequence for each CAD
model. It contains approximately 450K instances and their CAD construction
sequences. To thoroughly evaluate the quality of our generated CAD models, we
go beyond current evaluation metrics that focus on reconstruction quality by
introducing additional metrics that assess topology quality and surface
enclosure extent. Extensive experimental results demonstrate that CAD-MLLM
significantly outperforms existing conditional generative methods and remains
highly robust to noise and missing points. The project page and more
visualizations can be found at: https://cad-mllm.github.io/