ChatPaper.ai
Daily AI Research Paper Picks
Daily curated AI research papers with translations
November 28th, 2024
ROICtrl: Boosting Instance Control for Visual Generation
Yuchao Gu, Yipin Zhou, Yunfan Ye, Yixin Nie, Licheng Yu, Pingchuan Ma, Kevin Qinghong Lin, Mike Zheng Shou
•
Nov 27, 2024
•
71
2
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment
Dongping Chen, Ruoxi Chen, Shu Pu, Zhaoyi Liu, Yanru Wu, Caixi Chen, Benlin Liu, Yue Huang, Yao Wan, Pan Zhou, Ranjay Krishna
•
Nov 26, 2024
•
19
2
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan
•
Nov 26, 2024
•
13
3
MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
Sankalp Sinha, Mohammad Sadil Khan, Muhammad Usama, Shino Sam, Didier Stricker, Sk Aziz Ali, Muhammad Zeshan Afzal
•
Nov 26, 2024
•
21
4
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
Rundi Wu, Ruiqi Gao, Ben Poole, Alex Trevithick, Changxi Zheng, Jonathan T. Barron, Aleksander Holynski
•
Nov 27, 2024
•
57
5
Large Language Model-Brained GUI Agents: A Survey
Chaoyun Zhang, Shilin He, Jiaxu Qian, Bowen Li, Liqun Li, Si Qin, Yu Kang, Minghua Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
•
Nov 27, 2024
•
32
3
3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes
Jan Held, Renaud Vandeghen, Abdullah Hamdi, Adrien Deliege, Anthony Cioppa, Silvio Giancola, Andrea Vedaldi, Bernard Ghanem, Marc Van Droogenbroeck
•
Nov 22, 2024
•
17
5
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Shengqu Cai, Eric Chan, Yunzhi Zhang, Leonidas Guibas, Jiajun Wu, Gordon Wetzstein
•
Nov 27, 2024
•
16
6
DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving
Bencheng Liao, Shaoyu Chen, Haoran Yin, Bo Jiang, Cheng Wang, Sixu Yan, Xinbang Zhang, Xiangyu Li, Ying Zhang, Qian Zhang, Xinggang Wang
•
Nov 22, 2024
•
15
2
Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters
Zhiyang Guo, Jinxu Xiang, Kai Ma, Wengang Zhou, Houqiang Li, Ran Zhang
•
Nov 27, 2024
•
14
4
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
Zigeng Chen, Xinyin Ma, Gongfan Fang, Xinchao Wang
•
Nov 26, 2024
•
12
2
DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching
Emanuele Aiello, Umberto Michieli, Diego Valsesia, Mete Ozay, Enrico Magli
•
Nov 26, 2024
•
12
3
UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing
Yiheng Li, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen
•
Nov 25, 2024
•
11
4
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang, Gen Luo, Yuqin Yang, Yuda Xiong, Yihao Chen, Zhaoyang Zeng, Tianhe Ren, Lei Zhang
•
Nov 27, 2024
•
10
3
Video-Guided Foley Sound Generation with Multimodal Controls
Ziyang Chen, Prem Seetharaman, Bryan Russell, Oriol Nieto, David Bourgin, Andrew Owens, Justin Salamon
•
Nov 26, 2024
•
10
2
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis
Xinyu Hou, Zongsheng Yue, Xiaoming Li, Chen Change Loy
•
Nov 26, 2024
•
7
2
Draft Model Knows When to Stop: A Self-Verification Length Policy for Speculative Decoding
Ziyin Zhang, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Rui Wang, Zhaopeng Tu
•
Nov 27, 2024
•
6
2
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
Yueqian Wang, Xiaojun Meng, Yuxuan Wang, Jianxin Liang, Jiansheng Wei, Huishuai Zhang, Dongyan Zhao
•
Nov 27, 2024
•
5
2
Optimizing Brain Tumor Segmentation with MedNeXt: BraTS 2024 SSA and Pediatrics
Sarim Hashmi, Juan Lugo, Abdelrahman Elsayed, Dinesh Saggurthi, Mohammed Elseiagy, Alikhan Nurkamal, Jaskaran Walia, Fadillah Adamsyah Maani, Mohammad Yaqub
•
Nov 24, 2024
•
5
2
Adaptive Blind All-in-One Image Restoration
David Serrano-Lozano, Luis Herranz, Shaolin Su, Javier Vazquez-Corral
•
Nov 27, 2024
•
4
2
Training and Evaluating Language Models with Template-based Data Generation
Yifan Zhang
•
Nov 27, 2024
•
3
3
Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing
Hanhui Wang, Yihua Zhang, Ruizheng Bai, Yue Zhao, Sijia Liu, Zhengzhong Tu
•
Nov 25, 2024
•
2
3