ChatPaper.ai
Daily AI Research Paper Picks
Daily curated AI research papers with translations
April 1st, 2025
MoCha: Towards Movie-Grade Talking Character Synthesis
Cong Wei, Bo Sun, Haoyu Ma, Ji Hou, Felix Juefei-Xu, Zecheng He, Xiaoliang Dai, Luxin Zhang, Kunpeng Li, Tingbo Hou, Animesh Sinha, Peter Vajda, Wenhu Chen
•
Mar 30, 2025
•
131
11
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du, Zhennan Chen, Zhizhou Chen, Shan Gao, Xi Chen, Zhengkai Jiang, Jian Yang, Ying Tai
•
Mar 30, 2025
•
95
3
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Jingcheng Hu, Yinmin Zhang, Qi Han, Daxin Jiang, Xiangyu Zhang, Heung-Yeung Shum
•
Mar 31, 2025
•
63
3
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
Qiyuan Zhang, Fuyuan Lyu, Zexu Sun, Lei Wang, Weixu Zhang, Zhihan Guo, Yufei Wang, Irwin King, Xue Liu, Chen Ma
•
Mar 31, 2025
•
53
2
Efficient Inference for Large Reasoning Models: A Survey
Yue Liu, Jiaying Wu, Yufei He, Hongcheng Gao, Hongyu Chen, Baolong Bi, Jiaheng Zhang, Zhiqi Huang, Bryan Hooi
•
Mar 29, 2025
•
46
3
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
Liang Pan, Zeshi Yang, Zhiyang Dou, Wenjia Wang, Buzhen Huang, Bo Dai, Taku Komura, Jingbo Wang
•
Mar 25, 2025
•
39
3
Unicorn: Text-Only Data Synthesis for Vision Language Model Training
Xiaomin Yu, Pengxiang Ding, Wenjie Zhang, Siteng Huang, Songyang Gao, Chengwei Qin, Kejian Wu, Zhaoxin Fan, Ziyue Qiao, Donglin Wang
•
Mar 28, 2025
•
38
2
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy
Zhonghan Zhao, Wenwei Zhang, Haian Huang, Kuikun Liu, Jianfei Gao, Gaoang Wang, Kai Chen
•
Mar 31, 2025
•
30
2
SketchVideo: Sketch-based Video Generation and Editing
Feng-Lin Liu, Hongbo Fu, Xintao Wang, Weicai Ye, Pengfei Wan, Di Zhang, Lin Gao
•
Mar 30, 2025
•
23
3
Effectively Controlling Reasoning Models through Thinking Intervention
Tong Wu, Chong Xiang, Jiachen T. Wang, Prateek Mittal
•
Mar 31, 2025
•
19
4
Expanding RL with Verifiable Rewards Across Diverse Domains
Yi Su, Dian Yu, Linfeng Song, Juntao Li, Haitao Mi, Zhaopeng Tu, Min Zhang, Dong Yu
•
Mar 31, 2025
•
19
2
Query and Conquer: Execution-Guided SQL Generation
Łukasz Borchmann, Marek Wydmuch
•
Mar 31, 2025
•
18
2
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
Zhiyuan Ma, Xinyue Liang, Rongyuan Wu, Xiangyu Zhu, Zhen Lei, Lei Zhang
•
Mar 27, 2025
•
16
2
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection
Zhiming Ma, Peidong Wang, Minhua Huang, Jingpeng Wang, Kai Wu, Xiangzhao Lv, Yachun Pang, Yin Yang, Wenjie Tang, Yuchen Kang
•
Mar 31, 2025
•
12
2
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
Jianguo Zhang, Thai Hoang, Ming Zhu, Zuxin Liu, Shiyu Wang, Tulika Awalgaonkar, Akshara Prabhakar, Haolin Chen, Weiran Yao, Zhiwei Liu, Juntao Tan, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong
•
Mar 28, 2025
•
12
2
Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code
Augusto B. Corrêa, André G. Pereira, Jendrik Seipp
•
Mar 24, 2025
•
10
1
AvatarArtist: Open-Domain 4D Avatarization
Hongyu Liu, Xuan Wang, Ziyu Wan, Yue Ma, Jingye Chen, Yanbo Fan, Yujun Shen, Yibing Song, Qifeng Chen
•
Mar 25, 2025
•
9
2
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training
Xingyu Chen, Yue Chen, Yuliang Xiu, Andreas Geiger, Anpei Chen
•
Mar 31, 2025
•
7
2
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
Xianglong He, Junyi Chen, Di Huang, Zexiang Liu, Xiaoshui Huang, Wanli Ouyang, Chun Yuan, Yangguang Li
•
Mar 29, 2025
•
7
2
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
•
Mar 28, 2025
•
6
2
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang, Munan Ning, Zheyuan Liu, Yanbo Wang, Jiayi Ye, Yue Huang, Shuo Yang, Xiao Chen, Yibing Song, Li Yuan
•
Mar 19, 2025
•
6
2
Entropy-Based Adaptive Weighting for Self-Training
Xiaoxuan Wang, Yihe Deng, Mingyu Derek Ma, Wei Wang
•
Mar 31, 2025
•
4
2
KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language
Yoonshik Kim, Jaeyoon Jung
•
Mar 31, 2025
•
4
2
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization
Zhenyu Liang, Hao Li, Naiwei Yu, Kebin Sun, Ran Cheng
•
Mar 26, 2025
•
4
3
PAVE: Patching and Adapting Video Large Language Models
Zhuoming Liu, Yiquan Li, Khoi Duc Nguyen, Yiwu Zhong, Yin Li
•
Mar 25, 2025
•
4
2
Decoupling Angles and Strength in Low-rank Adaptation
Massimo Bini, Leander Girrbach, Zeynep Akata
•
Mar 23, 2025
•
4
2
Understanding Co-speech Gestures in-the-wild
Sindhu B Hegde, K R Prajwal, Taein Kwon, Andrew Zisserman
•
Mar 28, 2025
•
1
2