ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
April 4th, 2025
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
Bang Liu, Xinfeng Li, Jiayi Zhang, Jinlin Wang, Tanjin He, Sirui Hong, Hongzhang Liu, Shaokun Zhang, Kaitao Song, Kunlun Zhu, Yuheng Cheng, Suyuchen Wang, Xiaoqiang Wang, Yuyu Luo, Haibo Jin, Peiyan Zhang, Ollie Liu, Jiaqi Chen, Huan Zhang, Zhaoyang Yu, Haochen Shi, Boyan Li, Dekun Wu, Fengwei Teng, Xiaojun Jia, Jiawei Xu, Jinyu Xiang, Yizhang Lin, Tianming Liu, Tongliang Liu, Yu Su, Huan Sun, Glen Berseth, Jianyun Nie, Ian Foster, Logan Ward, Qingyun Wu, Yu Gu, Mingchen Zhuge, Xiangru Tang, Haohan Wang, Jiaxuan You, Chi Wang, Jian Pei, Qiang Yang, Xiaoliang Qi, Chenglin Wu
Mar 31, 2025 • 268 • 7
ZClip: Adaptive Spike Mitigation for LLM Pre-Training
Abhay Kumar, Louis Owen, Nilabhra Roy Chowdhury, Fabian Güra
Apr 3, 2025 • 76 • 2
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Xiangyu Zhao, Peiyuan Zhang, Kexian Tang, Hao Li, Zicheng Zhang, Guangtao Zhai, Junchi Yan, Hua Yang, Xue Yang, Haodong Duan
Apr 3, 2025 • 67 • 2
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan, Junyan Ye, Weijia Li, Zilong Huang, Shenghai Yuan, Xiangyang He, Kaiqing Lin, Jun He, Conghui He, Li Yuan
Apr 3, 2025 • 56 • 3
Inference-Time Scaling for Generalist Reward Modeling
Zijun Liu, Peiyi Wang, Runxin Xu, Shirong Ma, Chong Ruan, Peng Li, Yang Liu, Yu Wu
Apr 3, 2025 • 54 • 6
JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization
Kai Liu, Wei Li, Lai Chen, Shengqiong Wu, Yanhao Zheng, Jiayi Ji, Fan Zhou, Rongxin Jiang, Jiebo Luo, Hao Fei, Tat-Seng Chua
Mar 30, 2025 • 54 • 4
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong, Zunnan Xu, Zixiang Zhou, Jun Zhou, Xiu Li, Qin Lin, Qinglin Lu, Dan Xu
Apr 3, 2025 • 43 • 7
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei, Debang Li, Di Qiu, Jiahua Wang, Yikun Dou, Rui Wang, Jingtao Xu, Mingyuan Fan, Guibin Chen, Yang Li, Yahui Zhou
Apr 3, 2025 • 36 • 3
WikiVideo: Article Generation from Multiple Videos
Alexander Martin, Reno Kriz, William Gantt Walden, Kate Sanders, Hannah Recknor, Eugene Yang, Francis Ferraro, Benjamin Van Durme
Apr 1, 2025 • 36 • 3
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
Yan Ma, Steffi Chern, Xuyang Shen, Yiran Zhong, Pengfei Liu
Apr 3, 2025 • 30 • 3
Scaling Analysis of Interleaved Speech-Text Language Models
Gallil Maimon, Michael Hassid, Amit Roth, Yossi Adi
Apr 3, 2025 • 28 • 2
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers
Qianhao Yuan, Qingyu Zhang, Yanjiang Liu, Jiawei Chen, Yaojie Lu, Hongyu Lin, Jia Zheng, Xianpei Han, Le Sun
Apr 1, 2025 • 21 • 2
FreSca: Unveiling the Scaling Space in Diffusion Models
Chao Huang, Susan Liang, Yunlong Tang, Li Ma, Yapeng Tian, Chenliang Xu
Apr 2, 2025 • 19 • 2
Efficient Model Selection for Time Series Forecasting via LLMs
Wang Wei, Tiankai Yang, Hongjie Chen, Ryan A. Rossi, Yue Zhao, Franck Dernoncourt, Hoda Eldardiry
Apr 2, 2025 • 16 • 2
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
Wasi Uddin Ahmad, Sean Narenthiran, Somshubra Majumdar, Aleksander Ficek, Siddhartha Jain, Jocelyn Huang, Vahid Noroozi, Boris Ginsburg
Apr 2, 2025 • 15 • 3
Interpreting Emergent Planning in Model-Free Reinforcement Learning
Thomas Bush, Stephen Chung, Usman Anwar, Adrià Garriga-Alonso, David Krueger
Apr 2, 2025 • 12 • 2
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
Jian Zhao, Runze Liu, Kaiyan Zhang, Zhimu Zhou, Junqi Gao, Dong Li, Jiafei Lyu, Zhouyi Qian, Biqing Qi, Xiu Li, Bowen Zhou
Apr 1, 2025 • 12 • 3
Scaling Laws in Scientific Discovery with AI and Robot Scientists
Pengsong Zhang, Heng Zhang, Huazhe Xu, Renjun Xu, Zhenting Wang, Cong Wang, Animesh Garg, Zhibin Li, Arash Ajoudani, Xinyu Liu
Mar 28, 2025 • 12 • 2
NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations
Zhenyu Tang, Chaoran Feng, Xinhua Cheng, Wangbo Yu, Junwu Zhang, Yuan Liu, Xiaoxiao Long, Wenping Wang, Li Yuan
Mar 29, 2025 • 11 • 2
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
Mateusz Pach, Shyamgopal Karthik, Quentin Bouniot, Serge Belongie, Zeynep Akata
Apr 3, 2025 • 10 • 2
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages
Xabier de Zuazo, Eva Navas, Ibon Saratxaga, Inma Hernáez Rioja
Mar 30, 2025 • 10 • 3
Instruction-Guided Autoregressive Neural Network Parameter Generation
Soro Bedionita, Bruno Andreis, Song Chong, Sung Ju Hwang
Apr 2, 2025 • 6 • 2
Scene-Centric Unsupervised Panoptic Segmentation
Oliver Hahn, Christoph Reich, Nikita Araslanov, Daniel Cremers, Christian Rupprecht, Stefan Roth
Apr 2, 2025 • 5 • 3