ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
April 15th, 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu, Weiyun Wang, Zhe Chen, Zhaoyang Liu, Shenglong Ye, Lixin Gu, Yuchen Duan, Hao Tian, Weijie Su, Jie Shao, Zhangwei Gao, Erfei Cui, Yue Cao, Yangzhou Liu, Weiye Xu, Hao Li, Jiahao Wang, Han Lv, Dengnian Chen, Songze Li, Yinan He, Tan Jiang, Jiapeng Luo, Yi Wang, Conghui He, Botian Shi, Xingcheng Zhang, Wenqi Shao, Junjun He, Yingtong Xiong, Wenwen Qu, Peng Sun, Penglong Jiao, Lijun Wu, Kaipeng Zhang, Huipeng Deng, Jiaye Ge, Kai Chen, Limin Wang, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang
•
Apr 14, 2025
•
222
8
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters
Zonghang Li, Tao Li, Wenjiao Feng, Mohsen Guizani, Hongfang Yu
•
Apr 7, 2025
•
112
7
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability
Ning Li, Jingran Zhang, Justin Cui
•
Apr 9, 2025
•
44
2
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang, Chao Qu, Zuming Huang, Wei Chu, Fangzhen Lin, Wenhu Chen
•
Apr 10, 2025
•
39
2
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding
Zheng Liu, Mengjie Liu, Jingzhou Chen, Jingwei Xu, Bin Cui, Conghui He, Wentao Zhang
•
Apr 14, 2025
•
36
3
Iterative Self-Training for Code Generation via Reinforced Re-Ranking
Nikita Sorokin, Ivan Sedykh, Valentin Malykh
•
Apr 13, 2025
•
31
2
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Yang Shi, Jiaheng Liu, Yushuo Guan, Zhenhua Wu, Yuanxing Zhang, Zihao Wang, Weihong Lin, Jingyun Hua, Zekun Wang, Xinlong Chen, Bohan Zeng, Wentao Zhang, Fuzheng Zhang, Wenjing Yang, Di Zhang
•
Apr 14, 2025
•
28
2
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Xing Han Lù, Amirhossein Kazemnejad, Nicholas Meade, Arkil Patel, Dongchan Shin, Alejandra Zambrano, Karolina Stańczak, Peter Shaw, Christopher J. Pal, Siva Reddy
•
Apr 11, 2025
•
24
2
S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models
Wenyuan Zhang, Shuaiyi Nie, Xinghua Zhang, Zefeng Zhang, Tingwen Liu
•
Apr 14, 2025
•
19
3
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training
Zhenting Wang, Guofeng Cui, Kun Wan, Wentian Zhao
•
Apr 13, 2025
•
16
2
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang, Zichen Ding, Chang Ma, Zijie Chen, Qiushi Sun, Zhenzhong Lan, Junxian He
•
Apr 14, 2025
•
15
2
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao, Isaac Chung, Imene Kerboua, Jamie Stirling, Xin Zhang, Márton Kardos, Roman Solomatin, Noura Al Moubayed, Kenneth Enevoldsen, Niklas Muennighoff
•
Apr 14, 2025
•
14
2
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
Xingjian Zhang, Siwei Wen, Wenjun Wu, Lei Huang
•
Apr 13, 2025
•
14
3
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users
Xinnong Zhang, Jiayu Lin, Xinyi Mou, Shiyue Yang, Xiawei Liu, Libo Sun, Hanjia Lyu, Yihang Yang, Weihong Qi, Yue Chen, Guanying Li, Ling Yan, Yao Hu, Siming Chen, Yu Wang, Jingxuan Huang, Jiebo Luo, Shiping Tang, Libo Wu, Baohua Zhou, Zhongyu Wei
•
Apr 14, 2025
•
12
3
Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems
Zaid Khan, Elias Stengel-Eskin, Archiki Prasad, Jaemin Cho, Mohit Bansal
•
Apr 14, 2025
•
12
2
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search
Yikun Wang, Siyin Wang, Qinyuan Cheng, Zhaoye Fei, Liang Ding, Qipeng Guo, Dacheng Tao, Xipeng Qiu
•
Apr 12, 2025
•
10
4
Reasoning Models Can Be Effective Without Thinking
Wenjie Ma, Jingxuan He, Charlie Snell, Tyler Griggs, Sewon Min, Matei Zaharia
•
Apr 14, 2025
•
9
2
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
Yutaro Yamada, Robert Tjarko Lange, Cong Lu, Shengran Hu, Chris Lu, Jakob Foerster, Jeff Clune, David Ha
•
Apr 10, 2025
•
9
2
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Junxiong Wang, Wen-Ding Li, Daniele Paliotta, Daniel Ritter, Alexander M. Rush, Tri Dao
•
Apr 14, 2025
•
7
2
LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models
Parshin Shojaee, Ngoc-Hieu Nguyen, Kazem Meidani, Amir Barati Farimani, Khoa D Doan, Chandan K Reddy
•
Apr 14, 2025
•
7
2
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu, Yinghui He, Xinzhe Juan, Yiming Wang, Yuhan Liu, Zixin Yao, Yue Wu, Xun Jiang, Ling Yang, Mengdi Wang
•
Apr 13, 2025
•
6
3
How new data permeates LLM knowledge and how to dilute it
Chen Sun, Renat Aksitov, Andrey Zhmoginov, Nolan Andrew Miller, Max Vladymyrov, Ulrich Rueckert, Been Kim, Mark Sandler
•
Apr 13, 2025
•
5
2
3D CoCa: Contrastive Learners are 3D Captioners
Ting Huang, Zeyu Zhang, Yemin Wang, Hao Tang
•
Apr 13, 2025
•
4
2
LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models
Minqian Liu, Zhiyang Xu, Xinyi Zhang, Heajun An, Sarvech Qadir, Qi Zhang, Pamela J. Wisniewski, Jin-Hee Cho, Sang Won Lee, Ruoxi Jia, Lifu Huang
•
Apr 14, 2025
•
3
2
DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization?
Daniil Larionov, Sotaro Takeshita, Ran Zhang, Yanran Chen, Christoph Leiter, Zhipin Wang, Christian Greisinger, Steffen Eger
•
Apr 10, 2025
•
3
2
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
Pengfei Zhou, Fanrui Zhang, Xiaopeng Peng, Zhaopan Xu, Jiaxin Ai, Yansheng Qiu, Chuanhao Li, Zhen Li, Ming Li, Yukang Feng, Jianwen Sun, Haoquan Zhang, Zizhen Li, Xiaofeng Mao, Wangbo Zhao, Kai Wang, Xiaojun Chang, Wenqi Shao, Yang You, Kaipeng Zhang
•
Apr 8, 2025
•
3
2
MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits
Brandon Radosevich, John Halloran
•
Apr 2, 2025
•
2
2
DiffuMural: Restoring Dunhuang Murals with Multi-scale Diffusion
Puyu Han, Jiaju Kang, Yuhang Pan, Erting Pan, Zeyu Zhang, Qunchao Jin, Juntao Jiang, Zhichen Liu, Luqi Gong
•
Apr 13, 2025
•
0
2