Daily AI Research Paper Picks
A daily selection of AI research papers with translations
Kanana: Compute-efficient Bilingual Language Models
Kanana LLM Team, Yunju Bak, Hojin Lee, Minho Ryu, Jiyeon Ham, Seungjae Jung, Daniel Wontae Nam, Taegyeong Eo, Donghun Lee, Doohae Jung, Boseop Kim, Nayeon Kim, Jaesun Park, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Kyoung-Woon On, Seulye Baeg, Junrae Cho, Sunghee Jung, Jieun Kang, EungGyun Kim, Eunhwa Kim, Byeongil Ko, Daniel Lee, Minchul Lee, Miok Lee, Shinbok Lee, Gaeun Seo•Feb 26, 2025•632
GHOST 2.0: generative high-fidelity one shot transfer of heads
Alexander Groshev, Anastasiia Iashchenko, Pavel Paramonov, Denis Dimitrov, Andrey Kuznetsov•Feb 25, 2025•632
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding
Max Ku, Thomas Chong, Jonathan Leung, Krish Shah, Alvin Yu, Wenhu Chen•Feb 26, 2025•432
Towards an AI co-scientist
Juraj Gottweis, Wei-Hung Weng, Alexander Daryin, Tao Tu, Anil Palepu, Petar Sirkovic, Artiom Myaskovsky, Felix Weissenberger, Keran Rong, Ryutaro Tanno, Khaled Saab, Dan Popovici, Jacob Blum, Fan Zhang, Katherine Chou, Avinatan Hassidim, Burak Gokturk, Amin Vahdat, Pushmeet Kohli, Yossi Matias, Andrew Carroll, Kavita Kulkarni, Nenad Tomasev, Yuan Guan, Vikram Dhillon, Eeshit Dhaval Vaishnav, Byron Lee, Tiago R D Costa, José R Penadés, Gary Peltz, Yunhan Xu, Annalisa Pawlosky, Alan Karthikesalingam, Vivek Natarajan•Feb 26, 2025•432
Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance
Xueqing Peng, Triantafillos Papadopoulos, Efstathia Soufleri, Polydoros Giannouris, Ruoyu Xiang, Yan Wang, Lingfei Qian, Jimin Huang, Qianqian Xie, Sophia Ananiadou•Feb 26, 2025•322
Language Models' Factuality Depends on the Language of Inquiry
Tushar Aggarwal, Kumar Tanmay, Ayush Agrawal, Kumar Ayush, Hamid Palangi, Paul Pu Liang•Feb 25, 2025•302
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Yancheng He, Shilong Li, Jiaheng Liu, Weixun Wang, Xingyuan Bu, Ge Zhang, Zhongyuan Peng, Zhaoxiang Zhang, Wenbo Su, Bo Zheng•Feb 26, 2025•262
Rank1: Test-Time Compute for Reranking in Information Retrieval
Orion Weller, Kathryn Ricci, Eugene Yang, Andrew Yates, Dawn Lawrie, Benjamin Van Durme•Feb 25, 2025•252
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
Hao Peng, Yunjia Qi, Xiaozhi Wang, Zijun Yao, Bin Xu, Lei Hou, Juanzi Li•Feb 26, 2025•212
Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation
Shiven Sinha, Shashwat Goel, Ponnurangam Kumaraguru, Jonas Geiping, Matthias Bethge, Ameya Prabhu•Feb 26, 2025•192
Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs
Christoph Schuhmann, Gollam Rabby, Ameya Prabhu, Tawsif Ahmed, Andreas Hochlehnert, Huu Nguyen, Nick Akinci Heidrich, Ludwig Schmidt, Robert Kaczmarczyk, Sören Auer, Jenia Jitsev, Matthias Bethge•Feb 26, 2025•193
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator
Xiankang He, Dongyan Guo, Hongji Li, Ruibo Li, Ying Cui, Chi Zhang•Feb 26, 2025•115
VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model
Jiani Zheng, Lu Wang, Fangkai Yang, Chaoyun Zhang, Lingrui Mei, Wenjie Yin, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang•Feb 26, 2025•112
CritiQ: Mining Data Quality Criteria from Human Preferences
Honglin Guo, Kai Lv, Qipeng Guo, Tianyi Liang, Zhiheng Xi, Demin Song, Qiuyinzhe Zhang, Yu Sun, Kai Chen, Xipeng Qiu, Tao Gui•Feb 26, 2025•92
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura, Takuya Akiba, Kazuki Fujii, Yusuke Oda, Rio Yokota, Jun Suzuki•Feb 26, 2025•63
BIG-Bench Extra Hard
Mehran Kazemi, Bahare Fatemi, Hritik Bansal, John Palowitch, Chrysovalantis Anastasiou, Sanket Vaibhav Mehta, Lalit K. Jain, Virginia Aglietti, Disha Jindal, Peter Chen, Nishanth Dikkala, Gladys Tyen, Xin Liu, Uri Shalit, Silvia Chiappa, Kate Olszewska, Yi Tay, Vinh Q. Tran, Quoc V. Le, Orhan Firat•Feb 26, 2025•62
Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications
Marcus Yu Zhe Wee, Justin Juin Hng Wong, Lynus Lim, Joe Yu Wei Tan, Prannaya Gupta, Dillion Lim, En Hao Tew, Aloysius Keng Siew Han, Yong Zhi Lim•Feb 27, 2025•52
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users
Anikait Singh, Sheryl Hsu, Kyle Hsu, Eric Mitchell, Stefano Ermon, Tatsunori Hashimoto, Archit Sharma, Chelsea Finn•Feb 26, 2025•52
AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement
Zhexin Zhang, Leqi Lei, Junxiao Yang, Xijie Huang, Yida Lu, Shiyao Cui, Renmiao Chen, Qinglin Zhang, Xinyuan Wang, Hao Wang, Hao Li, Xianqi Lei, Chengwei Pan, Lei Sha, Hongning Wang, Minlie Huang•Feb 24, 2025•52
MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra
Liang Wang, Shaozhen Liu, Yu Rong, Deli Zhao, Qiang Liu, Shu Wu, Liang Wang•Feb 22, 2025•52
Towards Optimal Multi-draft Speculative Decoding
Zhengmian Hu, Tong Zheng, Vignesh Viswanathan, Ziyi Chen, Ryan A. Rossi, Yihan Wu, Dinesh Manocha, Heng Huang•Feb 26, 2025•42
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge and Reasoning Tasks
Yuntao Du, Kailin Jiang, Zhi Gao, Chenrui Shi, Zilong Zheng, Siyuan Qi, Qing Li•Feb 27, 2025•32
PosterSum: A Multimodal Benchmark for Scientific Poster Summarization
Rohit Saxena, Pasquale Minervini, Frank Keller•Feb 24, 2025•22
DOEI: Dual Optimization of Embedding Information for Attention-Enhanced Class Activation Maps
Hongjie Zhu, Zeyu Zhang, Guansong Pang, Xu Wang, Shimin Wen, Yu Bai, Daji Ergu, Ying Cai, Yang Zhao•Feb 21, 2025•22