ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
February 6th, 2025
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Loubna Ben Allal, Anton Lozhkov, Elie Bakouch, Gabriel Martín Blázquez, Guilherme Penedo, Lewis Tunstall, Andrés Marafioti, Hynek Kydlíček, Agustín Piqueres Lajarín, Vaibhav Srivastav, Joshua Lochner, Caleb Fahlgren, Xuan-Son Nguyen, Clémentine Fourrier, Ben Burtenshaw, Hugo Larcher, Haojun Zhao, Cyril Zakka, Mathieu Morlon, Colin Raffel, Leandro von Werra, Thomas Wolf
•
Feb 4, 2025
•
207
6
LIMO: Less is More for Reasoning
Yixin Ye, Zhen Huang, Yang Xiao, Ethan Chern, Shijie Xia, Pengfei Liu
•
Feb 5, 2025
•
59
4
Demystifying Long Chain-of-Thought Reasoning in LLMs
Edward Yeo, Yuxuan Tong, Morry Niu, Graham Neubig, Xiang Yue
•
Feb 5, 2025
•
58
3
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
Yuzhe Yang, Yifei Zhang, Minghao Wu, Kaidi Zhang, Yunmiao Zhang, Honghai Yu, Yan Hu, Benyou Wang
•
Feb 3, 2025
•
33
3
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Jinyang Wu, Mingkuan Feng, Shuai Zhang, Ruihan Jin, Feihu Che, Zengqi Wen, Jianhua Tao
•
Feb 4, 2025
•
22
4
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
Yiren Song, Danze Chen, Mike Zheng Shou
•
Feb 3, 2025
•
20
4
On Teacher Hacking in Language Model Distillation
Daniil Tiapkin, Daniele Calandriello, Johan Ferret, Sarah Perrin, Nino Vieillard, Alexandre Ramé, Mathieu Blondel
•
Feb 4, 2025
•
18
2
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
DiJia Su, Hanlin Zhu, Yingchen Xu, Jiantao Jiao, Yuandong Tian, Qinqing Zheng
•
Feb 5, 2025
•
15
2
Large Language Model Guided Self-Debugging Code Generation
Muntasir Adnan, Zhiwei Xu, Carlos C. N. Kuhn
•
Feb 5, 2025
•
13
2
A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
Isha Puri, Shivchander Sudalairaj, Guangxuan Xu, Kai Xu, Akash Srivastava
•
Feb 3, 2025
•
10
3
Jailbreaking with Universal Multi-Prompts
Yu-Ling Hsu, Hsuan Su, Shang-Tse Chen
•
Feb 3, 2025
•
9
2
Activation-Informed Merging of Large Language Models
Amin Heyrani Nobari, Kaveh Alimohammadi, Ali ArjomandBigdeli, Akash Srivastava, Faez Ahmed, Navid Azizan
•
Feb 4, 2025
•
5
2
Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation
Ali Naseh, Yuefeng Peng, Anshuman Suri, Harsh Chaudhari, Alina Oprea, Amir Houmansadr
•
Feb 1, 2025
•
5
2
HackerRank-ASTRA: Evaluating Correctness & Consistency of Large Language Models on cross-domain multi-file project problems
Jun Xing, Mayur Bhatia, Sahil Phulwani, Darshan Suresh, Rafik Matta
•
Jan 31, 2025
•
0
2