ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
December 10th, 2024
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
Linke Ouyang, Yuan Qu, Hongbin Zhou, Jiawei Zhu, Rui Zhang, Qunshu Lin, Bin Wang, Zhiyuan Zhao, Man Jiang, Xiaomeng Zhao, Jin Shi, Fan Wu, Pei Chu, Minghao Liu, Zhenxiang Li, Chao Xu, Bo Zhang, Botian Shi, Zhongying Tu, Conghui He
Dec 10, 2024
ProcessBench: Identifying Process Errors in Mathematical Reasoning
Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin
Dec 9, 2024
Training Large Language Models to Reason in a Continuous Latent Space
Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason Weston, Yuandong Tian
Dec 9, 2024
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Egor Cherepanov, Nikita Kachaev, Artem Zholus, Alexey K. Kovalev, Aleksandr I. Panov
Dec 9, 2024
Maya: An Instruction Finetuned Multilingual Multimodal Model
Nahid Alam, Karthik Reddy Kanjula, Surya Guthikonda, Timothy Chung, Bala Krishna S Vegesna, Abhipsha Das, Anthony Susevski, Ryan Sze-Yin Chan, S M Iftekhar Uddin, Shayekh Bin Islam, Roshan Santhosh, Snegha A, Drishti Sharma, Chen Liu, Isha Chaturvedi, Genta Indra Winata, Ashvanth. S, Snehanshu Mukherjee, Alham Fikri Aji
Dec 10, 2024
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation
Nicolas Dufour, David Picard, Vicky Kalogeiton, Loic Landrieu
Dec 9, 2024
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Yuying Ge, Yizhuo Li, Yixiao Ge, Ying Shan
Dec 5, 2024
Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models
Xiao Xu, Tianhao Niu, Yuxi Xie, Libo Qin, Wanxiang Che, Min-Yen Kan
Dec 8, 2024
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Baorui Ma, Huachen Gao, Haoge Deng, Zhengxiong Luo, Tiejun Huang, Lulu Tang, Xinlong Wang
Dec 9, 2024
Gated Delta Networks: Improving Mamba2 with Delta Rule
Songlin Yang, Jan Kautz, Ali Hatamizadeh
Dec 9, 2024
MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance
Hidir Yesiltepe, Tuna Han Salih Meral, Connor Dunlop, Pinar Yanardag
Dec 6, 2024
Global and Dense Embeddings of Earth: Major TOM Floating in the Latent Space
Mikolaj Czerkawski, Marcin Kluczek, Jędrzej S. Bojanowski
Dec 7, 2024
MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views
Antoine Guédon, Tomoki Ichikawa, Kohei Yamashita, Ko Nishino
Dec 9, 2024
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
Zhefei Gong, Pengxiang Ding, Shangke Lyu, Siteng Huang, Mingyang Sun, Wei Zhao, Zhaoxin Fan, Donglin Wang
Dec 9, 2024
Robust Multi-bit Text Watermark with LLM-based Paraphrasers
Xiaojun Xu, Jinghan Jia, Yuanshun Yao, Yang Liu, Hang Li
Dec 4, 2024
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs
Muhammad Khalifa, Yi-Chern Tan, Arash Ahmadian, Tom Hosking, Honglak Lee, Lu Wang, Ahmet Üstün, Tom Sherborne, Matthias Gallé
Dec 5, 2024
Turbo3D: Ultra-fast Text-to-3D Generation
Hanzhe Hu, Tianwei Yin, Fujun Luan, Yiwei Hu, Hao Tan, Zexiang Xu, Sai Bi, Shubham Tulsiani, Kai Zhang
Dec 5, 2024