ChatPaper.ai
Daily AI Research Paper Picks
Daily curated AI research papers with translations
January 3rd, 2025
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang, Hang Zhang, Xin Li, Jiashuo Sun, Yongliang Shen, Weiming Lu, Deli Zhao, Yueting Zhuang, Lidong Bing
•
Jan 1, 2025
•
95
7
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control
Yuanpeng Tu, Hao Luo, Xi Chen, Sihui Ji, Xiang Bai, Hengshuang Zhao
•
Jan 2, 2025
•
49
3
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
Shanghaoran Quan, Jiaxi Yang, Bowen Yu, Bo Zheng, Dayiheng Liu, An Yang, Xuancheng Ren, Bofei Gao, Yibo Miao, Yunlong Feng, Zekun Wang, Jian Yang, Zeyu Cui, Yang Fan, Yichang Zhang, Binyuan Hui, Junyang Lin
•
Jan 2, 2025
•
47
6
LTX-Video: Realtime Video Latent Diffusion
Yoav HaCohen, Nisan Chiprut, Benny Brazowski, Daniel Shalem, Dudu Moshe, Eitan Richardson, Eran Levin, Guy Shiran, Nir Zabari, Ori Gordon, Poriya Panet, Sapir Weissbuch, Victor Kulikov, Yaki Bitterman, Zeev Melumian, Ofir Bibi
•
Dec 30, 2024
•
41
3
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Yuqian Yuan, Hang Zhang, Wentong Li, Zesen Cheng, Boqiang Zhang, Long Li, Xin Li, Deli Zhao, Wenqiao Zhang, Yueting Zhuang, Jianke Zhu, Lidong Bing
•
Dec 31, 2024
•
41
2
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Jingfeng Yao, Xinggang Wang
•
Jan 2, 2025
•
36
2
ProgCo: Program Helps Self-Correction of Large Language Models
Xiaoshuai Song, Yanan Wu, Weixun Wang, Jiaheng Liu, Wenbo Su, Bo Zheng
•
Jan 2, 2025
•
25
2
MLLM-as-a-Judge for Image Safety without Human Labeling
Zhenting Wang, Shuming Hu, Shiyu Zhao, Xiaowen Lin, Felix Juefei-Xu, Zhuowei Li, Ligong Han, Harihar Subramanyam, Li Chen, Jianfa Chen, Nan Jiang, Lingjuan Lyu, Shiqing Ma, Dimitris N. Metaxas, Ankit Jain
•
Dec 31, 2024
•
24
2
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models
Mahir Labib Dihan, Md Tanvir Hassan, Md Tanvir Parvez, Md Hasebul Hasan, Md Almash Alam, Muhammad Aamir Cheema, Mohammed Eunus Ali, Md Rizwan Parvez
•
Dec 31, 2024
•
22
2
A3: Android Agent Arena for Mobile GUI Agents
Yuxiang Chai, Hanhao Li, Jiayu Zhang, Liang Liu, Guozhi Wang, Shuai Ren, Siyuan Huang, Hongsheng Li
•
Jan 2, 2025
•
22
3
Unifying Specialized Visual Encoders for Video Language Models
Jihoon Chung, Tyler Zhu, Max Gonzalez Saez-Diez, Juan Carlos Niebles, Honglu Zhou, Olga Russakovsky
•
Jan 2, 2025
•
21
2
Dynamic Scaling of Unit Tests for Code Reward Modeling
Zeyao Ma, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang
•
Jan 2, 2025
•
17
2
Nested Attention: Semantic-aware Attention Values for Concept Personalization
Or Patashnik, Rinon Gal, Daniil Ostashev, Sergey Tulyakov, Kfir Aberman, Daniel Cohen-Or
•
Jan 2, 2025
•
11
2
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration
Jianyi Wang, Zhijie Lin, Meng Wei, Yang Zhao, Ceyuan Yang, Chen Change Loy, Lu Jiang
•
Jan 2, 2025
•
11
2
MapQaTor: A System for Efficient Annotation of Map Query Datasets
Mahir Labib Dihan, Mohammed Eunus Ali, Md Rizwan Parvez
•
Dec 30, 2024
•
9
2
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang, Jiajun Zhu, Pragya Srivastava, Zhangyang Wang, Pan Li
•
Dec 31, 2024
•
7
2
Population Aware Diffusion for Time Series Generation
Yang Li, Han Meng, Zhenyu Bi, Ingolv T. Urnes, Haipeng Chen
•
Jan 1, 2025
•
6
2
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
Jiajun Zhu, Peihao Wang, Ruisi Cai, Jason D. Lee, Pan Li, Zhangyang Wang
•
Jan 1, 2025
•
6
4
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization
Yongle Huang, Haodong Chen, Zhenbang Xu, Zihan Jia, Haozhou Sun, Dian Shao
•
Jan 2, 2025
•
5
2