ChatPaper.ai
Daily AI Research Paper Picks
Daily curated AI research papers with translations
November 21st, 2024
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
Hongrui Jia, Chaoya Jiang, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang
Nov 17, 2024 • 11 • 3
SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang, Haofeng Huang, Pengle Zhang, Jia Wei, Jun Zhu, Jianfei Chen
Nov 17, 2024 • 52 • 9
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
Nov 20, 2024 • 30 • 3
VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation
Ziyang Luo, Haoning Wu, Dongxu Li, Jing Ma, Mohan Kankanhalli, Junnan Li
Nov 20, 2024 • 18 • 5
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory
Cheng-Yen Yang, Hsiang-Wei Huang, Wenhao Chai, Zhongyu Jiang, Jenq-Neng Hwang
Nov 18, 2024 • 18 • 3
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Haonan Wang, Qian Liu, Chao Du, Tongyao Zhu, Cunxiao Du, Kenji Kawaguchi, Tianyu Pang
Nov 20, 2024 • 15 • 2
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
Yu Gu, Boyuan Zheng, Boyu Gou, Kai Zhang, Cheng Chang, Sanjari Srivastava, Yanan Xie, Peng Qi, Huan Sun, Yu Su
Nov 10, 2024 • 13 • 2
Stylecodes: Encoding Stylistic Information For Image Generation
Ciara Rowles
Nov 19, 2024 • 11 • 2
ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
Vipula Rawte, Sarthak Jain, Aarush Sinha, Garv Kaushik, Aman Bansal, Prathiksha Rumale Vishwanath, Samyak Rajesh Jain, Aishwarya Naresh Reganti, Vinija Jain, Aman Chadha, Amit P. Sheth, Amitava Das
Nov 16, 2024 • 7 • 4
Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener, Nikhil Anand, Nikhil Vyas, Eran Malach, Sham Kakade
Nov 19, 2024 • 5 • 2
Generating Compositional Scenes via Text-to-image RGBA Instance Generation
Alessandro Fontanella, Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang, Sarah Parisot
Nov 16, 2024 • 3 • 2
ORID: Organ-Regional Information Driven Framework for Radiology Report Generation
Tiancheng Gu, Kaicheng Yang, Xiang An, Ziyong Feng, Dongnan Liu, Weidong Cai
Nov 20, 2024 • 2 • 2