ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
October 24th, 2024
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Ziyu Liu, Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Haodong Duan, Conghui He, Yuanjun Xiong, Dahua Lin, Jiaqi Wang
Oct 23, 2024 • 37 • 3
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding
Xiaoqian Shen, Yunyang Xiong, Changsheng Zhao, Lemeng Wu, Jun Chen, Chenchen Zhu, Zechun Liu, Fanyi Xiao, Balakrishnan Varadarajan, Florian Bordes, Zhuang Liu, Hu Xu, Hyunwoo J. Kim, Bilge Soran, Raghuraman Krishnamoorthi, Mohamed Elhoseiny, Vikas Chandra
Oct 22, 2024 • 29 • 2
WorldSimBench: Towards Video Generation Models as World Simulators
Yiran Qin, Zhelun Shi, Jiwen Yu, Xijun Wang, Enshen Zhou, Lijun Li, Zhenfei Yin, Xihui Liu, Lu Sheng, Jing Shao, Lei Bai, Wanli Ouyang, Ruimao Zhang
Oct 23, 2024 • 20 • 2
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong, Shivam Agarwal, Yizhe Zhang, Jiacheng Ye, Lin Zheng, Mukai Li, Chenxin An, Peilin Zhao, Wei Bi, Jiawei Han, Hao Peng, Lingpeng Kong
Oct 23, 2024 • 16 • 2
Scalable Ranked Preference Optimization for Text-to-Image Generation
Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata, Sergey Tulyakov, Jian Ren, Anil Kag
Oct 23, 2024 • 15 • 2
DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes
Hengwei Bian, Lingdong Kong, Haozhe Xie, Liang Pan, Yu Qiao, Ziwei Liu
Oct 23, 2024 • 14 • 2
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja, Lester James V. Miranda, Shayekh Bin Islam, Rishabh Maheshwary, Drishti Sharma, Gusti Winata, Nathan Lambert, Sebastian Ruder, Sara Hooker, Marzieh Fadaee
Oct 20, 2024 • 12 • 3
Lightweight Neural App Control
Filippos Christianos, Georgios Papoudakis, Thomas Coste, Jianye Hao, Jun Wang, Kun Shao
Oct 23, 2024 • 10 • 2
TP-Eval: Tap Multimodal LLMs' Potential in Evaluation by Customizing Prompts
Yuxuan Xie, Tianhua Li, Wenqi Shao, Kaipeng Zhang
Oct 23, 2024 • 7 • 1
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum
Oct 17, 2024 • 7 • 2
MedINST: Meta Dataset of Biomedical Instructions
Wenhan Han, Meng Fang, Zihan Zhang, Yu Yin, Zirui Song, Ling Chen, Mykola Pechenizkiy, Qingyu Chen
Oct 17, 2024 • 7 • 2
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
Haian Jin, Hanwen Jiang, Hao Tan, Kai Zhang, Sai Bi, Tianyuan Zhang, Fujun Luan, Noah Snavely, Zexiang Xu
Oct 22, 2024 • 5 • 2
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto, Oier Mees, Aviral Kumar, Sergey Levine
Oct 17, 2024 • 2 • 1