ChatPaper.ai
Daily AI Research Paper Picks
Daily curated AI research papers with translations
September 18th, 2024
OmniGen: Unified Image Generation
Shitao Xiao, Yueze Wang, Junjie Zhou, Huaying Yuan, Xingrun Xing, Ruiran Yan, Shuting Wang, Tiejun Huang, Zheng Liu
Sep 17, 2024 • 115 • 7
NVLM: Open Frontier-Class Multimodal LLMs
Wenliang Dai, Nayeon Lee, Boxin Wang, Zhuoling Yang, Zihan Liu, Jon Barker, Tuomas Rintamaki, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping
Sep 17, 2024 • 75 • 2
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Gonzalo Martin Garcia, Karim Abou Zeid, Christian Schmidt, Daan de Geus, Alexander Hermans, Bastian Leibe
Sep 17, 2024 • 31 • 2
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Zhenwei Wang, Tengfei Wang, Zexin He, Gerhard Hancke, Ziwei Liu, Rynson W. H. Lau
Sep 17, 2024 • 28 • 2
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
Orion Weller, Benjamin Van Durme, Dawn Lawrie, Ashwin Paranjape, Yuhao Zhang, Jack Hessel
Sep 17, 2024 • 24 • 2
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer
Jiarui Hai, Yong Xu, Hao Zhang, Chenxing Li, Helin Wang, Mounya Elhilali, Dong Yu
Sep 17, 2024 • 20 • 3
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B
Jemin Lee, Sihyeong Park, Jinse Kwon, Jihun Oh, Yongin Kwon
Sep 17, 2024 • 17 • 3
OSV: One Step is Enough for High-Quality Image to Video Generation
Xiaofeng Mao, Zhengkai Jiang, Fu-Yun Wang, Wenbing Zhu, Jiangning Zhang, Hao Chen, Mingmin Chi, Yabiao Wang
Sep 17, 2024 • 14 • 2
On the limits of agency in agent-based models
Ayush Chopra, Shashank Kumar, Nurullah Giray-Kuru, Ramesh Raskar, Arnau Quera-Bofarull
Sep 14, 2024 • 14 • 2
Agile Continuous Jumping in Discontinuous Terrains
Yuxiang Yang, Guanya Shi, Changyi Lin, Xiangyun Meng, Rosario Scalise, Mateo Guaman Castro, Wenhao Yu, Tingnan Zhang, Ding Zhao, Jie Tan, Byron Boots
Sep 17, 2024 • 12 • 2
SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction
Marko Mihajlovic, Sergey Prokudin, Siyu Tang, Robert Maier, Federica Bogo, Tony Tung, Edmond Boyer
Sep 17, 2024 • 9 • 2
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Maojia Song, Shang Hong Sim, Rishabh Bhardwaj, Hai Leong Chieu, Navonil Majumder, Soujanya Poria
Sep 17, 2024 • 7 • 2
Human-like Affective Cognition in Foundation Models
Kanishk Gandhi, Zoe Lynch, Jan-Philipp Fränken, Kayla Patterson, Sharon Wambu, Tobias Gerstenberg, Desmond C. Ong, Noah D. Goodman
Sep 18, 2024 • 6 • 2
Single-Layer Learnable Activation for Implicit Neural Representation (SL^{2}A-INR)
Moein Heidari, Reza Rezaeian, Reza Azad, Dorit Merhof, Hamid Soltanian-Zadeh, Ilker Hacihaliloglu
Sep 17, 2024 • 5 • 2
PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing
Phillip Long, Zachary Novack, Taylor Berg-Kirkpatrick, Julian McAuley
Sep 17, 2024 • 5 • 2
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks
Ali Mehrabian, Parsa Mojarad Adi, Moein Heidari, Ilker Hacihaliloglu
Sep 14, 2024 • 5 • 2