ChatPaper.ai
AI Research Papers Daily
Daily curated AI research papers with translations
October 18th, 2024
Movie Gen: A Cast of Media Foundation Models
Adam Polyak, Amit Zohar, Andrew Brown, Andros Tjandra, Animesh Sinha, Ann Lee, Apoorv Vyas, Bowen Shi, Chih-Yao Ma, Ching-Yao Chuang, David Yan, Dhruv Choudhary, Dingkang Wang, Geet Sethi, Guan Pang, Haoyu Ma, Ishan Misra, Ji Hou, Jialiang Wang, Kiran Jagadeesh, Kunpeng Li, Luxin Zhang, Mannat Singh, Mary Williamson, Matt Le, Matthew Yu, Mitesh Kumar Singh, Peizhao Zhang, Peter Vajda, Quentin Duval, Rohit Girdhar, Roshan Sumbaly, Sai Saketh Rambhatla, Sam Tsai, Samaneh Azadi, Samyak Datta, Sanyuan Chen, Sean Bell, Sharadh Ramaswamy, Shelly Sheynin, Siddharth Bhattacharya, Simran Motwani, Tao Xu, Tianhe Li, Tingbo Hou, Wei-Ning Hsu, Xi Yin, Xiaoliang Dai, Yaniv Taigman, Yaqiao Luo, Yen-Cheng Liu, Yi-Chiao Wu, Yue Zhao, Yuval Kirstain, Zecheng He, Zijian He, Albert Pumarola, Ali Thabet, Artsiom Sanakoyeu, Arun Mallya, Baishan Guo, Boris Araya, Breena Kerr, Carleigh Wood, Ce Liu, Cen Peng, Dimitry Vengertsev, Edgar Schonfeld, Elliot Blanchard, Felix Juefei-Xu, Fraylie Nord, Jeff Liang, John Hoffman, Jonas Kohler, Kaolin Fire, Karthik Sivakumar, Lawrence Chen, Licheng Yu, Luya Gao, Markos Georgopoulos, Rashel Moritz, Sara K. Sampson, Shikai Li, Simone Parmeggiani, Steve Fine, Tara Fowler, Vladan Petrovic, Yuming Du
•
Oct 17, 2024
•
88
2
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures
Jinjie Ni, Yifan Song, Deepanway Ghosal, Bo Li, David Junhao Zhang, Xiang Yue, Fuzhao Xue, Zian Zheng, Kaichen Zhang, Mahir Shah, Kabir Jain, Yang You, Michael Shieh
•
Oct 17, 2024
•
74
2
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Sijun Tan, Siyuan Zhuang, Kyle Montgomery, William Y. Tang, Alejandro Cuadron, Chenguang Wang, Raluca Ada Popa, Ion Stoica
•
Oct 16, 2024
•
42
2
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan, Tianhong Li, Siyang Qin, Yuanzhen Li, Chen Sun, Michael Rubinstein, Deqing Sun, Kaiming He, Yonglong Tian
•
Oct 17, 2024
•
35
3
Roadmap towards Superhuman Speech Understanding using Large Language Models
Fan Bu, Yuhao Zhang, Xidong Wang, Benyou Wang, Qun Liu, Haizhou Li
•
Oct 17, 2024
•
33
2
MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Zichen Zhu, Hao Tang, Yansi Li, Kunyao Lan, Yixuan Jiang, Hao Zhou, Yixiao Wang, Situo Zhang, Liangtai Sun, Lu Chen, Kai Yu
•
Oct 17, 2024
•
31
3
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Genta Indra Winata, Frederikus Hudi, Patrick Amadeus Irawan, David Anugraha, Rifki Afina Putri, Yutong Wang, Adam Nohejl, Ubaidillah Ariq Prathama, Nedjma Ousidhoum, Afifa Amriani, Anar Rzayev, Anirban Das, Ashmari Pramodya, Aulia Adila, Bryan Wilie, Candy Olivia Mawalim, Ching Lam Cheng, Daud Abolade, Emmanuele Chersoni, Enrico Santus, Fariz Ikhwantri, Garry Kuwanto, Hanyang Zhao, Haryo Akbarianto Wibowo, Holy Lovenia, Jan Christian Blaise Cruz, Jan Wira Gotama Putra, Junho Myung, Lucky Susanto, Maria Angelica Riera Machin, Marina Zhukova, Michael Anugraha, Muhammad Farid Adilazuarda, Natasha Santosa, Peerat Limkonchotiwat, Raj Dabre, Rio Alexander Audino, Samuel Cahyawijaya, Shi-Xiong Zhang, Stephanie Yulia Salim, Yi Zhou, Yinxuan Gui, David Ifeoluwa Adelani, En-Shiun Annie Lee, Shogo Okada, Ayu Purwarianti, Alham Fikri Aji, Taro Watanabe, Derry Tanti Wijaya, Alice Oh, Chong-Wah Ngo
•
Oct 16, 2024
•
29
3
Harnessing Webpage UIs for Text-Rich Visual Understanding
Junpeng Liu, Tianyue Ou, Yifan Song, Yuxiao Qu, Wai Lam, Chenyan Xiong, Wenhu Chen, Graham Neubig, Xiang Yue
•
Oct 17, 2024
•
29
2
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu, Xiaokang Chen, Zhiyu Wu, Yiyang Ma, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, Chong Ruan, Ping Luo
•
Oct 17, 2024
•
27
4
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control
Yujie Wei, Shiwei Zhang, Hangjie Yuan, Xiang Wang, Haonan Qiu, Rui Zhao, Yutong Feng, Feng Liu, Zhizhong Huang, Jiaxin Ye, Yingya Zhang, Hongming Shan
•
Oct 17, 2024
•
23
2
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin, Bo Zhu, Li Yuan, Shuicheng Yan
•
Oct 15, 2024
•
20
2
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
Peng Xia, Kangyu Zhu, Haoran Li, Tianze Wang, Weijia Shi, Sheng Wang, Linjun Zhang, James Zou, Huaxiu Yao
•
Oct 16, 2024
•
20
3
BenTo: Benchmark Task Reduction with In-Context Transferability
Hongyu Zhao, Ming Li, Lichao Sun, Tianyi Zhou
•
Oct 17, 2024
•
19
3
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment
Zekun Moore Wang, Shawn Wang, Kang Zhu, Jiaheng Liu, Ke Xu, Jie Fu, Wangchunshu Zhou, Wenhao Huang
•
Oct 17, 2024
•
18
2
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Siwei Wu, Zhongyuan Peng, Xinrun Du, Tuney Zheng, Minghao Liu, Jialong Wu, Jiachen Ma, Yizhi Li, Jian Yang, Wangchunshu Zhou, Qunshu Lin, Junbo Zhao, Zhaoxiang Zhang, Wenhao Huang, Ge Zhang, Chenghua Lin, J. H. Liu
•
Oct 17, 2024
•
16
2
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models
Qiaoyu Tang, Le Yu, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun
•
Oct 17, 2024
•
14
2
Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems
Isack Lee, Haebin Seong
•
Oct 17, 2024
•
12
2
VidPanos: Generative Panoramic Videos from Casual Panning Videos
Jingwei Ma, Erika Lu, Roni Paiss, Shiran Zada, Aleksander Holynski, Tali Dekel, Brian Curless, Michael Rubinstein, Forrester Cole
•
Oct 17, 2024
•
12
2
FlatQuant: Flatness Matters for LLM Quantization
Yuxuan Sun, Ruikang Liu, Haoli Bai, Han Bao, Kang Zhao, Yuening Li, Jiaxin Hu, Xianzhi Yu, Lu Hou, Chun Yuan, Xin Jiang, Wulong Liu, Jun Yao
•
Oct 12, 2024
•
12
2
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation
Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li
•
Oct 17, 2024
•
9
2
Can MLLMs Understand the Deep Implication Behind Chinese Images?
Chenhao Zhang, Xi Feng, Yuelin Bai, Xinrun Du, Jinchang Hou, Kaixin Deng, Guangzeng Han, Qinrui Li, Bingli Wang, Jiaheng Liu, Xingwei Qu, Yifei Zhang, Qixuan Zhao, Yiming Liang, Ziqiang Liu, Feiteng Fang, Min Yang, Wenhao Huang, Chenghua Lin, Ge Zhang, Shiwen Ni
•
Oct 17, 2024
•
8
2
Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant
Haoran Hao, Jiaming Han, Changsheng Li, Yu-Feng Li, Xiangyu Yue
•
Oct 17, 2024
•
8
2
MedMobile: A mobile-sized language model with expert-level clinical capabilities
Krithik Vishwanath, Jaden Stryker, Anton Alaykin, Daniel Alexander Alber, Eric Karl Oermann
•
Oct 11, 2024
•
8
2
Retrospective Learning from Interactions
Zizhao Chen, Mustafa Omer Gul, Yiwei Chen, Gloria Geng, Anne Wu, Yoav Artzi
•
Oct 17, 2024
•
8
2
γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
Yaxin Luo, Gen Luo, Jiayi Ji, Yiyi Zhou, Xiaoshuai Sun, Zhiqiang Shen, Rongrong Ji
•
Oct 17, 2024
•
7
2
MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization
Ruiqi Li, Siqi Zheng, Xize Cheng, Ziang Zhang, Shengpeng Ji, Zhou Zhao
•
Oct 16, 2024
•
7
2
Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models
Luis Barroso-Luque, Muhammed Shuaibi, Xiang Fu, Brandon M. Wood, Misko Dzamba, Meng Gao, Ammar Rizvi, C. Lawrence Zitnick, Zachary W. Ulissi
•
Oct 16, 2024
•
6
1
LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning
Yiming Shi, Jiwei Wei, Yujia Wu, Ran Ran, Chengwei Sun, Shiyuan He, Yang Yang
•
Oct 17, 2024
•
6
2
Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats
Chen Ziwen, Hao Tan, Kai Zhang, Sai Bi, Fujun Luan, Yicong Hong, Li Fuxin, Zexiang Xu
•
Oct 16, 2024
•
5
2
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
Huayu Chen, Hang Su, Peize Sun, Jun Zhu
•
Oct 12, 2024
•
4
2
AERO: 효율적인 개인 정보 추론을 위한 Softmax 전용 LLMs
AERO: Softmax-Only LLMs for Efficient Private Inference
Nandan Kumar Jha, Brandon Reagen
•
Oct 16, 2024
•
4
2
Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key
Yingda Chen, Xingjun Wang, Jintao Huang, Yunlin Mao, Daoze Zhang, Yuze Zhao
•
Oct 14, 2024
•
3
2
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration
Yiwei Guo, Shaobin Zhuang, Kunchang Li, Yu Qiao, Yali Wang
•
Oct 16, 2024
•
3
2
SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation
Prakhar Dixit, Tim Oates
•
Oct 17, 2024
•
2
2