ChatPaper.ai

Daily Papers

Position: Interactive Generative Video as Next-Generation Game Engine

Jiwen Yu, Yiran Qin, Haoxuan Che, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu
Mar 21, 2025

Video-T1: Test-Time Scaling for Video Generation

Fangfu Liu, Hanyang Wang, Yimo Cai, Kaiyan Zhang, Xiaohang Zhan, Yueqi Duan
Mar 24, 2025

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He
Mar 24, 2025

Training-free Diffusion Acceleration with Bottleneck Sampling

Ye Tian, Xin Xia, Yuxi Ren, Shanchuan Lin, Xing Wang, Xuefeng Xiao, Yunhai Tong, Ling Yang, Bin Cui
Mar 24, 2025

Judge Anything: MLLM as a Judge Across Any Modality

Shu Pu, Yaochen Wang, Dongping Chen, Yuhang Chen, Guohao Wang, Qi Qin, Zhongyi Zhang, Zhiyuan Zhang, Zetong Zhou, Shuang Gong, Yi Gui, Yao Wan, Philip S. Yu
Mar 21, 2025

Aether: Geometric-Aware Unified World Modeling

Aether Team, Haoyi Zhu, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Chunhua Shen, Jiangmiao Pang, Tong He
Mar 24, 2025

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

Meng Cao, Pengfei Hu, Yingyao Wang, Jihao Gu, Haoran Tang, Haoze Zhao, Jiahua Dong, Wangbo Yu, Ge Zhang, Ian Reid, Xiaodan Liang
Mar 24, 2025

LEMMA: Learning from Errors for MatheMatical Advancement in LLMs

Zhuoshi Pan, Yu Li, Honglin Lin, Qizhi Pei, Zinan Tang, Wei Wu, Chenlin Ming, H. Vicky Zhao, Conghui He, Lijun Wu
Mar 21, 2025

Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering

DongGeon Lee, Ahjeong Park, Hyeri Lee, Hyeonseo Nam, Yunho Maeng
Mar 20, 2025

Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning

Yufei Zhan, Yousong Zhu, Shurong Zheng, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang
Mar 23, 2025

FFN Fusion: Rethinking Sequential Computation in Large Language Models

Akhiad Bercovich, Mohammad Dabbah, Omri Puny, Ido Galil, Amnon Geifman, Yonatan Geifman, Izhak Golan, Ehud Karpas, Itay Levy, Zach Moshe, Najeeb Nabwani, Tomer Ronen, Itamar Schen, Elad Segal, Ido Shahaf, Oren Tropp, Ran Zilberstein, Ran El-Yaniv
Mar 24, 2025

Reasoning to Learn from Latent Thoughts

Yangjun Ruan, Neil Band, Chris J. Maddison, Tatsunori Hashimoto
Mar 24, 2025

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

Hongyu Zhang, Yufan Deng, Shenghai Yuan, Peng Jin, Zesen Cheng, Yian Zhao, Chang Liu, Jie Chen
Mar 18, 2025

AMD-Hummingbird: Towards an Efficient Text-to-Video Model

Takashi Isobe, He Cui, Dong Zhou, Mengmeng Ge, Dong Li, Emad Barsoum
Mar 24, 2025

RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation

Zhiqiang Yuan, Ting Zhang, Ying Deng, Jiapei Zhang, Yeshuang Zhu, Zexi Jia, Jie Zhou, Jinchao Zhang
Mar 22, 2025

Variance Control via Weight Rescaling in LLM Pre-training

Louis Owen, Abhay Kumar, Nilabhra Roy Chowdhury, Fabian Güra
Mar 21, 2025

V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms

Javier J. Poveda Rodrigo, Mohamed Amine Ahmdi, Alessio Burrello, Daniele Jahier Pagliari, Luca Benini
Mar 21, 2025

Optimized Minimal 3D Gaussian Splatting

Joo Chan Lee, Jong Hwan Ko, Eunbyung Park
Mar 21, 2025

Defeating Prompt Injections by Design

Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, Florian Tramèr
Mar 24, 2025