Daily Papers
Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu, Yiran Qin, Haoxuan Che, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu•Mar 21, 2025•421
Video-T1: Test-Time Scaling for Video Generation
Fangfu Liu, Hanyang Wang, Yimo Cai, Kaiyan Zhang, Xiaohang Zhan, Yueqi Duan•Mar 24, 2025•351
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He•Mar 24, 2025•101
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian, Xin Xia, Yuxi Ren, Shanchuan Lin, Xing Wang, Xuefeng Xiao, Yunhai Tong, Ling Yang, Bin Cui•Mar 24, 2025•81
Judge Anything: MLLM as a Judge Across Any Modality
Shu Pu, Yaochen Wang, Dongping Chen, Yuhang Chen, Guohao Wang, Qi Qin, Zhongyi Zhang, Zhiyuan Zhang, Zetong Zhou, Shuang Gong, Yi Gui, Yao Wan, Philip S. Yu•Mar 21, 2025•81
Aether: Geometric-Aware Unified World Modeling
Aether Team, Haoyi Zhu, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Chunhua Shen, Jiangmiao Pang, Tong He•Mar 24, 2025•61
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
Meng Cao, Pengfei Hu, Yingyao Wang, Jihao Gu, Haoran Tang, Haoze Zhao, Jiahua Dong, Wangbo Yu, Ge Zhang, Ian Reid, Xiaodan Liang•Mar 24, 2025•61
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
Zhuoshi Pan, Yu Li, Honglin Lin, Qizhi Pei, Zinan Tang, Wei Wu, Chenlin Ming, H. Vicky Zhao, Conghui He, Lijun Wu•Mar 21, 2025•61
Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering
DongGeon Lee, Ahjeong Park, Hyeri Lee, Hyeonseo Nam, Yunho Maeng•Mar 20, 2025•51
Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning
Yufei Zhan, Yousong Zhu, Shurong Zheng, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang•Mar 23, 2025•31
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammad Dabbah, Omri Puny, Ido Galil, Amnon Geifman, Yonatan Geifman, Izhak Golan, Ehud Karpas, Itay Levy, Zach Moshe, Najeeb Nabwani, Tomer Ronen, Itamar Schen, Elad Segal, Ido Shahaf, Oren Tropp, Ran Zilberstein, Ran El-Yaniv•Mar 24, 2025•21
Reasoning to Learn from Latent Thoughts
Yangjun Ruan, Neil Band, Chris J. Maddison, Tatsunori Hashimoto•Mar 24, 2025•21
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang, Yufan Deng, Shenghai Yuan, Peng Jin, Zesen Cheng, Yian Zhao, Chang Liu, Jie Chen•Mar 18, 2025•21
AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning
Alan Dao, Dinh Bach Vu, Bui Quang Huy•Mar 24, 2025•11
AMD-Hummingbird: Towards an Efficient Text-to-Video Model
Takashi Isobe, He Cui, Dong Zhou, Mengmeng Ge, Dong Li, Emad Barsoum•Mar 24, 2025•11
RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation
Zhiqiang Yuan, Ting Zhang, Ying Deng, Jiapei Zhang, Yeshuang Zhu, Zexi Jia, Jie Zhou, Jinchao Zhang•Mar 22, 2025•11
Variance Control via Weight Rescaling in LLM Pre-training
Louis Owen, Abhay Kumar, Nilabhra Roy Chowdhury, Fabian Güra•Mar 21, 2025•11
V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms
Javier J. Poveda Rodrigo, Mohamed Amine Ahmdi, Alessio Burrello, Daniele Jahier Pagliari, Luca Benini•Mar 21, 2025•11
Optimized Minimal 3D Gaussian Splatting
Joo Chan Lee, Jong Hwan Ko, Eunbyung Park•Mar 21, 2025•01
Defeating Prompt Injections by Design
Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, Florian Tramèr•Mar 24, 2025•01