Daily Papers
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
Nanye Ma, Shangyuan Tong, Haolin Jia, Hexiang Hu, Yu-Chuan Su, Mingda Zhang, Xuan Yang, Yandong Li, Tommi Jaakkola, Xuhui Jia, Saining Xie•Jan 16, 2025•322
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
Zekun Xi, Wenbiao Yin, Jizhan Fang, Jialong Wu, Runnan Fang, Ningyu Zhang, Jiang Yong, Pengjun Xie, Fei Huang, Huajun Chen•Jan 16, 2025•292
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
Philippe Hansen-Estruch, David Yan, Ching-Yao Chung, Orr Zohar, Jialiang Wang, Tingbo Hou, Tao Xu, Sriram Vishwanath, Peter Vajda, Xinlei Chen•Jan 16, 2025•193
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators
Zhaocheng Liu, Quan Tu, Wen Ye, Yu Xiao, Zhishou Zhang, Hengfu Cui, Yalun Zhu, Qiang Ju, Shizheng Li, Jian Xie•Jan 16, 2025•144
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li•Jan 16, 2025•122
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces
Sumit Chaturvedi, Mengwei Ren, Yannick Hold-Geoffroy, Jingyuan Liu, Julie Dorsey, Zhixin Shu•Jan 16, 2025•112
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Karl Pertsch, Kyle Stachowicz, Brian Ichter, Danny Driess, Suraj Nair, Quan Vuong, Oier Mees, Chelsea Finn, Sergey Levine•Jan 16, 2025•112
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation
Hwan Heo, Jangyeong Kim, Seongyeong Lee, Jeong A Wi, Junyoung Choi, Sangjun Ahn•Jan 16, 2025•93
The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models
Jonathan Katzy, Razvan Mihai Popescu, Arie van Deursen, Maliheh Izadi•Jan 16, 2025•82
Do generative video models learn physical principles from watching videos?
Saman Motamed, Laura Culp, Kevin Swersky, Priyank Jaini, Robert Geirhos•Jan 14, 2025•72
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
Kaiqu Liang, Haimin Hu, Ryan Liu, Thomas L. Griffiths, Jaime Fernández Fisac•Jan 15, 2025•72
AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation
Junjie He, Yuxiang Tuo, Binghui Chen, Chongyang Zhong, Yifeng Geng, Liefeng Bo•Jan 16, 2025•62