GPS as a Control Signal for Image Generation
January 21, 2025
Authors: Chao Feng, Ziyang Chen, Aleksander Holynski, Alexei A. Efros, Andrew Owens
cs.AI
Abstract
We show that the GPS tags contained in photo metadata provide a useful
control signal for image generation. We train GPS-to-image models and use them
for tasks that require a fine-grained understanding of how images vary within a
city. In particular, we train a diffusion model to generate images conditioned
on both GPS and text. The learned model generates images that capture the
distinctive appearance of different neighborhoods, parks, and landmarks. We
also extract 3D models from 2D GPS-to-image models through score distillation
sampling, using GPS conditioning to constrain the appearance of the
reconstruction from each viewpoint. Our evaluations suggest that our
GPS-conditioned models successfully learn to generate images that vary based on
location, and that GPS conditioning improves estimated 3D structure.