DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control

Abstract

Imitation learning has proven to be a powerful tool for training complex visuomotor policies. However, current methods often require hundreds to thousands of expert demonstrations to handle high-dimensional visual observations. A key reason for this poor data efficiency is that visual representations are predominantly either pretrained on out-of-domain data or trained directly through a behavior cloning objective. In this work, we present DynaMo, a new in-domain, self-supervised method for learning visual representations. Given a set of expert demonstrations, we jointly learn a latent inverse dynamics model and a forward dynamics model over a sequence of image embeddings, predicting the next frame in latent space, without augmentations, contrastive sampling, or access to ground truth actions. Importantly, DynaMo does not require any out-of-domain data such as Internet datasets or cross-embodied datasets. On a suite of six simulated and real environments, we show that representations learned with DynaMo significantly improve downstream imitation learning performance over prior self-supervised learning objectives and pretrained representations. Gains from using DynaMo hold across policy classes such as Behavior Transformer, Diffusion Policy, MLP, and nearest neighbors. Finally, we ablate key components of DynaMo and measure their impact on downstream policy performance. Robot videos are best viewed at https://dynamo-ssl.github.io
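The joint inverse/forward dynamics objective described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the encoder and both dynamics models are single linear maps, the frames are short random vectors standing in for images, the loss is a plain mean squared error, and all names (`dynamics_loss`, `enc_w`, `inv_w`, `fwd_w`) are hypothetical. Only the overall structure follows the abstract: embed each frame, infer a latent action from consecutive embeddings, and predict the next embedding from the current one plus that latent action.

```python
import random

def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def dynamics_loss(frames, enc_w, inv_w, fwd_w):
    """Mean squared next-latent prediction error over one demonstration.

    frames: list of flattened observations (hypothetical stand-in for images).
    """
    # Encode every frame into the latent space.
    latents = [linear(enc_w, f) for f in frames]
    total, count = 0.0, 0
    for z_t, z_next in zip(latents, latents[1:]):
        # Inverse dynamics: infer a latent "action" from two consecutive
        # embeddings, so no ground-truth action labels are needed.
        action = linear(inv_w, z_t + z_next)  # '+' concatenates the lists
        # Forward dynamics: predict the next embedding from the current
        # embedding and the inferred latent action.
        pred = linear(fwd_w, z_t + action)
        total += sum((p - q) ** 2 for p, q in zip(pred, z_next))
        count += len(z_next)
    return total / count

# Assumed toy sizes; the latent action is deliberately lower-dimensional
# than the embedding so it acts as a bottleneck.
OBS_DIM, LATENT_DIM, ACTION_DIM, NUM_FRAMES = 8, 4, 2, 5

rng = random.Random(0)
enc_w = random_matrix(LATENT_DIM, OBS_DIM, rng)
inv_w = random_matrix(ACTION_DIM, 2 * LATENT_DIM, rng)
fwd_w = random_matrix(LATENT_DIM, LATENT_DIM + ACTION_DIM, rng)
frames = [[rng.random() for _ in range(OBS_DIM)] for _ in range(NUM_FRAMES)]

loss = dynamics_loss(frames, enc_w, inv_w, fwd_w)
print(f"next-latent prediction loss: {loss:.6f}")
```

In a real training loop this loss would be minimized with respect to all three sets of weights. In this naive form, an encoder that maps every frame to the same vector already drives the loss to zero, so a practical version would need some safeguard against such collapse; the abstract does not describe one, and the sketch makes no attempt to supply it.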
