Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning
October 29, 2024
Authors: Jianlan Luo, Charles Xu, Jeffrey Wu, Sergey Levine
cs.AI
Abstract
Reinforcement learning (RL) holds great promise for enabling autonomous
acquisition of complex robotic manipulation skills, but realizing this
potential in real-world settings has been challenging. We present a
human-in-the-loop vision-based RL system that demonstrates impressive
performance on a diverse set of dexterous manipulation tasks, including dynamic
manipulation, precision assembly, and dual-arm coordination. Our approach
integrates demonstrations and human corrections, efficient RL algorithms, and
other system-level design choices to learn policies that achieve near-perfect
success rates and fast cycle times within just 1 to 2.5 hours of training. We
show that our method significantly outperforms imitation learning baselines and
prior RL approaches, with an average 2x improvement in success rate and 1.8x
faster execution. Through extensive experiments and analysis, we provide
insights into the effectiveness of our approach, demonstrating how it learns
robust, adaptive policies for both reactive and predictive control strategies.
Our results suggest that RL can indeed learn a wide range of complex
vision-based manipulation policies directly in the real world within practical
training times. We hope this work will inspire a new generation of learned
robotic manipulation techniques, benefiting both industrial applications and
research advancements. Videos and code are available at our project website
https://hil-serl.github.io/.