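PI at Shanghai Qi Zhi Institute; Assistant Professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University.
He received his Ph.D. from the University of California, Berkeley, advised by Prof. Trevor Darrell. After his Ph.D., he completed postdoctoral research at UC Berkeley, working with Pieter Abbeel and others. His research focuses on reinforcement learning and robotics. Dr. Yang Gao currently leads the Embodied Vision and Robotics Lab (EVAR Lab), which focuses on empowering robots with artificial intelligence and aims to build a general framework for embodied intelligence.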
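Honors
Beijing Young Elite Scientists Sponsorship Program (北京市青年托举计划)
Robotics: algorithms for general-purpose robots
Reinforcement learning: sample-efficient reinforcement learning for the real world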
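Achievement 2: A universal "semantic-geometric representation" for robotic manipulation
Robots rely heavily on sensors, especially RGB and depth cameras, to perceive and interact with the world. RGB cameras capture 2D images rich in semantic information but lack precise spatial information; depth cameras provide crucial 3D geometric data but capture limited semantic information. Integrating the two modalities is therefore essential for the representations used in robot perception and control. However, current research mostly focuses on one modality and overlooks the benefits of combining them.
To address this, Yang Gao's team proposed the Semantic-Geometric Representation (SGR), a general-purpose perception module for robots that leverages the rich semantic information in large-scale pretrained 2D models while also drawing on the advantages of 3D spatial reasoning. Experiments show that SGR enables robots to perform well on a variety of simulated and real-world tasks, outperforming previous state-of-the-art methods in both single-task and multi-task settings. Notably, SGR generalizes well to novel semantic attributes, which sets it apart from other approaches.
Research area: visual representations for robots
Project website: https://semantic-geometric-representation.github.io/
Paper: Zhang, Tong, Yingdong Hu, Hanchen Cui, Hang Zhao, and Yang Gao. ‘A Universal Semantic-Geometric Representation for Robotic Manipulation’. In Conference on Robot Learning (CoRL), 2023.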
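As a rough illustration of what combining the two modalities can look like, the sketch below lifts a depth image to 3D points with a pinhole camera model and attaches per-pixel color and semantic features to each point. It is not the SGR architecture from the paper: the function names, the camera intrinsics, and the random array standing in for features from a pretrained 2D model are all assumptions made for this example.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift a depth image (H, W) to camera-frame 3D points (H*W, 3)
    using a pinhole model. Intrinsics are hypothetical inputs."""
    h, w = depth.shape
    # u indexes columns, v indexes rows; both grids have shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def fuse_semantic_geometric(rgb, depth, semantic, intrinsics):
    """Concatenate [xyz | rgb | semantic] per pixel and drop pixels
    without a depth reading. Illustrative only, not the paper's method."""
    fx, fy, cx, cy = intrinsics
    points = backproject_depth(depth, fx, fy, cx, cy)
    colors = rgb.reshape(-1, rgb.shape[-1])
    feats = semantic.reshape(-1, semantic.shape[-1])
    valid = points[:, 2] > 0  # depth of 0 is treated as "no reading"
    return np.concatenate([points, colors, feats], axis=1)[valid]

# Toy inputs: a 4x6 image, unit depth with one missing pixel, and random
# 8-channel features standing in for the output of a pretrained 2D model.
rng = np.random.default_rng(0)
rgb = rng.random((4, 6, 3))
depth = np.ones((4, 6))
depth[0, 0] = 0.0
semantic = rng.random((4, 6, 8))
fused = fuse_semantic_geometric(rgb, depth, semantic, (50.0, 50.0, 3.0, 2.0))
print(fused.shape)  # (23, 14): 23 valid points, 3 + 3 + 8 channels each
```

Each row of `fused` is a 3D point carrying both its geometry and its appearance and semantic features, which is the kind of input a point-based network could consume.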
------------------------------------------------------------------------------------------------------------------------------
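Achievement 1: High-efficiency reinforcement learning
Reinforcement learning is regarded as one of the important routes toward artificial general intelligence, but it typically requires massive amounts of data, which makes it hard to apply in the physical world. Yang Gao's team proposed EfficientZero, a new reinforcement learning algorithm that achieves high performance with only a small amount of data. The team systematically addressed three problems of Monte Carlo reinforcement learning algorithms: weak training signals, accumulated model error, and the inability to use offline data effectively. To do so, they introduced temporal self-supervised learning, value-prefix fitting, and adaptive value-function correction. Using only 2 hours of real-world data, EfficientZero reached 109% of human performance on the Atari benchmark. This is the first time a reinforcement learning algorithm has exceeded human performance with limited data; its sample efficiency surpassed that of humans for the first time and is nearly 600 times the data efficiency of DQN, the classic reinforcement learning algorithm proposed by Google. EfficientZero addresses a major fundamental scientific problem in reinforcement learning, fills a gap in the field of high-efficiency reinforcement learning internationally, and paves the way for deploying reinforcement learning in the physical world.
Building on EfficientZero, and to make reinforcement learning easier to use in the physical world, Yang Gao's team developed EfficientImitate, which learns efficiently without a reward function, and Virtual MCTS, which substantially improves the computational efficiency of EfficientZero.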
36. Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation, Tong Zhang, Yingdong Hu, Jiacheng You, Yang Gao†, https://sgrv2-robot.github.io/, CoRL 2024.
35. General Flow as Foundation Affordance for Scalable Robot Learning, Chengbo Yuan, Chuan Wen, Tong Zhang, Yang Gao†, https://general-flow.github.io/, CoRL 2024.
34. DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands, Fengbo Lan*, Shengjie Wang*, Yunzhe Zhang, Haotian Xu, Oluwatosin Oseni, Yang Gao†, Tao Zhang†, https://dexcatch.github.io/, CoRL 2024.
33. Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own, Weirui Ye, Yunsheng Zhang, Haoyang Weng, Xianfan Gu, Shengjie Wang, Tong Zhang, Mengchen Wang, Pieter Abbeel, Yang Gao†, https://arxiv.org/pdf/2310.02635, CoRL 2024 (Oral).
32. MQE: Unleashing the Power of Interaction with Multi-agent Quadruped Environment, Ziyan Xiong, Bo Chen, Shiyu Huang, Wei-wei Tu, Zhaofeng He, Yang Gao†, https://ziyanx02.github.io/multiagent-quadruped-environment/, IROS 2024.
31. CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models, Haoxu Huang, Fanqi Lin, Yingdong Hu, Shengjie Wang, Yang Gao†, https://copa-2024.github.io/, IROS 2024.
30. Any-point Trajectory Modeling for Policy Learning, Chuan Wen, Xingyu Lin, John So, Kai Chen, Qi Dou, Yang Gao, RSS 2024.
29. EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data, Shengjie Wang*, Shaohuai Liu*, Weirui Ye*, Jiacheng You, and Yang Gao, https://arxiv.org/abs/2403.00564, ICML 2024.
28. Yingdong Hu*, Fanqi Lin*, Tong Zhang, Li Yi, Yang Gao, Look Before You Leap: Unveiling the Power of GPT-4V in Robotic Vision-Language Planning, ICRA 2024
27. Haoxu Huang*, Fanqi Lin*, Yingdong Hu, Shengjie Wang, Yang Gao, CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models, ICRA 2024
26. Yuyang Liu*, Weijun Dong*, Yingdong Hu, Chuan Wen, Zhao-Heng Yin, Chongjie Zhang, Yang Gao, Imitation Learning from Observation with Automatic Discount Scheduling, ICLR 2024
25. Chuan Wen, Dinesh Jayaraman, Yang Gao, Can Transformers Capture Spatial Relations between Objects?, ICLR 2024
24. Xianfan Gu, Chuan Wen, Weirui Ye, Jiaming Song, Yang Gao, Seer: Language Instructed Video Prediction with Latent Diffusion Models, ICLR 2024
23. Kaifeng Zhang, Rui Zhao, Ziming Zhang, Yang Gao, Auto-Encoding Adversarial Imitation Learning, International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023
22. Tong Zhang, Yingdong Hu, Hanchen Cui, Hang Zhao, Yang Gao, A Universal Semantic-Geometric Representation for Robotic Manipulation, Conference on Robot Learning (CoRL), 2023
21. Shengjie Wang, Fengbo Lan, Xiang Zheng, Yuxue Cao, Oluwatosin Oseni, Haotian Xu, Tao Zhang, Yang Gao, A Policy Optimization Method Towards Optimal-time Stability, Conference on Robot Learning (CoRL), 2023
20. Jialei Huang, Zhaoheng Yin, Yingdong Hu, Yang Gao, Policy Contrastive Imitation Learning, International Conference on Machine Learning (ICML), 2023
19. Yingdong Hu, Renhao Wang, Li Erran Li, Yang Gao, For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal, International Conference on Machine Learning (ICML), 2023
18. Kaizhe Hu, Ray Chen Zheng, Yang Gao, Huazhe Xu, Decision Transformer under Random Frame Dropping, International Conference on Learning Representations (ICLR), 2023
17. Zhengrong Xue, Zhecheng Yuan, Jiashun Wang, Xueqian Wang, Yang Gao, Huazhe Xu, USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable Manipulation, International Conference on Robotics and Automation (ICRA), 2023
16. Yixuan Mei, Jiaxuan Gao, Weirui Ye, Shaohuai Liu, Yang Gao, Yi Wu, SpeedyZero: Mastering Atari with Limited Data and Time, International Conference on Learning Representations (ICLR), 2023
15. Jiaye Teng, Chuan Wen, Dinghuai Zhang, Yoshua Bengio, Yang Gao, Yang Yuan, Predictive Inference with Feature Conformal Prediction, International Conference on Learning Representations (ICLR), 2023
14. Weirui Ye, Yunsheng Zhang, Pieter Abbeel, Yang Gao, Become a Proficient Player with Limited Data through Watching Pure Videos, International Conference on Learning Representations (ICLR), 2023
13. Renhao Wang, Jiayuan Mao, Joy Hsu, Hang Zhao, Jiajun Wu, Yang Gao, Programmatically Grounded, Compositionally Generalizable Robotic Manipulation, International Conference on Learning Representations (ICLR), 2023
12. Chuan Wen, Jianing Qian, Jierui Lin, Jiaye Teng, Dinesh Jayaraman, Yang Gao, Fighting Fire with Fire: Avoiding DNN Shortcuts through Priming, International Conference on Machine Learning (ICML), 2022
11. Zhecheng Yuan, Zhengrong Xue, Bo Yuan, Xueqian Wang, Yi Wu, Yang Gao, Huazhe Xu, Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning, Conference on Neural Information Processing Systems (NeurIPS), 2022
10. Jinkun Cao, Ruiqian Nai, Qing Yang, Jialei Huang, Yang Gao, An Empirical Study on Disentanglement of Negative-free Contrastive Learning, Neural Information Processing Systems (NeurIPS), 2022
9. Zhao-Heng Yin, Weirui Ye, Qifeng Chen, Yang Gao, Planning for Sample Efficient Imitation Learning, Neural Information Processing Systems (NeurIPS), 2022
8. Weirui Ye, Pieter Abbeel, Yang Gao, Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions, Neural Information Processing Systems (NeurIPS), 2022
7. Renhao Wang, Hang Zhao, Yang Gao, CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation, European Conference on Computer Vision (ECCV), 2022
6. Yingdong Hu, Renhao Wang, Kaifeng Zhang, Yang Gao, Semantic-Aware Fine-Grained Correspondence, European Conference on Computer Vision (ECCV), 2022
5. Chia-Chi Chuang, Donglin Yang, Chuan Wen, Yang Gao, Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction, European Conference on Computer Vision (ECCV), 2022
4. Chenyu Yang, Wanrong He, Yingqing Xu, Yang Gao, EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer, European Conference on Computer Vision (ECCV), 2022
3. Chuan Wen*, Jierui Lin*, Jianing Qian, Yang Gao, Dinesh Jayaraman, Keyframe-Focused Visual Imitation Learning, International Conference on Machine Learning (ICML), 2021
2. Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao, Mastering Atari Games with Limited Data, Advances in Neural Information Processing Systems (NeurIPS), 2021
1. Chuan Wen*, Jierui Lin*, Trevor Darrell, Dinesh Jayaraman, Yang Gao, Fighting Copycat Agents in Behavioral Cloning from Observation Histories, Neural Information Processing Systems (NeurIPS), 2020