PI at Shanghai Qi Zhi Institute and Assistant Professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University.
He received his Ph.D. from the University of California, Berkeley, and was a postdoctoral researcher at Stanford University. His research covers embodied AI and robotics, reinforcement learning, and imitation learning. Centered on the key components of embodied AI, he has systematically studied visual deep reinforcement learning, imitation learning, and robotic manipulation, making multiple contributions toward the field's core challenges of low data efficiency and weak generalization. He has published more than fifty papers at top conferences, and representative work has been covered by MIT Technology Review, Stanford HAI, and other outlets.
Embodied AI and robotics: generalizable dexterous manipulation and control for robots
Reinforcement learning: generalizable and sample-efficient reinforcement learning algorithms
Imitation learning: efficient and generalizable imitation learning algorithms
Achievement 4: Generating Embodied AI Simulations with Large Language Models
Using large language models to generate simulation environments helps build virtual worlds and scenes for experiments, testing, and training. This work uses a large language model to produce detailed scene descriptions, including object positions, shapes, and other attributes; the generated descriptions are realistic enough to make the simulated environments plausible and credible. The language model can further be coupled with a physics engine to produce physical behavior in simulation, such as object motion, collisions, gravity effects, and other complex dynamics. It can also generate behavior policies and decisions for agents, which then learn and optimize against the environment to accomplish specific goals or tasks.
Research areas: embodied AI, robotic manipulation
Project website: https://liruiw.github.io/gensim/
Paper: Lirui Wang, Yiyang Ling*, Zhecheng Yuan*, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang. 'GenSim: Generating Robotic Simulation Tasks via Large Language Models'. In CoRL workshop (best paper), 2023. View PDF
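The pipeline above, in which a language model proposes a task and the system validates it before building the scene, can be sketched as follows. This is a minimal illustration, not GenSim's actual prompt or schema: the prompt text, JSON fields, and the canned model response are all hypothetical, and a real system would call an LLM API in place of the canned string.

```python
import json

# Hypothetical prompt -- illustrative only, not the actual GenSim prompt.
PROMPT = (
    "Propose a new tabletop manipulation task as JSON with keys "
    "'task_name', 'objects' (list of {name, position}), and 'goal'."
)

def parse_task_spec(llm_output: str) -> dict:
    """Validate the LLM's JSON task description before building the scene."""
    spec = json.loads(llm_output)
    for key in ("task_name", "objects", "goal"):
        if key not in spec:
            raise ValueError(f"missing field: {key}")
    for obj in spec["objects"]:
        x, y, z = obj["position"]  # each object needs a 3D position
    return spec

# Canned response standing in for a real LLM call.
canned = json.dumps({
    "task_name": "stack-red-on-blue",
    "objects": [
        {"name": "red_block", "position": [0.1, 0.0, 0.02]},
        {"name": "blue_block", "position": [0.2, 0.1, 0.02]},
    ],
    "goal": "place red_block on top of blue_block",
})
spec = parse_task_spec(canned)
```

The validation step matters in practice: generated specs that fail schema checks can be rejected or sent back to the model, which is one way such a loop keeps generated tasks executable in the simulator.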
------------------------------------------------------------------------------------------------------------------------------
Achievement 3: Generalizable Robotic Manipulation and Reinforcement Learning
Robotic manipulation is a central task in embodied AI, but traditional robotic systems often lack generalization and are difficult to deploy in the open world. Huazhe Xu's team proposed USEEK, a 3D-vision-based algorithm for generalizable robotic manipulation, and PIE-G, a generalizable reinforcement learning algorithm built on pre-trained models; together they achieve strong generalization across diverse environments and on physical robots. The team focused on challenges characteristic of robotics, such as rotational equivariance and visual diversity, and proposed a systematic solution: on the vision side, unsupervised learning of rotation-equivariant keypoints; on the control side, policy models pre-trained on large-scale datasets. The result is generalization across objects within a category, together with state-of-the-art reinforcement learning generalization in environments with large visual variation. These algorithms address fundamental scientific problems in robotic manipulation and reinforcement learning, fill a gap in generalizable reinforcement learning internationally, and pave the way for deploying reinforcement learning and robot learning algorithms in the open physical world.
In parallel, the team built the tactile sensor DTact, giving robots access to multimodal input. With touch, robotic manipulation becomes finer, more accurate, and more reliable, while also gaining generalization ability.
------------------------------------------------------------------------------------------------------------------------------
Achievement 2: A Benchmark Platform for Generalization in Visual Reinforcement Learning
A core question in deep learning is how to make trained models applicable across a wide range of task environments. Visual reinforcement learning has achieved remarkable results in recent years, with a growing set of techniques addressing the high dimensionality and redundancy of image inputs. However, studies have also shown that trained agents easily overfit to their training environments, so strengthening agent generalization deserves researchers' attention. To this end, this work proposes a new benchmark framework for testing generalization in visual reinforcement learning, aimed at improving agent generalization while addressing shortcomings of existing benchmarks. The framework has the following features:
(1) Diverse task environments: the benchmark includes many task types, such as navigation, autonomous driving, grasping, and control, to evaluate agent generalization across scenarios more comprehensively.
(2) High-fidelity simulation: the benchmark uses highly realistic physics and rendering engines and environments to better approximate real-world conditions, improving the potential for algorithms to perform well in practice.
(3) Continuous, high-dimensional action spaces: the benchmark supports continuous and high-dimensional action spaces to better evaluate algorithms on complex control tasks.
(4) New evaluation metrics: the benchmark introduces new metrics for a more comprehensive assessment of generalization performance.
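A generalization benchmark of this kind is typically scored by rolling out a trained policy both in its training environment and in visually shifted variants, then measuring how much performance drops. The toy sketch below illustrates that evaluation pattern only; the environment, policy, and the `visual_shift` knob are all hypothetical stand-ins, not RL-ViGen's actual API.

```python
import random

class ToyEnv:
    """Stand-in environment; `visual_shift` perturbs observations at test time."""
    def __init__(self, visual_shift=0.0, seed=0):
        self.visual_shift = visual_shift
        self.rng = random.Random(seed)

    def rollout(self, policy, horizon=10):
        ret = 0.0
        for _ in range(horizon):
            obs = self.rng.random() + self.visual_shift  # shifted observation
            ret += policy(obs)
        return ret

def policy(obs):
    # A policy overfit to the training appearance: reward falls as obs shifts.
    return max(0.0, 1.0 - abs(obs - 0.5))

train_env = ToyEnv(visual_shift=0.0)
test_envs = [ToyEnv(visual_shift=s) for s in (0.5, 1.0, 2.0)]

train_return = train_env.rollout(policy)
test_returns = [env.rollout(policy) for env in test_envs]
# Generalization gap: how much performance degrades under visual shift.
gaps = [train_return - r for r in test_returns]
```

Reporting the gap alongside raw test returns separates "the task is hard" from "the agent overfit to appearance", which is the kind of distinction a dedicated generalization benchmark is designed to expose.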
Research area: benchmark platforms for visual reinforcement learning generalization
Project website: https://gemcollector.github.io/RL-ViGen/
Paper: Zhecheng Yuan*, Sizhe Yang*, Pu Hua, Can Chang, Kaizhe Hu, Xiaolong Wang, Huazhe Xu. 'RL-ViGen: A Reinforcement Learning Benchmark for Visual Generalization'. In NeurIPS, 2023.
------------------------------------------------------------------------------------------------------------------------------
Achievement 1: 9DTact, a Marker-Free, Force-Sensing Visuo-Tactile Sensor
9DTact is a tactile sensing technology with two key features:
1. Gel-based sensing: the sensor's surface is covered by a transparent elastic gel, similar to human skin. Through contact with objects, the gel surface senses fine shape and texture information and records it at high resolution: when the sensor touches an object, the gel deforms slightly, and this deformation is captured by the sensor's internal camera system.
2. Visual feedback: 9DTact combines visual feedback with tactile sensing. A built-in camera records the deformation of the gel surface, capturing the shape changes an object induces and converting them into digital images. Analyzing these images yields information about the object's shape, texture, and surface detail.
Compared with existing solutions, 9DTact senses both texture and force simultaneously without requiring any markers.
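The camera-to-depth step in this family of sensors (DTact's title calls it measuring geometry "directly from darkness") can be illustrated with a minimal sketch: deeper contact makes a pixel darker, so a monotonic calibration curve maps image intensity back to contact depth. The calibration values and the simple lookup below are toy assumptions for illustration; the real sensor's calibration procedure is more involved.

```python
import numpy as np

def calibrate(intensities, depths):
    """Build a monotonic intensity->depth lookup from calibration presses.
    Assumes darker pixels correspond to deeper contact."""
    order = np.argsort(intensities)           # np.interp needs increasing x
    return intensities[order], depths[order]

def intensity_to_depth(image, lut_intensity, lut_depth):
    """Reconstruct a depth map by interpolating the calibration curve."""
    return np.interp(image, lut_intensity, lut_depth)

# Toy calibration: intensity 1.0 (bright, no contact) -> depth 0 mm,
# intensity 0.2 (dark, deep press) -> depth 2 mm.
lut_i, lut_d = calibrate(np.array([1.0, 0.6, 0.2]),
                         np.array([0.0, 1.0, 2.0]))

image = np.array([[1.0, 0.6],
                  [0.2, 0.8]])   # simulated grayscale frame
depth = intensity_to_depth(image, lut_i, lut_d)
```

Because the mapping is applied per pixel, a single calibrated curve turns every camera frame into a dense depth map, which is what makes the marker-free shape reconstruction possible.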
Research areas: embodied AI, robotic tactile sensing
Project website: https://linchangyi1.github.io/9DTact/
Paper: Changyi Lin, Han Zhang, Jikai Xu, Lei Wu, Huazhe Xu. '9DTact: A Compact Vision-Based Tactile Sensor for Accurate 3D Shape Reconstruction and Generalizable 6D Force Estimation'. In RA-L, 2023. View PDF
37. Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features, Chengkai Hou, Zhengrong Xue, Bingyang Zhou, Jinghan Ke, Shao Lin, Huazhe Xu†, https://jackhck.github.io/keygrid.github.io/, NeurIPS 2024.
36. Make-An-Agent: A Generalizable Policy Network Generator with Behavior Prompted Diffusion, Yongyuan Liang, Tingqiang Xu, Kaizhe Hu, Guangqi Jiang, Furong Huang, Huazhe Xu†, https://cheryyunl.github.io/make-an-agent/, NeurIPS 2024.
35. RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation, Chongkai Gao, Zhengrong Xue, Shuying Deng, Tianhai Liang, Siqi Yang, Lin Shao, Huazhe Xu†, https://riemann-web.github.io/, CoRL 2024.
34. Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning, Zhecheng Yuan*, Tianming Wei*, Shuiqi Cheng, Gu Zhang, Yuanpei Chen, Huazhe Xu†, https://gemcollector.github.io/maniwhere/, CoRL 2024.
33. Learning Visual Quadrupedal Loco-Manipulation from Demonstrations, Zhengmao He, Kun Lei, Yanjie Ze, Koushil Sreenath, Zhongyu Li, Huazhe Xu†, https://zhengmaohe.github.io/leg-manip, IROS 2024.
32. Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation, Yuanchen Ju*, Kaizhe Hu*, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu†, ECCV 2024.
31. Diffusion Reward: Learning Rewards via Conditional Video Diffusion, Tao Huang*, Guangqi Jiang*, Yanjie Ze, Huazhe Xu†, ECCV 2024.
30. 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations, Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu, RSS 2024.
29. ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization, Tianying Ji*, Yongyuan Liang*, Yan Zeng, Yu Luo, Guowei Xu, Jiawei Guo, Ruijie Zheng, Furong Huang, Fuchun Sun, Huazhe Xu, https://arxiv.org/pdf/2402.14528, Oral, ICML 2024.
28. Rethinking Transformer in Solving POMDPs, Chenhao Lu*, Ruizhe Shi*, Yuyao Liu*, Kaizhe Hu, Simon Shaolei Du, Huazhe Xu, https://arxiv.org/abs/2405.17358, ICML 2024.
27. Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic, Tianying Ji, Yu Luo, Fuchun Sun, Xianyuan Zhan, Jianwei Zhang, Huazhe Xu, https://arxiv.org/abs/2306.02865, ICML 2024.
26. Dehao Wei, Huazhe Xu, A Wearable Robotic Hand for Hand-over-Hand Imitation Learning, ICRA 2024
25. Zhengrong Xue*, Han Zhang*, Jingwen Cheng, Zhengmao He, Yuanchen Ju, Changyi Lin, Gu Zhang, Huazhe Xu, ArrayBot: Reinforcement Learning for Generalizable Distributed Manipulation through Touch, ICRA 2024
23. Guowei Xu, Ruijie Zheng, Yongyuan Liang, Xiyao Wang, Zhecheng Yuan, Tianying Ji, Yu Luo, Xiaoyu Liu, Jiaxin Yuan, Pu Hua, Shuzhen Li, Yanjie Ze, Hal Daume III, Furong Huang, & Huazhe Xu, DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization. ICLR 2024
22. Xiyao Wang, Ruijie Zheng, Yanchao Sun, Ruonan Jia, Wichayaporn Wongkamjan, Huazhe Xu, Furong Huang, COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL, ICLR 2024
21. Ruizhe Shi*, Yuyao Liu*, Yanjie Ze, Simon S. Du, Huazhe Xu, Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning, ICLR 2024
20. Kun Lei, Zhengmao He, Chenhao Lu, Kaizhe Hu, Yang Gao, Huazhe Xu, Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization, ICLR 2024
19. Changyi Lin, Han Zhang, Jikai Xu, Lei Wu, Huazhe Xu, 9DTact: A Compact Vision-Based Tactile Sensor for Accurate 3D Shape Reconstruction and Generalizable 6D Force Estimation, IEEE Robotics and Automation Letters (RA-L), 2023 View PDF
18. Zhecheng Yuan*, Sizhe Yang*, Pu Hua, Can Chang, Kaizhe Hu, Huazhe Xu, RL-ViGen: A Reinforcement Learning Benchmark for Visual Generalization, Conference on Neural Information Processing Systems (NeurIPS), 2023 View PDF
17. Ruijie Zheng, Xiyao Wang, Yanchao Sun, Shuang Ma, Jieyu Zhao, Huazhe Xu+, Hal Daumé III+, Furong Huang+, TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning, Conference on Neural Information Processing Systems (NeurIPS), 2023 View PDF
16. Jialu Gao*, Kaizhe Hu*, Guowei Xu, Huazhe Xu, Can Pre-Trained Text-to-Image Models Generate Visual Goals for Reinforcement Learning?, Conference on Neural Information Processing Systems (NeurIPS), 2023 View PDF
15. Sizhe Yang*, Yanjie Ze*, Huazhe Xu, MoVie: Visual Model-Based Policy Adaptation for View Generalization, Conference on Neural Information Processing Systems (NeurIPS), 2023 View PDF
14. Yanjie Ze, Yuyao Liu*, Ruizhe Shi*, Jiaxin Qin, Zhecheng Yuan, Jiashun Wang, Huazhe Xu, H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation, Conference on Neural Information Processing Systems (NeurIPS), 2023 View PDF
13. Jinxin Liu*, Li He*, Yachen Kang, Zifeng Zhuang, Donglin Wang, Huazhe Xu, CEIL: Generalized Contextual Imitation Learning, Conference on Neural Information Processing Systems (NeurIPS), 2023 View PDF
12. Yuerong Li, Zhengrong Xue, Huazhe Xu, OTAS: Unsupervised Boundary Detection for Object-Centric Temporal Action Segmentation, IEEE Winter Conference on Applications of Computer Vision (WACV), 2023 View PDF
11. Nicklas Hansen*, Zhecheng Yuan*, Yanjie Ze*, Tongzhou Mu*, Aravind Rajeswaran+, Hao Su+, Huazhe Xu+, Xiaolong Wang+. On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline, International Conference on Machine Learning (ICML), 2023 View PDF
10. Zhengrong Xue, Zhecheng Yuan, Jiashun Wang, Xueqian Wang, Yang Gao, Huazhe Xu. USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable Manipulation, IEEE International Conference on Robotics and Automation (ICRA), 2023 View PDF
9. Changyi Lin, Ziqi Lin, Shaoxiong Wang, Huazhe Xu. DTact: A Vision-Based Tactile Sensor that Measures High-Resolution 3D Geometry Directly from Darkness, IEEE International Conference on Robotics and Automation (ICRA), 2023 View PDF
8. Ray Chen Zheng*, Kaizhe Hu*, Zhecheng Yuan, Boyuan Chen, Huazhe Xu. Extraneousness-Aware Imitation Learning, IEEE International Conference on Robotics and Automation (ICRA), 2023 View PDF
7. Yunfei Li*, Chaoyi Pan*, Huazhe Xu, Xiaolong Wang, Yi Wu. Efficient Bimanual Handover and Rearrangement via Symmetry-Aware Actor-Critic Learning, IEEE International Conference on Robotics and Automation (ICRA), 2023 View PDF
6. Kaizhe Hu*, Ray Zheng*, Yang Gao, Huazhe Xu. Decision Transformer under Random Frame Dropping, International Conference on Learning Representations (ICLR), 2023 View PDF
5. Pu Hua, Yubei Chen+, Huazhe Xu+. Simple Emergent Action Representations from Multi-Task Policy Training, International Conference on Learning Representations (ICLR), 2023 View PDF
4. Linfeng Zhao, Huazhe Xu, Lawson L.S. Wong. Scaling up and Stabilizing Differentiable Planning with Implicit Differentiation, International Conference on Learning Representations (ICLR), 2023 View PDF
3. Ruijie Zheng*, Xiyao Wang*, Huazhe Xu, Furong Huang, Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function, International Conference on Learning Representations (ICLR), 2023 View PDF
2. Zhecheng Yuan, Zhengrong Xue, Bo Yuan, Xueqian Wang, Yi Wu, Yang Gao, Huazhe Xu, Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning, Conference on Neural Information Processing Systems (NeurIPS), 2022 View PDF
1. Can Chang, Ni Mu, Jiajun Wu, Ling Pan, Huazhe Xu, E-MAPP: Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance, Conference on Neural Information Processing Systems (NeurIPS), 2022 View PDF