A Series of Advances in the Data Efficiency and Algorithmic Performance of Reinforcement Learning

2024-05-06




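Huazhe Xu's team has pursued a series of studies on the data efficiency and algorithmic performance of reinforcement learning algorithms, and four of its papers were accepted at this ICLR conference. Among them, DrM significantly improves the data efficiency of visual reinforcement learning; COPlanner significantly improves the data efficiency of model-based reinforcement learning; LaMo uses pre-trained models to strengthen offline reinforcement learning; and Uni-O4 connects offline reinforcement learning with online reinforcement learning. These results are of considerable significance for research in automatic control and robotics.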


Paper title: Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization

Authors: Kun Lei, Zhengmao He, Chenhao Lu, Kaizhe Hu, Yang Gao, Huazhe Xu

Paper link: https://arxiv.org/abs/2311.03351

Paper title: DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization

Authors: Guowei Xu, Ruijie Zheng, Yongyuan Liang, Xiyao Wang, Zhecheng Yuan, Tianying Ji, Yu Luo, Xiaoyu Liu, Jiaxin Yuan, Pu Hua, Shuzhen Li, Yanjie Ze, Hal Daume III, Furong Huang, Huazhe Xu

Project link: https://xugw-kevin.github.io/drm

Paper title: COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL

Authors: Xiyao Wang, Ruijie Zheng, Yanchao Sun, Ruonan Jia, Wichayaporn Wongkamjan, Huazhe Xu, Furong Huang

Project link: https://github.com/umd-huang-lab/COPlanner

Paper title: Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Authors: Ruizhe Shi*, Yuyao Liu*, Yanjie Ze, Simon S. Du, Huazhe Xu

Project link: https://lamo2023.github.io