- 9.7 HIQL: Offline Goal-Conditioned RL with Latent States as Actions
- Authors: Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine
- The paper proposes a novel method for goal-conditioned reinforcement learning from offline data. The method combines an action-free value function with two policies, one that predicts subgoals and one that predicts the actions to reach them, and shows improved robustness and scalability; a minimal sketch of this two-level structure appears after this list.
- 9.5 On-Robot Bayesian Reinforcement Learning for POMDPs
- Authors: Hai Nguyen, Sammie Katt, Yuchen Xiao, Christopher Amato
- The authors propose a Bayesian reinforcement learning framework tailored to physical robotic systems and demonstrate its efficiency on real-world human-robot interaction tasks, where it achieves near-optimal performance with only a small number of real-world episodes.
- 9.3 Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations
- Authors: Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Xiangyu Liu, Tuomas Sandholm, Furong Huang, Stephen McAleer
- The paper presents a game-theoretic approach for handling temporally-coupled perturbations in robust reinforcement learning, demonstrating substantial robustness improvements across a range of continuous control tasks.
- 9.1 Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs
- Authors: Qingyang Zhang, Yiming Yang, Jingqing Ruan, Xuantang Xiong, Dengpeng Xing, Bo Xu
- This paper proposes a novel approach for goal-conditioned hierarchical reinforcement learning that uses latent landmark graphs to better balance exploration and exploitation. Experimental results suggest superior performance over baselines on continuous control tasks with sparse rewards.
- 9.1 Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation
- Authors: Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal
- Reason: This paper presents an efficient scheme for scaling off-policy reinforcement learning under massively parallel simulation, a setting where such methods have traditionally been hard to scale. It is an important step toward faster reinforcement learning training and was accepted at a top-tier conference (ICML 2023); a rough sketch of the parallel collection-and-update loop appears after this list.
- 9.0 Learning from Pixels with Expert Observations
- Authors: Minh-Huy Hoang, Long Dinh, Hai Nguyen
- This paper introduces a method for reinforcement learning from pixel observations that leverages expert observations. Experimental results demonstrate the efficacy of the method on challenging block construction tasks, where it outperforms a hierarchical baseline.
- 8.9 A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning
- Authors: Benjamin Eysenbach, Matthieu Geist, Sergey Levine, Ruslan Salakhutdinov
- Reason: This paper establishes a new connection between one-step RL and critic regularization, offering a fresh perspective on offline reinforcement learning. It has been accepted to ICML 2023 and is expected to be influential.
- 8.7 Contextual Bandits and Imitation Learning via Preference-Based Active Queries
- Authors: Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
- Reason: The paper combines preference-based active queries with an online regression oracle for both contextual bandits and imitation learning, which may shape the understanding and development of both areas.
- 8.4 Analyzing the Strategy of Propaganda using Inverse Reinforcement Learning: Evidence from the 2022 Russian Invasion of Ukraine
- Authors: Dominique Geissler, Stefan Feuerriegel
- Reason: The authors leverage inverse reinforcement learning to analyze propaganda strategies in a real-world scenario. The approach advances our understanding of how reinforcement learning can be applied in the social sciences, even though it was not published at a top-tier conference.
- 8.1 Uncertainty-aware Grounded Action Transformation towards Sim-to-Real Transfer for Traffic Signal Control
- Authors: Longchao Da, Hao Mei, Romir Sharma, Hua Wei
- Reason: The paper addresses a critical challenge in applying reinforcement learning to traffic signal control. The proposed approach could have significant practical implications for handling sim-to-real transfer in reinforcement learning.
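
As a rough illustration of the two-level structure described in the HIQL entry above, the following sketch shows how a high-level policy that proposes subgoals can be composed with a low-level policy that outputs primitive actions. All names (`HierarchicalAgent`, `high_policy`, `low_policy`, `subgoal_horizon`) are hypothetical stand-ins, not the authors' implementation, and the sketch omits how both policies are extracted from the action-free value function.

```python
import numpy as np


class HierarchicalAgent:
    """Illustrative two-level agent: a high-level policy proposes a subgoal
    several steps ahead, and a low-level policy outputs primitive actions
    that move the agent toward that subgoal (hypothetical sketch only)."""

    def __init__(self, high_policy, low_policy, subgoal_horizon=25):
        # high_policy(state, goal)   -> subgoal (state-like array)
        # low_policy(state, subgoal) -> action  (array)
        self.high_policy = high_policy
        self.low_policy = low_policy
        self.subgoal_horizon = subgoal_horizon
        self._subgoal = None
        self._steps_since_replan = 0

    def act(self, state, goal):
        # Re-plan the subgoal every `subgoal_horizon` environment steps.
        if self._subgoal is None or self._steps_since_replan >= self.subgoal_horizon:
            self._subgoal = self.high_policy(state, goal)
            self._steps_since_replan = 0
        self._steps_since_replan += 1
        return self.low_policy(state, self._subgoal)


# Toy usage with stand-in policies (illustration only):
agent = HierarchicalAgent(
    high_policy=lambda s, g: s + 0.5 * (g - s),     # subgoal partway toward the goal
    low_policy=lambda s, z: np.clip(z - s, -1, 1),  # step toward the subgoal
)
action = agent.act(np.zeros(2), np.ones(2))
```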
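
For the Parallel $Q$-Learning entry, the sketch below illustrates the general pattern of interleaving batched data collection from many parallel simulated environments with off-policy updates. The batched `envs` interface, `replay_buffer`, and `update_fn` are assumed placeholders for illustration, not the paper's actual system.

```python
def collect_and_update(envs, states, policy, replay_buffer, update_fn,
                       steps_per_iter=8, updates_per_iter=32, batch_size=4096):
    """One iteration of parallel data collection followed by off-policy updates.

    Assumes a batched, gym-like interface (hypothetical):
      envs.step(actions) -> (next_states, rewards, dones, infos), all with a
      leading dimension of size num_envs; `states` has shape (num_envs, obs_dim).
    """
    # Collect a few steps of experience from all environments at once.
    for _ in range(steps_per_iter):
        actions = policy(states)                       # batched action selection
        next_states, rewards, dones, _ = envs.step(actions)
        replay_buffer.add_batch(states, actions, rewards, next_states, dones)
        states = next_states

    # Then run many off-policy gradient updates on the replayed data.
    for _ in range(updates_per_iter):
        batch = replay_buffer.sample(batch_size)       # sample replayed transitions
        update_fn(batch)                               # e.g. one Q-learning gradient step

    return states
```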