- 9.7 HIQL: Offline Goal-Conditioned RL with Latent States as Actions
- Authors: Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine
- The paper proposes a novel method for goal-conditioned reinforcement learning from offline data. The method combines an action-free value function with two policies, one that predicts subgoals and one that predicts the actions to reach them, and shows improved robustness and scalability; a minimal sketch of this two-level structure appears after this list.
- 9.5 On-Robot Bayesian Reinforcement Learning for POMDPs
- Authors: Hai Nguyen, Sammie Katt, Yuchen Xiao, Christopher Amato
- The authors propose a Bayesian reinforcement learning framework tailored to physical robotic systems and demonstrate its efficiency on real-world human-robot interaction tasks, where it achieves near-optimal performance with only a small number of real-world episodes.
- 9.3 Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations
- Authors: Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Xiangyu Liu, Tuomas Sandholm, Furong Huang, Stephen McAleer
- The paper presents a game-theoretic approach for handling temporally-coupled perturbations in robust reinforcement learning, demonstrating substantial robustness improvements across a range of continuous control tasks.
- 9.1 Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs
- Authors: Qingyang Zhang, Yiming Yang, Jingqing Ruan, Xuantang Xiong, Dengpeng Xing, Bo Xu
- This paper proposes a novel approach for goal-conditioned hierarchical reinforcement learning that uses latent landmark graphs to better balance exploration and exploitation. Experimental results suggest superior performance over baselines on continuous control tasks with sparse rewards.
- 9.1 Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation
- Authors: Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal
- Reason: This paper presents an efficient scheme for scaling off-policy reinforcement learning under massively parallel simulation, a setting where such methods have traditionally been hard to scale. It is an important step toward faster reinforcement learning training and was accepted at a top-tier conference (ICML 2023); a rough sketch of the parallel collection-and-update loop appears after this list.
- 9.0 Learning from Pixels with Expert Observations
- Authors: Minh-Huy Hoang, Long Dinh, Hai Nguyen
- This paper introduces a method for reinforcement learning from pixel observations that leverages expert observations. Experimental results demonstrate the efficacy of the method on challenging block construction tasks, where it outperforms a hierarchical baseline.
- 8.9 A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning
- Authors: Benjamin Eysenbach, Matthieu Geist, Sergey Levine, Ruslan Salakhutdinov
- Reason: This paper establishes a new connection between one-step RL and critic regularization, offering a fresh perspective on offline reinforcement learning. It has been accepted to ICML 2023 and is expected to be influential.
- 8.7 Contextual Bandits and Imitation Learning via Preference-Based Active Queries
- Authors: Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
- Reason: The paper combines preference-based active queries with an online regression oracle for both contextual bandits and imitation learning, which may shape the understanding and development of both areas.
- 8.4 Analyzing the Strategy of Propaganda using Inverse Reinforcement Learning: Evidence from the 2022 Russian Invasion of Ukraine
- Authors: Dominique Geissler, Stefan Feuerriegel
- Reason: The authors leverage inverse reinforcement learning to analyze propaganda strategies in a real-world scenario. The approach advances our understanding of how reinforcement learning can be applied in the social sciences, even though it was not published at a top-tier conference.
- 8.1 Uncertainty-aware Grounded Action Transformation towards Sim-to-Real Transfer for Traffic Signal Control
- Authors: Longchao Da, Hao Mei, Romir Sharma, Hua Wei
- Reason: The paper addresses a critical challenge in applying reinforcement learning to traffic signal control. The proposed approach could have significant practical implications for handling sim-to-real transfer in reinforcement learning.
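
As a rough illustration of the two-level structure described in the HIQL entry above, the following sketch shows how a high-level policy that proposes subgoals can be composed with a low-level policy that outputs primitive actions. All names (`HierarchicalAgent`, `high_policy`, `low_policy`, `subgoal_horizon`) are hypothetical stand-ins, not the authors' implementation, and the sketch omits how both policies are extracted from the action-free value function.

```python
import numpy as np


class HierarchicalAgent:
    """Illustrative two-level agent: a high-level policy proposes a subgoal
    several steps ahead, and a low-level policy outputs primitive actions
    that move the agent toward that subgoal (hypothetical sketch only)."""

    def __init__(self, high_policy, low_policy, subgoal_horizon=25):
        # high_policy(state, goal)   -> subgoal (state-like array)
        # low_policy(state, subgoal) -> action  (array)
        self.high_policy = high_policy
        self.low_policy = low_policy
        self.subgoal_horizon = subgoal_horizon
        self._subgoal = None
        self._steps_since_replan = 0

    def act(self, state, goal):
        # Re-plan the subgoal every `subgoal_horizon` environment steps.
        if self._subgoal is None or self._steps_since_replan >= self.subgoal_horizon:
            self._subgoal = self.high_policy(state, goal)
            self._steps_since_replan = 0
        self._steps_since_replan += 1
        return self.low_policy(state, self._subgoal)


# Toy usage with stand-in policies (illustration only):
agent = HierarchicalAgent(
    high_policy=lambda s, g: s + 0.5 * (g - s),     # subgoal partway toward the goal
    low_policy=lambda s, z: np.clip(z - s, -1, 1),  # step toward the subgoal
)
action = agent.act(np.zeros(2), np.ones(2))
```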
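
For the Parallel $Q$-Learning entry, the sketch below illustrates the general pattern of interleaving batched data collection from many parallel simulated environments with off-policy updates. The batched `envs` interface, `replay_buffer`, and `update_fn` are assumed placeholders for illustration, not the paper's actual system.

```python
def collect_and_update(envs, states, policy, replay_buffer, update_fn,
                       steps_per_iter=8, updates_per_iter=32, batch_size=4096):
    """One iteration of parallel data collection followed by off-policy updates.

    Assumes a batched, gym-like interface (hypothetical):
      envs.step(actions) -> (next_states, rewards, dones, infos), all with a
      leading dimension of size num_envs; `states` has shape (num_envs, obs_dim).
    """
    # Collect a few steps of experience from all environments at once.
    for _ in range(steps_per_iter):
        actions = policy(states)                       # batched action selection
        next_states, rewards, dones, _ = envs.step(actions)
        replay_buffer.add_batch(states, actions, rewards, next_states, dones)
        states = next_states

    # Then run many off-policy gradient updates on the replayed data.
    for _ in range(updates_per_iter):
        batch = replay_buffer.sample(batch_size)       # sample replayed transitions
        update_fn(batch)                               # e.g. one Q-learning gradient step

    return states
```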