- 9.8 Explaining Reinforcement Learning with Shapley Values
- Authors: Daniel Beechey, Thomas M. S. Smith, Özgür Şimşek
- Accepted at ICML 2023, this paper makes a significant contribution at the intersection of reinforcement learning and explainability: it uses Shapley values from cooperative game theory to build more understandable and trustworthy reinforcement learning systems, as sketched below.
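A minimal sketch of the underlying idea, not the paper's exact method: attribute a learned value estimate to individual state features via their Shapley values, i.e. their average marginal contribution over feature coalitions. The names `value_fn`, `state`, and `baseline` (used to "remove" features outside a coalition) are illustrative assumptions.

```python
import itertools
import math
import numpy as np

def shapley_values(value_fn, state, baseline):
    """Exact Shapley attribution of value_fn(state) to each state feature.

    Features outside a coalition are replaced by the baseline value,
    one common (assumed) way of approximating feature removal.
    """
    n = len(state)
    phi = np.zeros(n)

    def v(coalition):
        # Build an input where only coalition features come from the state.
        x = baseline.copy()
        x[list(coalition)] = state[list(coalition)]
        return value_fn(x)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for coalition in itertools.combinations(others, size):
                weight = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
                phi[i] += weight * (v(coalition + (i,)) - v(coalition))
    return phi

# Toy usage: a linear "value function" whose attributions are easy to check.
value_fn = lambda x: float(np.dot(x, [1.0, 2.0, -0.5]))
state = np.array([1.0, 1.0, 2.0])
baseline = np.zeros(3)
print(shapley_values(value_fn, state, baseline))  # ≈ [1.0, 2.0, -1.0]
```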
- 9.7 On the Importance of Exploration for Generalization in Reinforcement Learning
- Authors: Yiding Jiang, J. Zico Kolter, Roberta Raileanu
- This paper argues that exploration in deep reinforcement learning is important for generalization, and proposes EDE, a method that directs exploration toward states with high epistemic uncertainty (sketched below).
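A minimal sketch of the general idea, not the paper's exact EDE algorithm: use disagreement across an ensemble of Q-networks as a crude proxy for epistemic uncertainty, and bias action selection toward high-uncertainty actions. The ensemble interface here is an assumption for illustration.

```python
import numpy as np

def uncertainty_aware_action(q_ensemble, state, beta=1.0):
    """Pick an action using ensemble disagreement as an exploration bonus.

    q_ensemble: list of callables, each mapping a state to a vector of
    Q-values (one per action). The standard deviation across ensemble
    members is used as a rough proxy for epistemic uncertainty.
    """
    qs = np.stack([q(state) for q in q_ensemble])   # (n_members, n_actions)
    mean_q = qs.mean(axis=0)
    epistemic_std = qs.std(axis=0)
    # Exploration favours actions whose value the ensemble disagrees on.
    return int(np.argmax(mean_q + beta * epistemic_std))

# Toy usage with hand-written "Q-networks" that ignore the state.
ensemble = [lambda s: np.array([1.0, 0.5]), lambda s: np.array([1.1, 2.0])]
print(uncertainty_aware_action(ensemble, state=None, beta=1.0))
```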
- 9.6 Robust Reinforcement Learning via Adversarial Kernel Approximation
- Authors: Kaixin Wang, Uri Gadot, Navdeep Kumar, Kfir Levy, Shie Mannor
- This paper addresses robust Markov decision processes by approximating the adversarial transition kernel, paving the way for robust reinforcement learning in realistic, high-dimensional domains (a simplified robust backup is sketched below).
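A minimal sketch of a robust Bellman backup in the tabular case, assuming the adversary picks the worst transition kernel from a small finite set of candidates; this is a simplification for illustration, not the paper's adversarial kernel approximation.

```python
import numpy as np

def robust_value_iteration(rewards, kernels, gamma=0.99, iters=500):
    """Tabular robust value iteration against a finite set of kernels.

    rewards: array of shape (S, A).
    kernels: array of shape (K, S, A, S); the adversary picks, per
    state-action, the kernel that minimises the backed-up value.
    """
    V = np.zeros(rewards.shape[0])
    for _ in range(iters):
        # Expected next value under every candidate kernel: (K, S, A)
        next_vals = np.einsum('ksat,t->ksa', kernels, V)
        worst_case = next_vals.min(axis=0)               # adversarial kernel choice
        V = (rewards + gamma * worst_case).max(axis=1)   # greedy over actions
    return V
```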
- 9.4 Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions
- Authors: Ezgi Korkmaz, Jonah Brown-Cohen
- Accepted at ICML 2023, this paper tackles policy instability in MDPs by detecting adversarial directions in observation space and separating safe observations from adversarial ones, promising more robust decisions under adversarial attacks; a toy detector is sketched below.
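A minimal sketch of the kind of test such a detector might run, with hypothetical names and thresholds rather than the paper's exact procedure: probe how strongly the learned Q-values react to a small perturbation along a candidate direction, and flag directions whose impact exceeds a threshold.

```python
import numpy as np

def is_adversarial_direction(q_fn, obs, direction, epsilon=0.05, threshold=0.5):
    """Flag a perturbation direction whose effect on Q-values is suspiciously large.

    q_fn: maps an observation to a vector of Q-values.
    direction: unit vector in observation space to probe.
    epsilon / threshold: illustrative constants, not values from the paper.
    """
    perturbed = obs + epsilon * direction
    shift = np.max(np.abs(q_fn(perturbed) - q_fn(obs)))
    return shift > threshold
```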
- 9.3 On the Importance of Feature Decorrelation for Unsupervised Representation Learning in Reinforcement Learning
- Authors: Hojoon Lee, Koanho Lee, Dongyoon Hwang, Hyunho Lee, Byungkun Lee, Jaegul Choo
- The authors propose a novel unsupervised representation learning (URL) framework that predicts future states while decorrelating features in the latent space, thereby increasing the dimensionality of the latent manifold; the approach is shown to be effective for reinforcement learning on the Atari 100k benchmark (a decorrelation penalty of this kind is sketched below).
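A minimal sketch of a feature-decorrelation regularizer of the kind described, in an assumed form rather than the paper's exact loss: penalise the off-diagonal entries of the covariance of the latent features over a batch.

```python
import torch

def decorrelation_loss(z):
    """Penalise off-diagonal covariance of latent features.

    z: tensor of shape (batch, dim) of latent representations.
    Pushing the covariance toward a diagonal matrix spreads information
    across dimensions, increasing the effective latent dimensionality.
    """
    z = z - z.mean(dim=0, keepdim=True)
    cov = (z.T @ z) / (z.shape[0] - 1)
    off_diag = cov - torch.diag(torch.diag(cov))
    return (off_diag ** 2).sum() / z.shape[1]

# Toy usage: random latents standing in for an (assumed) encoder's output.
z = torch.randn(256, 64)
print(decorrelation_loss(z))
```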
- 9.2 The Role of Diverse Replay for Generalisation in Reinforcement Learning
- Authors: Max Weltevrede, Matthijs T.J. Spaan, Wendelin Böhmer
- This paper investigates the impact of the exploration strategy and the replay buffer in multi-task RL. It shows that collecting and training on more diverse data from the training environments can improve zero-shot generalisation.
- 9.2 TreeDQN: Learning to minimize Branch-and-Bound tree
- Authors: Dmitry Sorokin, Alexander Kostin
- This paper frames the variable-selection task in branch-and-bound as a tree Markov decision process, enabling reinforcement learning for combinatorial optimization problems that otherwise require exhaustive search (the tree-shaped bootstrap target is sketched below).
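A minimal sketch of what makes a tree MDP target different from an ordinary one, in a simplified, assumed form rather than TreeDQN's exact objective: a branching decision creates several child subproblems, so the bootstrap target aggregates the values of all children instead of a single successor state.

```python
def tree_target(reward, child_values, gamma=1.0):
    """Tree-MDP style bootstrap target.

    Unlike a chain MDP, a branching decision spawns several successor
    subproblems; the target sums the (discounted) values of all children.
    `reward` could be, e.g., the cost of expanding the current node.
    """
    return reward + gamma * sum(child_values)

# Toy usage: a node whose two children are predicted to need 3 and 5 more nodes.
print(tree_target(reward=1.0, child_values=[3.0, 5.0]))  # 9.0
```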
- 9.1 Approximate information state based convergence analysis of recurrent Q-learning
- Authors: Erfan Seyedsalehi, Nima Akbarzadeh, Amit Sinha, Aditya Mahajan
- Highlighting the reinforcement learning community's limited theoretical understanding of POMDPs, the paper establishes the convergence of recurrent Q-learning in the tabular setting via an approximate information state, addressing the practical issue of a non-Markovian agent state; a simplified version of such an update is sketched below.
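A minimal sketch of tabular Q-learning on a recurrently updated agent state, in an assumed, simplified form: `update_state` compresses the observation-action history into a hashable state (playing the role of an approximate information state), and ordinary Q-learning runs on it even though it is not Markovian. The environment is assumed to follow the Gymnasium API.

```python
import random
from collections import defaultdict

def recurrent_q_learning(env, update_state, n_episodes=100,
                         alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning on a recurrently updated agent state.

    update_state(z, obs, action) -> new hashable agent state summarising
    the history so far. `env` is assumed to follow the Gymnasium API.
    """
    Q = defaultdict(lambda: defaultdict(float))
    for _ in range(n_episodes):
        obs, _ = env.reset()
        z, done = update_state(None, obs, None), False
        while not done:
            actions = list(range(env.action_space.n))
            if random.random() < epsilon or not Q[z]:
                action = random.choice(actions)        # explore
            else:
                action = max(Q[z], key=Q[z].get)       # greedy on agent state
            obs, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            z_next = update_state(z, obs, action)
            target = reward + (0.0 if done
                               else gamma * max(Q[z_next].values(), default=0.0))
            Q[z][action] += alpha * (target - Q[z][action])
            z = z_next
    return Q
```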
- 9.0 In-Sample Policy Iteration for Offline Reinforcement Learning
- Authors: Xiaohan Hu, Yi Ma, Chenjun Xiao, Yan Zheng, Zhaopeng Meng
- By proposing in-sample policy iteration, the paper offers a substantial enhancement to behavior-regularized methods in offline RL. Its algorithm restricts policy improvement to actions well covered by the dataset, improving on previous state-of-the-art methods (sketched below).
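A minimal sketch of the in-sample idea, as a simplification rather than the paper's algorithm: policy improvement only considers actions that actually appear in the dataset for a given state, so the policy never relies on Q-values of out-of-distribution actions.

```python
from collections import defaultdict

def in_sample_greedy_policy(dataset, Q):
    """Greedy improvement restricted to actions observed in the dataset.

    dataset: iterable of (state, action, reward, next_state) tuples.
    Q: dict mapping (state, action) -> value estimate.
    """
    seen_actions = defaultdict(set)
    for s, a, _, _ in dataset:
        seen_actions[s].add(a)
    # For each state, pick the best action among those covered by the data.
    return {s: max(acts, key=lambda a: Q.get((s, a), 0.0))
            for s, acts in seen_actions.items()}
```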
- 8.9 Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach
- Authors: Donghwan Lee
- The paper presents a finite-time analysis of the minimax Q-learning algorithm and the corresponding value iteration method for two-player zero-sum Markov games, uncovering novel connections between control theory (switching systems) and reinforcement learning; the update it analyses is sketched below.
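A minimal sketch of a tabular minimax Q-learning step for a two-player zero-sum Markov game; for simplicity the inner max-min is taken over pure strategies, whereas the full algorithm typically solves a matrix game (e.g., via a linear program) at each state.

```python
import numpy as np

def minimax_q_update(Q, s, a, b, reward, s_next, alpha=0.1, gamma=0.99):
    """One minimax Q-learning step.

    Q: array of shape (S, A, B) of joint-action values; player 1 maximises,
    player 2 minimises. The bootstrap uses the max-min value of the next state.
    """
    next_value = np.max(np.min(Q[s_next], axis=1))  # max over a', min over b'
    td_target = reward + gamma * next_value
    Q[s, a, b] += alpha * (td_target - Q[s, a, b])
    return Q
```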