- 9.9 Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards
- Authors: Alain Andres, Daochen Zha, Javier Del Ser
- Reason: This paper proposes tailored self-imitation learning strategies for procedurally generated environments with sparse rewards. The authors conduct a comprehensive experimental analysis and report a new state-of-the-art result on the MiniGrid-MultiRoom-N12-S10 environment. A minimal sketch of the self-imitation idea follows below.
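The core self-imitation idea is easy to sketch: store transitions whose observed return exceeded the critic's value estimate, and replay them with probability proportional to that positive advantage. Below is a minimal sketch of such a buffer, assuming a standard actor-critic setup; the paper's specific prioritization and diversity mechanisms are not reproduced here.

```python
import numpy as np

class SelfImitationBuffer:
    """Keeps transitions whose observed return R exceeded the critic's
    estimate V(s) and samples them proportionally to that advantage."""

    def __init__(self, capacity=10_000, eps=1e-6):
        self.capacity, self.eps = capacity, eps
        self.transitions, self.priorities = [], []

    def add(self, state, action, ret, value_estimate):
        advantage = ret - value_estimate
        if advantage <= 0:                 # only imitate better-than-expected outcomes
            return
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)        # FIFO eviction for simplicity
            self.priorities.pop(0)
        self.transitions.append((state, action, ret))
        self.priorities.append(advantage + self.eps)

    def sample(self, batch_size):
        p = np.asarray(self.priorities)
        idx = np.random.choice(len(self.transitions), size=batch_size, p=p / p.sum())
        return [self.transitions[i] for i in idx]
```

The sampled transitions would then feed an auxiliary imitation loss on the policy; the paper's contributions concern which transitions to keep and how to weight them beyond the raw advantage.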
- 9.6 Learning impartial policies for sequential counterfactual explanations using Deep Reinforcement Learning
- Authors: E. Panagiotou, E. Ntoutsi
- Reason: The authors propose an approach that uses deep reinforcement learning to generate sequential counterfactual explanations, learning policies that are not tied to a specific initial instance. The method could make counterfactual generation more scalable; a toy MDP formulation is sketched below.
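To make the setting concrete, sequential counterfactual generation can be framed as an MDP: the state is the current feature vector, an action edits one feature, and a sparse reward arrives when the classifier's prediction flips. Here is a toy environment along those lines, where the classifier interface, edit granularity, and cost term are all illustrative assumptions rather than the paper's formulation.

```python
import numpy as np

class CounterfactualEnv:
    """Toy MDP for sequential counterfactuals: each action nudges one
    feature up or down; the episode ends when the classifier flips.
    There are 2 * n_features discrete actions."""

    def __init__(self, classifier, x0, step_size=0.1, max_steps=20):
        self.clf = classifier                      # callable: features -> label
        self.x0 = np.asarray(x0, dtype=float)
        self.step_size, self.max_steps = step_size, max_steps

    def reset(self):
        self.x, self.t = self.x0.copy(), 0
        self.y0 = self.clf(self.x0)                # original prediction to flip
        return self.x.copy()

    def step(self, action):
        # Action encodes (feature index, direction): 2*i for +, 2*i+1 for -.
        i, sign = divmod(action, 2)
        self.x[i] += self.step_size * (1 if sign == 0 else -1)
        self.t += 1
        flipped = self.clf(self.x) != self.y0
        # Sparse terminal reward, minus a small per-step edit cost.
        reward = 1.0 if flipped else -0.01
        done = flipped or self.t >= self.max_steps
        return self.x.copy(), reward, done
```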
- 9.5 Expressive Modeling Is Insufficient for Offline RL: A Tractable Inference Perspective
- Authors: Xuejie Liu, Anji Liu, Guy Van den Broeck, Yitao Liang
- Reason: This paper highlights the importance of tractable inference in offline reinforcement learning, proposing Trifle, which bridges the gap between expressive sequence models and strong performance on offline RL tasks. The authors report that Trifle outperforms prior approaches, so the paper could influence future offline RL research; a toy contrast illustrating why tractability matters follows below.
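The tractability argument can be made concrete with a toy contrast: when the outcome model is small enough to enumerate, expected values (and hence action rankings) are computed exactly, whereas sampling from an expressive model injects estimation noise into action selection. This is a toy illustration only, not Trifle's actual probabilistic machinery.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy model: for each of 3 actions, a distribution over 5 discrete returns.
returns = np.array([0., 1., 2., 3., 4.])
probs = rng.dirichlet(np.ones(5), size=3)       # p(return | action)

# Exact (tractable) inference: enumerate outcomes, rank actions by true mean.
exact_values = probs @ returns
best_exact = int(np.argmax(exact_values))

# Sampling-based inference: a few Monte Carlo rollouts give noisy estimates.
mc_values = [rng.choice(returns, size=10, p=p).mean() for p in probs]
best_mc = int(np.argmax(mc_values))

print(best_exact, best_mc)   # the MC pick can disagree with the exact optimum
```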
- 9.3 The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback
- Authors: Nathan Lambert, Roberto Calandra
- Reason: This paper addresses challenges in reinforcement learning from human feedback (RLHF), analyzing the objective mismatch between the learned reward model and the policy optimized against it, and proposing potential solutions. Given the current interest in human feedback for RL, it is likely to carry considerable influence; the two mismatched objectives are sketched below.
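The mismatch sits between two training signals: the reward model is fit to pairwise preferences (a Bradley-Terry style loss), while the policy is then optimized against the reward model's scalar score and can exploit regions where that proxy diverges from true preferences. A sketch of the two standard objectives, with illustrative function shapes:

```python
import numpy as np

def reward_model_loss(r_chosen, r_rejected):
    """Reward-model objective: make chosen responses score higher than
    rejected ones on pairwise human comparisons (Bradley-Terry loss)."""
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected)))).mean()

def rlhf_policy_objective(proxy_rewards, kl_to_reference, beta=0.1):
    """Policy objective: maximize the proxy reward under a KL penalty to
    the reference model; note it never sees the preference data directly,
    which is the mismatch the paper studies."""
    return proxy_rewards.mean() - beta * kl_to_reference.mean()
```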
- 9.3 Latent Space Translation via Semantic Alignment
- Authors: Valentino Maiorca, Luca Moschella, Antonio Norelli, Marco Fumero, Francesco Locatello, Emanuele Rodolà
- Reason: This paper introduces a method for translating between the latent spaces of different pre-trained neural networks via semantic alignment. This capability holds potential across various applications, including reinforcement learning; a classic linear-alignment baseline is sketched below.
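A common building block for this kind of translation is to align two latent spaces on a small set of parallel "anchor" samples encoded by both networks. The sketch below uses the classic orthogonal Procrustes solution as a generic linear-alignment baseline; it should not be read as the paper's exact estimator.

```python
import numpy as np

def procrustes_translation(Z_src, Z_tgt):
    """Fit an orthogonal map R minimizing ||Z_src @ R - Z_tgt||_F, given
    anchor embeddings of the same samples from two encoders.
    Z_src, Z_tgt: (n_anchors, d) arrays."""
    # Center both sets so the fitted map is a pure rotation/reflection.
    mu_s, mu_t = Z_src.mean(0), Z_tgt.mean(0)
    U, _, Vt = np.linalg.svd((Z_src - mu_s).T @ (Z_tgt - mu_t))
    R = U @ Vt
    def translate(z):
        return (z - mu_s) @ R + mu_t
    return translate

# Usage: encode shared anchors with both models, fit, then translate
# arbitrary embeddings from the source space into the target space.
#   f = procrustes_translation(enc_a(anchors), enc_b(anchors))
#   z_b_hat = f(enc_a(x))
```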
- 9.1 Rethinking Decision Transformer via Hierarchical Reinforcement Learning
- Authors: Yi Ma, Chenjun Xiao, Hebin Liang, Jianye Hao
- Reason: The paper revisits the Decision Transformer (DT), discussing its limitations and proposing a new framework that incorporates hierarchical reinforcement learning. Given DT's popularity in the field, these proposals may shape future RL methods; a schematic two-level control loop follows below.
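The hierarchical reading can be pictured as two cooperating modules: a high-level policy that emits a "prompt" (e.g., a subgoal or return target) every k steps, and a low-level policy conditioned on it. This is a schematic interface only, with hypothetical `high_policy` and `low_policy` callables; the paper's concrete instantiation differs.

```python
class HierarchicalAgent:
    """Schematic two-level control loop: the high-level policy sets a
    prompt (subgoal / return target) every k steps; the low-level policy
    maps (state, prompt) to an action."""

    def __init__(self, high_policy, low_policy, k=10):
        self.high, self.low, self.k = high_policy, low_policy, k
        self.prompt, self.t = None, 0

    def act(self, state):
        if self.t % self.k == 0:          # refresh guidance periodically
            self.prompt = self.high(state)
        self.t += 1
        return self.low(state, self.prompt)
```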
- 9.0 Safe multi-agent motion planning under uncertainty for drones using filtered reinforcement learning
- Authors: Sleiman Safaoui, Abraham P. Vinod, Ankush Chakrabarty, Rien Quirynen, Nobuyuki Yoshikawa, Stefano Di Cairano
- Reason: With its focus on drone motion planning under uncertainty, combining reinforcement learning with constrained-control-based trajectory planning, this paper could become a key reference for researchers working on real-world deployments of reinforcement learning; the safety-filter wrapper pattern is sketched below.
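The "filtered" part is the key architectural pattern: a learned policy proposes an action, and a safety filter overrides it whenever the predicted next state would violate a constraint. Here is a minimal sketch with a hypothetical one-step dynamics model and a simple box constraint; the paper's filter additionally handles uncertainty and multiple agents.

```python
import numpy as np

def filtered_action(policy, safe_fallback, dynamics, state, lo, hi):
    """Let the RL policy act freely unless the predicted next state leaves
    the safe box [lo, hi]; otherwise fall back to a safe controller.
    `dynamics`, `safe_fallback`, and the box bounds are assumptions."""
    a_rl = policy(state)
    next_state = dynamics(state, a_rl)            # one-step prediction
    if np.all(next_state >= lo) and np.all(next_state <= hi):
        return a_rl                               # RL action certified safe
    return safe_fallback(state)                   # override with safe action
```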
- 9.0 Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
- Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng
- Reason: The authors notably improve our understanding of the last-iterate convergence of regret-matching algorithms in games, making significant strides at the intersection of game theory and reinforcement learning. The results can directly influence the design of more efficient learning algorithms for games; a self-contained regret-matching sketch follows below.
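For readers new to the topic, regret matching itself is simple: play each action with probability proportional to its positive cumulative regret. Below is a self-contained self-play sketch on rock-paper-scissors; note that the classical guarantee concerns the time-averaged strategies, while the paper studies the harder last-iterate question.

```python
import numpy as np

# Row player's payoff matrix for rock-paper-scissors (zero-sum).
A = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])

def rm_strategy(regrets):
    """Regret matching: play proportionally to positive cumulative regret."""
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full(len(regrets), 1.0 / len(regrets))

def self_play(T=100_000):
    r1, r2 = np.zeros(3), np.zeros(3)    # cumulative regrets per player
    avg1, avg2 = np.zeros(3), np.zeros(3)
    for _ in range(T):
        x, y = rm_strategy(r1), rm_strategy(r2)
        avg1 += x
        avg2 += y
        u1 = A @ y                       # row player's payoff per pure action
        u2 = -A.T @ x                    # column player's payoff (zero-sum)
        r1 += u1 - x @ u1                # regret vs. the mixed strategy played
        r2 += u2 - y @ u2
    return avg1 / T, avg2 / T            # average strategies -> equilibrium

print(self_play())                       # both approach uniform (1/3, 1/3, 1/3)
```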
- 8.9 Recovering Linear Causal Models with Latent Variables via Cholesky Factorization of Covariance Matrix
- Authors: Yunfeng Cai, Xu Li, Mingming Sun, Ping Li
- Reason: Though not purely focused on reinforcement learning, this paper presents a first-of-its-kind method for recovering directed-acyclic-graph structures in the presence of latent variables, which could strengthen causal components of reinforcement learning pipelines; the fully observed Cholesky step is illustrated below.
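The linear-algebra backbone admits a small illustration: for a linear SEM x = Wx + e with W strictly lower triangular (variables in causal order) and independent noise, Cov(x) = (I - W)^{-1} D (I - W)^{-T}, so a Cholesky factorization of the covariance exposes the structure. Here is a toy sketch for the fully observed, known-ordering case; handling latent variables is the paper's contribution and is not attempted here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 200_000

# Ground-truth SEM: x = W x + e, with W strictly lower triangular.
W = np.tril(rng.normal(size=(d, d)), k=-1)
E = rng.normal(size=(n, d))                   # unit-variance independent noise
X = E @ np.linalg.inv(np.eye(d) - W).T        # samples of x = (I - W)^{-1} e

# Cholesky of the sample covariance: Sigma = L L^T, L lower triangular.
L = np.linalg.cholesky(np.cov(X, rowvar=False))
# Normalize columns so the diagonal is 1, then invert to recover I - W.
L_unit = L / np.diag(L)
W_hat = np.eye(d) - np.linalg.inv(L_unit)

print(np.round(W - W_hat, 2))                 # ~0 up to sampling noise
```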
- 8.8 Federated Natural Policy Gradient Methods for Multi-task Reinforcement Learning
- Authors: Tong Yang, Shicong Cen, Yuting Wei, Yuxin Chen, Yuejie Chi
- Reason: This paper develops natural policy gradient methods for multi-task reinforcement learning in a federated setting. It could attract interest for combining reinforcement learning with federated learning, a growing area; one aggregation round is sketched below.
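The federated ingredient is the communication pattern: each task computes a local policy-gradient step on its own environment, and a server aggregates. One round is sketched below with plain gradient averaging for illustration; the paper works with natural policy gradients and analyzes their convergence.

```python
import numpy as np

def federated_round(theta, local_grad_fns, lr=0.01):
    """One round of federated policy optimization: every agent computes a
    policy gradient on its own task, the server averages the estimates and
    updates the shared parameters. `local_grad_fns` are hypothetical
    callables mapping parameters -> gradient estimate for each task."""
    grads = [g(theta) for g in local_grad_fns]    # local computation
    avg = np.mean(grads, axis=0)                  # server-side aggregation
    return theta + lr * avg                       # ascent on average return
```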