- 9.5 Safe DreamerV3: Safe Reinforcement Learning with World Models
- Authors: Weidong Huang, Jiaming Ji, Borong Zhang, Chunhe Xia, Yaodong Yang
- Reason: This paper addresses an important gap in reinforcement learning: the safe application of the technology in complex scenarios. Integrating both Lagrangian-based and planning-based methods within a world model is a novel contribution that simulation results show to be effective.
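To make the Lagrangian idea concrete, here is a minimal, generic sketch of constrained RL via dual gradient ascent (not the paper's implementation): a multiplier lambda is raised when episode cost exceeds a safety budget, and the policy maximizes reward minus the lambda-weighted cost. All function names here are illustrative.

```python
def lagrangian_update(lmbda, episode_cost, cost_budget, lr=0.01):
    """Dual ascent step: increase lambda when cost exceeds the budget,
    decrease (but never below zero) when the constraint is satisfied."""
    return max(0.0, lmbda + lr * (episode_cost - cost_budget))

def penalized_return(rewards, costs, lmbda):
    """Objective the policy maximizes: reward minus lambda-weighted cost."""
    return sum(rewards) - lmbda * sum(costs)

# Toy loop: lambda rises while episodes violate the budget, then relaxes.
lmbda = 0.0
for episode_cost in [5.0, 4.0, 2.0, 1.0]:
    lmbda = lagrangian_update(lmbda, episode_cost, cost_budget=2.0)
```

Safe DreamerV3 combines this kind of multiplier adjustment with planning inside a learned world model, so constraint violations can be anticipated rather than only penalized after the fact.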
- 9.2 Safe Reinforcement Learning as Wasserstein Variational Inference: Formal Methods for Interpretability
- Authors: Yanran Wang, David Boyle
- Reason: The paper proposes a novel and practical approach, AWaVO, that tackles persistent challenges in applying reinforcement learning. Extensive demonstrations, in both simulation and on real robots, suggest immense potential and a solid theoretical foundation.
- 8.9 Robotic Manipulation Datasets for Offline Compositional Reinforcement Learning
- Authors: Marcel Hussing, Jorge A. Mendez, Anisha Singrodia, Cassandra Kent, Eric Eaton
- Reason: The paper’s significance lies in its potential to pioneer advancements in large-scale datasets for offline reinforcement learning. The training and evaluation settings it provides for assessing an agent’s ability to learn compositional task policies are a valuable resource.
- 8.6 Reinforcement Learning with Frontier-Based Exploration via Autonomous Environment
- Authors: Kenji Leong
- Reason: The paper proposes an innovative solution for improving the efficiency and accuracy of Visual SLAM techniques, which are critical in autonomous robotics. Using reinforcement learning to optimize exploration routes is a valuable contribution with clear practical relevance.
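For readers unfamiliar with frontier-based exploration, here is an illustrative sketch (not the paper's code) of its core primitive: a frontier cell is a free cell in the occupancy grid adjacent to at least one unknown cell. A learned policy can then rank these frontiers as candidate exploration goals. The grid encoding below is an assumption for illustration.

```python
# Illustrative occupancy-grid encoding (an assumption, not the paper's):
FREE, OCCUPIED, UNKNOWN = 0, 1, -1

def find_frontiers(grid):
    """Return (row, col) of every free cell bordering an unknown cell."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            # 4-connected neighborhood check for unknown space.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers
```

The paper's contribution lies in how the exploration route over such frontiers is optimized with reinforcement learning, rather than in frontier detection itself.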
- 8.1 Leveraging Factored Action Spaces for Off-Policy Evaluation
- Authors: Aaman Rebello, Shengpu Tang, Jenna Wiens, Sonali Parbhoo
- Reason: Notably, this paper investigates ways to mitigate the bias and high variance that arise in off-policy evaluation over large, combinatorial action spaces. The proposed decomposed importance sampling (IS) estimator based on factored action spaces is a valuable innovation.
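As a rough sketch of the factored-IS idea (a generic illustration, not the paper's exact estimator): when both the behavior and evaluation policies factorize over sub-action dimensions, the per-step importance weight becomes a product of per-dimension probability ratios rather than one joint ratio. All names below are hypothetical.

```python
def factored_is_weight(pi_probs, mu_probs):
    """Product of per-dimension ratios pi(a_i|s) / mu(a_i|s) for the
    sub-actions actually taken, assuming both policies factorize."""
    w = 1.0
    for p, m in zip(pi_probs, mu_probs):
        w *= p / m
    return w

def is_estimate(trajectories, gamma=0.99):
    """Ordinary (trajectory-wise) IS estimate of the evaluation policy's
    return; each step is a tuple (pi_probs, mu_probs, reward)."""
    est = 0.0
    for traj in trajectories:
        w, ret = 1.0, 0.0
        for t, (pi_p, mu_p, r) in enumerate(traj):
            w *= factored_is_weight(pi_p, mu_p)
            ret += (gamma ** t) * r
        est += w * ret
    return est / len(trajectories)
```

The variance benefit comes from the factorization assumption: each per-dimension ratio is estimated over a much smaller sub-action space than the full combinatorial joint action.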