  1. 9.6 Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward
  2. 9.4 Multi-agent Reinforcement Learning: A Comprehensive Survey
  3. 9.3 CACTO-SL: Using Sobolev Learning to improve Continuous Actor-Critic with Trajectory Optimization
  4. 9.2 Constrained Meta-Reinforcement Learning for Adaptable Safety Guarantee with Differentiable Convex Programming
  5. 9.1 Pareto Envelope Augmented with Reinforcement Learning
  6. 8.9 Active Reinforcement Learning for Robust Building Control
  7. 8.9 GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation
  8. 8.7 Deriving Rewards for Reinforcement Learning from Symbolic Behaviour Descriptions of Bipedal Walking
  9. 8.7 Deep-Dispatch: A Deep Reinforcement Learning-Based Vehicle Dispatch Algorithm for Advanced Air Mobility
  10. 8.6 Colored Noise in PPO: Improved Exploration and Performance Through Correlated Action Sampling
  11. 8.5 Learning to Act without Actions
  12. 8.3 Explaining Reinforcement Learning Agents Through Counterfactual Action Outcomes
  13. 8.1 Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis
  14. 7.8 Challenges for Reinforcement Learning in Quantum Computing
  15. 7.5 Monte Carlo Tree Search in the Presence of Transition Uncertainty