  1. 9.9 Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards
  2. 9.6 Learning impartial policies for sequential counterfactual explanations using Deep Reinforcement Learning
  3. 9.5 Expressive Modeling Is Insufficient for Offline RL: A Tractable Inference Perspective
  4. 9.3 The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback
  5. 9.3 Latent Space Translation via Semantic Alignment
  6. 9.1 Rethinking Decision Transformer via Hierarchical Reinforcement Learning
  7. 9.0 Safe multi-agent motion planning under uncertainty for drones using filtered reinforcement learning
  8. 9.0 Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
  9. 8.9 Recovering Linear Causal Models with Latent Variables via Cholesky Factorization of Covariance Matrix
  10. 8.8 Federated Natural Policy Gradient Methods for Multi-task Reinforcement Learning