1. 9.3 H-GAP: Humanoid Control with a Generalist Planner
2. 9.2 Training Reinforcement Learning Agents and Humans With Difficulty-Conditioned Generators
3. 8.9 When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
4. 8.7 AdsorbRL: Deep Multi-Objective Reinforcement Learning for Inverse Catalysts Design
5. 8.7 A Q-learning approach to the continuous control problem of robot inverted pendulum balancing
6. 8.5 RL-Based Cargo-UAV Trajectory Planning and Cell Association for Minimum Handoffs, Disconnectivity, and Energy Consumption
7. 8.5 Lights out: training RL agents robust to temporary blindness
8. 8.3 MASP: Scalable GNN-based Planning for Multi-Agent Navigation
9. 8.2 Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions: Applications to Product-Form Stochastic Networks and Queueing Systems
10. 8.0 LExCI: A Framework for Reinforcement Learning with Embedded Systems