  1. 9.8 Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning
  2. 9.7 On Double-Descent in Reinforcement Learning with LSTD and Random Features
  3. 9.6 Global Convergence of Policy Gradient Methods in Reinforcement Learning, Games and Control
  4. 9.5 Model-based Robotic Manipulation Skill Transfer via Differentiable Physics Simulation
  5. 9.5 Multi-timestep models for Model-based Reinforcement Learning
  6. 9.4 Surgical Gym: A high-performance GPU-based platform for reinforcement learning with surgical robots
  7. 9.4 Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
  8. 9.3 Beyond Text: A Deep Dive into Large Language Models’ Ability on Understanding Graph Data
  9. 9.3 Hierarchical Reinforcement Learning for Temporal Pattern Prediction
  10. 9.2 Self-Confirming Transformer for Locally Consistent Online Adaptation in Multi-Agent Reinforcement Learning
  11. 9.2 DeepQTest: Testing Autonomous Driving Systems with Reinforcement Learning and Real-world Weather Data
  12. 9.1 FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility
  13. 9.1 Distributional Reinforcement Learning with Online Risk-awareness Adaption
  14. 9.1 Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning
  15. 9.0 Deep Model Predictive Optimization
  16. 9.0 TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting
  17. 9.0 DSAC-T: Distributional Soft Actor-Critic with Three Refinements
  18. 8.9 Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks
  19. 8.8 Offline Imitation Learning with Variational Counterfactual Reasoning
  20. 8.6 Optimal Sequential Decision-Making in Geosteering: A Reinforcement Learning Approach