  1. 9.2 Learning and Calibrating Heterogeneous Bounded Rational Market Behaviour with Multi-Agent Reinforcement Learning
  2. 9.0 Dense Reward for Free in Reinforcement Learning from Human Feedback
  3. 8.9 Introducing PetriRL: An Innovative Framework for JSSP Resolution Integrating Petri nets and Event-based Reinforcement Learning
  4. 8.9 Neural Style Transfer with Twin-Delayed DDPG for Shared Control of Robotic Manipulators
  5. 8.7 Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning
  6. 8.7 Distilling Conditional Diffusion Models for Offline Reinforcement Learning through Trajectory Stitching
  7. 8.6 Leveraging Approximate Model-based Shielding for Probabilistic Safety Guarantees in Continuous Environments
  8. 8.5 Behind the Myth of Exploration in Policy Gradients
  9. 8.3 Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach
  10. 8.1 Adaptive Primal-Dual Method for Safe Reinforcement Learning