1. 9.9 A Q-learning Approach for Adherence-Aware Recommendations
2. 9.5 Offline Prompt Evaluation and Optimization with Inverse Reinforcement Learning
3. 9.3 Reasoning with Latent Diffusion in Offline Reinforcement Learning
4. 9.1 Attention Loss Adjusted Prioritized Experience Replay
5. 9.0 Safe Reinforcement Learning with Dual Robustness