1. 9.1 The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization
  2. 9.1 Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation
  3. 8.9 Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data
  4. 8.9 Uncertainty-aware Distributional Offline Reinforcement Learning
  5. 8.7 An Analysis of Switchback Designs in Reinforcement Learning
  6. 8.7 VDSC: Enhancing Exploration Timing with Value Discrepancy and State Counts
  7. 8.5 Imitating Cost-Constrained Behaviors in Reinforcement Learning
  8. 8.5 CMP: Cooperative Motion Prediction with Multi-Agent Communication
  9. 8.3 Generalization Error Analysis for Sparse Mixture-of-Experts: A Preliminary Study
  10. 8.3 Retentive Decision Transformer with Adaptive Masking for Reinforcement Learning based Recommendation Systems