  1. 9.8 Policy-Based Self-Competition for Planning Problems
  2. 9.6 End-to-End Learning for Stochastic Optimization: A Bayesian Perspective
  3. 9.5 Balancing of competitive two-player Game Levels with Reinforcement Learning
  4. 9.3 Rethinking Weak Supervision in Helping Contrastive Learning
  5. 9.3 Timing Process Interventions with Causal Inference and Reinforcement Learning
  6. 9.2 Reinforcement Learning-Based Control of CrazyFlie 2.X Quadrotor
  7. 9.1 Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL
  8. 9.1 Dual policy as self-model for planning
  9. 9.0 Agent Performing Autonomous Stock Trading under Good and Bad Situations
  10. 9.0 Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models
  11. 9.0 Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
  12. 8.8 Value Functions are Control Barrier Functions: Verification of Safe Policies using Control Theory
  13. 8.8 Causally Learning an Optimal Rework Policy
  14. 8.7 Finding Counterfactually Optimal Action Sequences in Continuous State Spaces
  15. 8.6 Generalization Across Observation Shifts in Reinforcement Learning
  16. 8.5 NTKCPL: Active Learning on Top of Self-Supervised Model by Estimating True Coverage
  17. 8.3 Divide and Repair: Using Options to Improve Performance of Imitation Learning Against Adversarial Demonstrations
  18. 8.0 Convergence of SARSA with linear function approximation: The random horizon case
  19. 7.9 Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
  20. 7.6 Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design