  1. 9.1 Kernelized Offline Contextual Dueling Bandits
  2. 8.9 Model-based Offline Reinforcement Learning with Count-based Conservatism
  3. 8.8 Diverse Offline Imitation via Fenchel Duality
  4. 8.5 Towards practical reinforcement learning for tokamak magnetic control
  5. 8.4 Exploring reinforcement learning techniques for discrete and continuous control tasks in the MuJoCo environment