1. 8.9 Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
2. 8.7 Overcoming Negative Transfer in Continual Reinforcement Learning
3. 8.4 Simulating Battery-Powered TinyML Systems Optimised using Reinforcement Learning in Image-Based Anomaly Detection
4. 8.2 Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
5. 8.0 Switching the Loss Reduces the Cost in Batch Reinforcement Learning