  1. 8.9 ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling
  2. 8.6 An Off-Policy Reinforcement Learning Algorithm Customized for Multi-Task Fusion in Large-Scale Recommender Systems
  3. 8.4 Using Deep Q-Learning to Dynamically Toggle between Push/Pull Actions in Computational Trust Mechanisms
  4. 8.3 Knowledge Transfer for Cross-Domain Reinforcement Learning: A Systematic Review
  5. 8.1 Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies
  6. 8.0 Center-Based Relaxed Learning Against Membership Inference Attacks
  7. 7.9 Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty
  8. 7.8 Learning Manipulation Tasks in Dynamic and Shared 3D Spaces
  9. 7.7 SAFE-RL: Saliency-Aware Counterfactual Explainer for Deep Reinforcement Learning Policies
  10. 7.5 DPO Meets PPO: Reinforced Token Optimization for RLHF