  1. 9.4 Scale-free Adversarial Reinforcement Learning
  2. 9.2 Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning
  3. 9.0 On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games
  4. 9.0 Iterated $Q$-Network: Beyond the One-Step Bellman Operator
  5. 8.9 Mixed-Strategy Nash Equilibrium for Crowd Navigation
  6. 8.7 Towards Fair and Firm Real-Time Scheduling in DNN Multi-Tenant Multi-Accelerator Systems via Reinforcement Learning
  7. 8.7 A Case for Validation Buffer in Pessimistic Actor-Critic
  8. 8.7 Quantized Hierarchical Federated Learning: A Robust Approach to Statistical Heterogeneity
  9. 8.7 Offline Goal-Conditioned Reinforcement Learning for Safety-Critical Tasks with Recovery Policy
  10. 8.5 Offline Fictitious Self-Play for Competitive Games
  11. 8.5 PRIME: Scaffolding Manipulation Tasks with Behavior Primitives for Data-Efficient Imitation Learning
  12. 8.5 Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
  13. 8.3 Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization
  14. 8.3 SERVAL: Synergy Learning between Vertical Models and LLMs towards Oracle-Level Zero-shot Medical Prediction
  15. 8.2 Tsallis Entropy Regularization for Linearly Solvable MDP and Linear Quadratic Regulator
  16. 8.0 LiMAML: Personalization of Deep Recommender Models via Meta Learning
  17. 7.9 Enhancing Long-Term Recommendation with Bi-level Learnable Large Language Model Planning
  18. 7.9 In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
  19. 7.9 Koopman-Assisted Reinforcement Learning
  20. 7.4 Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks