  1. 9.4 Harder Tasks Need More Experts: Dynamic Routing in MoE Models
  2. 8.9 Multi-Agent Reinforcement Learning with a Hierarchy of Reward Machines
  3. 8.9 Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
  4. 8.7 Ant Colony Sampling with GFlowNets for Combinatorial Optimization
  5. 8.7 Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding
  6. 8.5 Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation
  7. 8.5 CardioGenAI: A Machine Learning-Based Framework for Re-Engineering Drugs for Reduced hERG Liability
  8. 8.2 Advantage-Aware Policy Optimization for Offline Reinforcement Learning
  9. 8.2 Efficient Knowledge Deletion from Trained Models through Layer-wise Partial Machine Unlearning
  10. 8.0 Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer