- 9.3 H-GAP: Humanoid Control with a Generalist Planner
- Authors: Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian
- Reason: Features a strong author team from prominent institutions, addresses a fundamental problem in reinforcement learning with real-world applications, and demonstrates superiority over baselines and offline RL methods.
- 9.2 Training Reinforcement Learning Agents and Humans With Difficulty-Conditioned Generators
- Authors: Sidney Tio, Jimmy Ho, Pradeep Varakantham
- Reason: Introduces a novel method for both training RL agents and human learners, which can have significant impact on personalized education and adaptive training systems.
- 8.9 When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
- Authors: Vincent Liu, Prabhat Nagarajan, Andrew Patterson, Martha White
- Reason: Discusses the fundamental limits of offline policy selection, a crucial step for deployment of reliable RL systems, and could influence future research on RL sample efficiency and deployment.
- 8.7 AdsorbRL: Deep Multi-Objective Reinforcement Learning for Inverse Catalysts Design
- Authors: Romain Lacombe, Lucas Hendren, Khalid El-Awady
- Reason: Addresses a central problem in clean energy technologies and demonstrates the potential of RL in material science, a field that could substantially benefit from advances in AI.
- 8.7 A Q-learning approach to the continuous control problem of robot inverted pendulum balancing
- Authors: Mohammad Safeea, Pedro Neto
- Reason: Interesting application of Q-learning to a continuous control problem, strong experimental validation, and potential impact on real-world robotic systems.
- 8.5 RL-Based Cargo-UAV Trajectory Planning and Cell Association for Minimum Handoffs, Disconnectivity, and Energy Consumption
- Authors: Nesrine Cherif, Wael Jaafar, Halim Yanikomeroglu, Abbas Yongacoglu
- Reason: Proposes an RL approach to a practical and emerging problem in UAV-based logistics, with potential applications and impact on the future of autonomous delivery systems.
- 8.5 Lights out: training RL agents robust to temporary blindness
- Authors: N. Ordonez, M. Tromp, P. M. Julbe, W. Böhmer
- Reason: Introduces an innovative concept with significant implications for reinforcement learning in dynamic environments, contributing to the robustness of RL agents.
- 8.3 MASP: Scalable GNN-based Planning for Multi-Agent Navigation
- Authors: Xinyi Yang, Xinting Yang, Chao Yu, Jiayu Chen, Huazhong Yang, Yu Wang
- Reason: Introduces a GNN-based hierarchical planning framework for multi-agent navigation, relevant for complex cooperation strategies and potentially influential in the development of multi-agent systems.
- 8.2 Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions: Applications to Product-Form Stochastic Networks and Queueing Systems
- Authors: Céline Comte, Matthieu Jonckheere, Jaron Sanders, Albert Senen-Cerda
- Reason: Novel approach to policy-gradient methods that could address convergence issues in RL, applicable to a variety of important domains.
- 8.0 LExCI: A Framework for Reinforcement Learning with Embedded Systems
- Authors: Kevin Badalian, Lucas Koch, Tobias Brinkmann, Mario Picerno, Marius Wegener, Sung-Yong Lee, Jakob Andert
- Reason: Addresses the significant challenge of RL implementation on embedded systems, which is crucial for practical deployment of RL in real-world scenarios.