- 9.3 H-GAP: Humanoid Control with a Generalist Planner
- Authors: Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian
- Reason: Features a strong author team from prominent institutions, addresses a fundamental problem in reinforcement learning with real-world applications, and demonstrates superiority over baselines and offline RL methods.
- 9.2 Training Reinforcement Learning Agents and Humans With Difficulty-Conditioned Generators
- Authors: Sidney Tio, Jimmy Ho, Pradeep Varakantham
- Reason: Introduces a novel method for both training RL agents and human learners, which can have significant impact on personalized education and adaptive training systems.
- 8.9 When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
- Authors: Vincent Liu, Prabhat Nagarajan, Andrew Patterson, Martha White
- Reason: Discusses the fundamental limits of offline policy selection, a crucial step for deployment of reliable RL systems, and could influence future research on RL sample efficiency and deployment.
- 8.7 AdsorbRL: Deep Multi-Objective Reinforcement Learning for Inverse Catalysts Design
- Authors: Romain Lacombe, Lucas Hendren, Khalid El-Awady
- Reason: Addresses a central problem in clean energy technologies and demonstrates the potential of RL in material science, a field that could substantially benefit from advances in AI.
- 8.7 A Q-learning approach to the continuous control problem of robot inverted pendulum balancing
- Authors: Mohammad Safeea, Pedro Neto
- Reason: Interesting application of Q-learning to a continuous control problem, strong experimental validation, and potential impact on real-world robotic systems.
- 8.5 RL-Based Cargo-UAV Trajectory Planning and Cell Association for Minimum Handoffs, Disconnectivity, and Energy Consumption
- Authors: Nesrine Cherif, Wael Jaafar, Halim Yanikomeroglu, Abbas Yongacoglu
- Reason: Proposes an RL approach to a practical and emerging problem in UAV-based logistics, with potential applications and impact on the future of autonomous delivery systems.
- 8.5 Lights out: training RL agents robust to temporary blindness
- Authors: N. Ordonez, M. Tromp, P. M. Julbe, W. Böhmer
- Reason: Introduces an innovative concept with significant implications for reinforcement learning in dynamic environments, contributing to the robustness of RL agents.
- 8.3 MASP: Scalable GNN-based Planning for Multi-Agent Navigation
- Authors: Xinyi Yang, Xinting Yang, Chao Yu, Jiayu Chen, Huazhong Yang, Yu Wang
- Reason: Introduces a GNN-based hierarchical planning framework for multi-agent navigation, relevant for complex cooperation strategies and potentially influential in the development of multi-agent systems.
- 8.2 Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions: Applications to Product-Form Stochastic Networks and Queueing Systems
- Authors: Céline Comte, Matthieu Jonckheere, Jaron Sanders, Albert Senen-Cerda
- Reason: Novel approach to policy-gradient methods that could address convergence issues in RL, applicable to a variety of important domains.
- 8.0 LExCI: A Framework for Reinforcement Learning with Embedded Systems
- Authors: Kevin Badalian, Lucas Koch, Tobias Brinkmann, Mario Picerno, Marius Wegener, Sung-Yong Lee, Jakob Andert
- Reason: Addresses the significant challenge of RL implementation on embedded systems, which is crucial for practical deployment of RL in real-world scenarios.