- 8.9 Federated Q-Learning: Linear Regret Speedup with Low Communication Cost
- Authors: Zhong Zheng, Fengyu Gao, Lingzhou Xue, Jing Yang
- Reason: Tackles the significant challenge of achieving a linear regret speedup with minimal communication cost in federated reinforcement learning, a critical issue for distributed RL applications; its approach and analysis could prove highly influential in the RL community. A toy sketch of the underlying communication pattern follows.
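The communication pattern at the heart of federated Q-learning is easy to illustrate: agents update local value estimates on private data and only occasionally synchronize through a server. Here is a minimal NumPy sketch, assuming tabular Q-functions and a placeholder random-transition environment; the sync period, step sizes, and environment are illustrative, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
num_agents, num_states, num_actions = 4, 10, 3
sync_every, alpha, gamma = 50, 0.1, 0.99

# Each agent keeps a local tabular Q-function and learns on its own data.
local_q = [np.zeros((num_states, num_actions)) for _ in range(num_agents)]

def local_step(q):
    """One Q-learning update on a random transition (a stand-in for each
    agent's private environment interaction)."""
    s, a = rng.integers(num_states), rng.integers(num_actions)
    r, s_next = rng.normal(), rng.integers(num_states)
    q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])

for t in range(1, 501):
    for q in local_q:
        local_step(q)
    if t % sync_every == 0:
        # Server averages local Q-tables and broadcasts the result;
        # synchronizing infrequently is what keeps communication low.
        avg = np.mean(local_q, axis=0)
        local_q = [avg.copy() for _ in range(num_agents)]
```

The tension the paper studies is how rarely this synchronization can happen while still obtaining a regret speedup linear in the number of agents.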
- 8.7 A Trust Region Approach for Few-Shot Sim-to-Real Reinforcement Learning
- Authors: Paul Daoudi, Christophe Prieur, Bogdan Robu, Merwan Barlier, Ludovic Dos Santos
- Reason: Introduces a novel trust-region method that bridges the gap between simulation and real-world RL; it is highly relevant to real-world applications and boosts performance across most tested scenarios. A generic trust-region sketch follows.
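To make the trust-region idea concrete, here is a minimal sketch of a KL-constrained policy update: a policy pretrained in simulation is adjusted with scarce real-world data but kept inside a KL ball around the simulation policy. The advantage estimates, step sizes, and backtracking rule are illustrative placeholders, not the paper's method:

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete action distributions."""
    return float(np.sum(p * np.log(p / q)))

def trust_region_update(pi_old, advantages, lr=0.5, delta=0.05):
    """Improve the policy on real-world advantages, backtracking until
    the update stays inside the KL trust region around pi_old."""
    step = 1.0
    while step > 1e-4:
        logits = np.log(pi_old) + step * lr * advantages
        pi_new = np.exp(logits) / np.exp(logits).sum()
        if kl(pi_new, pi_old) <= delta:
            return pi_new
        step *= 0.5  # shrink the step until the constraint holds
    return pi_old

pi_sim = np.array([0.5, 0.3, 0.2])      # policy pretrained in simulation
adv_real = np.array([-1.0, 0.4, 0.6])   # advantages from a few real rollouts
pi_real = trust_region_update(pi_sim, adv_real)
```

Constraining how far the fine-tuned policy may drift is what makes few-shot adaptation stable when real-world samples are scarce.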
- 8.6 Scaling Is All You Need: Training Strong Policies for Autonomous Driving with JAX-Accelerated Reinforcement Learning
- Authors: Moritz Harmel, Anubhav Paras, Andreas Pasternak, Gary Linscott
- Reason: Applies large-scale reinforcement learning to autonomous driving, an area of considerable commercial and research interest; its hardware-accelerated simulator and multi-GPU learning framework could make it a pivotal resource for the industry. A minimal JAX batching sketch follows.
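The speedup behind JAX-accelerated simulators comes from compiling a vectorized environment step into a single device kernel. A minimal sketch of that pattern, with toy placeholder dynamics standing in for a driving simulator:

```python
import jax
import jax.numpy as jnp

def env_step(state, action):
    """Toy placeholder dynamics standing in for a driving simulator."""
    next_state = state + 0.1 * action
    reward = -jnp.abs(next_state).sum()
    return next_state, reward

# vmap vectorizes the step over thousands of environments; jit compiles
# the whole batch into one fused accelerator kernel.
batched_step = jax.jit(jax.vmap(env_step))

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
states = jax.random.normal(k1, (4096, 8))    # 4096 envs, 8-dim state
actions = jax.random.normal(k2, (4096, 8))
next_states, rewards = batched_step(states, actions)
print(rewards.shape)  # (4096,)
```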
- 8.5 Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs
- Authors: Tianyuan Jin, Hao-Lun Hsu, William Chang, Pan Xu
- Reason: Provides finite-time frequentist regret bounds and addresses open problems in the multi-agent multi-armed bandit (MAMAB) setting; to appear at AAAI 2024, a significant venue that increases its potential influence. A single-agent Thompson sampling sketch follows.
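For readers new to the underlying algorithm, here is classic Bernoulli Thompson sampling in the single-agent case. The paper's contribution is the analysis of its multi-agent extension, where agents' rewards couple through a sparse hypergraph, which this sketch does not model:

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.7])  # unknown Bernoulli arm means
a = np.ones(3)  # Beta posterior parameters (successes + 1)
b = np.ones(3)  # Beta posterior parameters (failures + 1)

for t in range(2000):
    theta = rng.beta(a, b)          # sample one mean estimate per arm
    arm = int(np.argmax(theta))     # play the arm that looks best
    reward = rng.random() < true_means[arm]
    a[arm] += reward                # posterior update
    b[arm] += 1 - reward

print(a / (a + b))  # posterior means concentrate near the true means
```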
- 8.3 Gradient Shaping for Multi-Constraint Safe Reinforcement Learning
- Authors: Yihang Yao, Zuxin Liu, Zhepeng Cen, Peide Huang, Tingnan Zhang, Wenhao Yu, Ding Zhao
- Reason: The gradient-shaping technique for multi-constraint safe reinforcement learning could have significant implications for the development of safe RL algorithms, an essential consideration for real-world deployment of RL systems. An illustrative gradient-conflict sketch follows.
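Gradient shaping in the multi-constraint setting manipulates how reward and constraint gradients are combined before an update is applied. As one concrete (hypothetical) instance, here is a PCGrad-style projection that removes components of the reward gradient conflicting with any constraint gradient; the paper's actual shaping rule may differ:

```python
import numpy as np

def shape_gradient(g_reward, constraint_grads):
    """If the reward gradient conflicts with a constraint gradient
    (negative inner product), project out the conflicting component so
    the update does not push that constraint's cost upward."""
    g = g_reward.copy()
    for g_c in constraint_grads:
        dot = g @ g_c
        if dot < 0:
            g -= (dot / (g_c @ g_c)) * g_c
    return g

g_reward = np.array([1.0, 0.0])               # gradient of the return
g_costs = [np.array([-1.0, 1.0]),             # gradients of two
           np.array([0.0, -1.0])]             # safety-cost surrogates
update = shape_gradient(g_reward, g_costs)    # shaped update direction
```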
- 8.2 Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations
- Authors: Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang, Yang Yu
- Reason: Tackles generalization and sample efficiency in offline meta-reinforcement learning (OMRL) under data limitations; accepted at AAAI 2024, its novel techniques are likely to have a ripple effect on future research.
- 8.1 Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning
- Authors: Md Saiful Islam, Srijita Das, Sai Krishna Gottipati, William Duguay, Clodéric Mars, Jalal Arabneydi, Antoine Fagette, Matthew Guzdial, Matthew E. Taylor
- Reason: Explores effective human-AI collaboration in complex environments and develops a new simulator, offering insights into improving RL agent learning and performance and advancing human-in-the-loop learning approaches.
- 8.0 Context-aware Communication for Multi-agent Reinforcement Learning
- Authors: Xinran Li, Jun Zhang
- Reason: Accepted at AAMAS 2024; offers an innovative approach to communication in multi-agent reinforcement learning (MARL) that is likely to influence future work in cooperative and communication-constrained scenarios. A receiver-conditioned attention sketch follows.
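One way to make communication context-aware is to let each receiver weigh incoming messages with attention scores computed from its own local context, so the same broadcast is read differently by each agent. A minimal NumPy sketch of that generic pattern; the projections, dimensions, and aggregation here are assumptions for illustration, not the paper's architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
n_agents, d = 4, 8
messages = rng.normal(size=(n_agents, d))   # one broadcast message per agent
contexts = rng.normal(size=(n_agents, d))   # each receiver's local context
W_q = rng.normal(size=(d, d)) / np.sqrt(d)  # hypothetical learned projections
W_k = rng.normal(size=(d, d)) / np.sqrt(d)

inbox = []
for i in range(n_agents):
    query = contexts[i] @ W_q                      # receiver-specific query
    scores = (messages @ W_k) @ query / np.sqrt(d)
    scores[i] = -np.inf                            # ignore own message
    inbox.append(softmax(scores) @ messages)       # context-weighted mix
inbox = np.stack(inbox)  # each agent reads the same messages differently
```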
- 7.9 Reinforcement Unlearning
- Authors: Dayong Ye, Tianqing Zhu, Congcong Zhu, Derui Wang, Jason Xue, Sheng Shen, Wanlei Zhou
- Reason: Addresses a novel and important aspect of reinforcement learning related to data privacy and unlearning, which is increasingly critical under data protection laws.
- 7.8 Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling
- Authors: Xianjie Zhang, Jiahao Sun, Chen Gong, Kai Wang, Yifei Cao, Hao Chen, Yu Liu
- Reason: Incorporating mutual information as an intrinsic reward within the reinforcement learning framework for on-demand ride pooling could improve vehicle distribution and efficiency, influencing future transportation and logistics systems. A plug-in MI estimator sketch follows.
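A common way to operationalize such a bonus is to estimate mutual information from discrete co-occurrence counts and add it, scaled, to the extrinsic reward. A minimal sketch under that assumption; the count table, coefficient `beta`, and shaping form are illustrative stand-ins, not the paper's estimator:

```python
import numpy as np

def mutual_information(joint_counts):
    """Plug-in MI estimate (in nats) from a discrete joint count table,
    e.g. co-occurrence of vehicle zones and request-origin zones."""
    p_xy = joint_counts / joint_counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

# Hypothetical shaping: total reward = extrinsic dispatch reward + beta * MI.
counts = np.array([[30.0, 5.0], [4.0, 25.0]])  # toy vehicle-vs-demand counts
beta, extrinsic = 0.1, 1.0
total_reward = extrinsic + beta * mutual_information(counts)
```

A higher MI bonus rewards dispatch policies under which the vehicle distribution carries more information about where demand actually arises.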