- 8.9 Federated Q-Learning: Linear Regret Speedup with Low Communication Cost
- Authors: Zhong Zheng, Fengyu Gao, Lingzhou Xue, Jing Yang
- Reason: Tackles the significant challenge of achieving a linear regret speedup with minimal communication cost in federated reinforcement learning, a critical issue for distributed RL applications; its approach and analysis could prove highly influential in the RL community. A toy sketch of the underlying communication pattern follows.
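The communication pattern at the heart of federated Q-learning is easy to illustrate: agents update local value estimates on private data and only occasionally synchronize through a server. Here is a minimal NumPy sketch, assuming tabular Q-functions and a placeholder random-transition environment; the sync period, step sizes, and environment are illustrative, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
num_agents, num_states, num_actions = 4, 10, 3
sync_every, alpha, gamma = 50, 0.1, 0.99

# Each agent keeps a local tabular Q-function and learns on its own data.
local_q = [np.zeros((num_states, num_actions)) for _ in range(num_agents)]

def local_step(q):
    """One Q-learning update on a random transition (a stand-in for each
    agent's private environment interaction)."""
    s, a = rng.integers(num_states), rng.integers(num_actions)
    r, s_next = rng.normal(), rng.integers(num_states)
    q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])

for t in range(1, 501):
    for q in local_q:
        local_step(q)
    if t % sync_every == 0:
        # Server averages local Q-tables and broadcasts the result;
        # synchronizing infrequently is what keeps communication low.
        avg = np.mean(local_q, axis=0)
        local_q = [avg.copy() for _ in range(num_agents)]
```

The tension the paper studies is how rarely this synchronization can happen while still obtaining a regret speedup linear in the number of agents.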
- 8.7 A Trust Region Approach for Few-Shot Sim-to-Real Reinforcement Learning
- Authors: Paul Daoudi, Christophe Prieur, Bogdan Robu, Merwan Barlier, Ludovic Dos Santos
- Reason: Introduces a novel trust-region method that bridges the gap between simulation and real-world RL; it is highly relevant to real-world applications and boosts performance across most tested scenarios. A generic trust-region sketch follows.
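To make the trust-region idea concrete, here is a minimal sketch of a KL-constrained policy update: a policy pretrained in simulation is adjusted with scarce real-world data but kept inside a KL ball around the simulation policy. The advantage estimates, step sizes, and backtracking rule are illustrative placeholders, not the paper's method:

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete action distributions."""
    return float(np.sum(p * np.log(p / q)))

def trust_region_update(pi_old, advantages, lr=0.5, delta=0.05):
    """Improve the policy on real-world advantages, backtracking until
    the update stays inside the KL trust region around pi_old."""
    step = 1.0
    while step > 1e-4:
        logits = np.log(pi_old) + step * lr * advantages
        pi_new = np.exp(logits) / np.exp(logits).sum()
        if kl(pi_new, pi_old) <= delta:
            return pi_new
        step *= 0.5  # shrink the step until the constraint holds
    return pi_old

pi_sim = np.array([0.5, 0.3, 0.2])      # policy pretrained in simulation
adv_real = np.array([-1.0, 0.4, 0.6])   # advantages from a few real rollouts
pi_real = trust_region_update(pi_sim, adv_real)
```

Constraining how far the fine-tuned policy may drift is what makes few-shot adaptation stable when real-world samples are scarce.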
- 8.6 Scaling Is All You Need: Training Strong Policies for Autonomous Driving with JAX-Accelerated Reinforcement Learning
- Authors: Moritz Harmel, Anubhav Paras, Andreas Pasternak, Gary Linscott
- Reason: Applies large-scale reinforcement learning to autonomous driving, an area of considerable commercial and research interest; its hardware-accelerated simulator and multi-GPU learning framework could make it a pivotal resource for the industry. A minimal JAX batching sketch follows.
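The speedup behind JAX-accelerated simulators comes from compiling a vectorized environment step into a single device kernel. A minimal sketch of that pattern, with toy placeholder dynamics standing in for a driving simulator:

```python
import jax
import jax.numpy as jnp

def env_step(state, action):
    """Toy placeholder dynamics standing in for a driving simulator."""
    next_state = state + 0.1 * action
    reward = -jnp.abs(next_state).sum()
    return next_state, reward

# vmap vectorizes the step over thousands of environments; jit compiles
# the whole batch into one fused accelerator kernel.
batched_step = jax.jit(jax.vmap(env_step))

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
states = jax.random.normal(k1, (4096, 8))    # 4096 envs, 8-dim state
actions = jax.random.normal(k2, (4096, 8))
next_states, rewards = batched_step(states, actions)
print(rewards.shape)  # (4096,)
```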
- 8.5 Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs
- Authors: Tianyuan Jin, Hao-Lun Hsu, William Chang, Pan Xu
- Reason: Provides finite-time frequentist regret bounds and addresses open problems in the multi-agent multi-armed bandit (MAMAB) setting; to appear at AAAI 2024, a significant venue that increases its potential influence. A single-agent Thompson sampling sketch follows.
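For readers new to the underlying algorithm, here is classic Bernoulli Thompson sampling in the single-agent case. The paper's contribution is the analysis of its multi-agent extension, where agents' rewards couple through a sparse hypergraph, which this sketch does not model:

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.7])  # unknown Bernoulli arm means
a = np.ones(3)  # Beta posterior parameters (successes + 1)
b = np.ones(3)  # Beta posterior parameters (failures + 1)

for t in range(2000):
    theta = rng.beta(a, b)          # sample one mean estimate per arm
    arm = int(np.argmax(theta))     # play the arm that looks best
    reward = rng.random() < true_means[arm]
    a[arm] += reward                # posterior update
    b[arm] += 1 - reward

print(a / (a + b))  # posterior means concentrate near the true means
```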
- 8.3 Gradient Shaping for Multi-Constraint Safe Reinforcement Learning
- Authors: Yihang Yao, Zuxin Liu, Zhepeng Cen, Peide Huang, Tingnan Zhang, Wenhao Yu, Ding Zhao
- Reason: The gradient-shaping technique for multi-constraint safe reinforcement learning could have significant implications for the development of safe RL algorithms, an essential consideration for real-world deployment of RL systems. An illustrative gradient-conflict sketch follows.
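Gradient shaping in the multi-constraint setting manipulates how reward and constraint gradients are combined before an update is applied. As one concrete (hypothetical) instance, here is a PCGrad-style projection that removes components of the reward gradient conflicting with any constraint gradient; the paper's actual shaping rule may differ:

```python
import numpy as np

def shape_gradient(g_reward, constraint_grads):
    """If the reward gradient conflicts with a constraint gradient
    (negative inner product), project out the conflicting component so
    the update does not push that constraint's cost upward."""
    g = g_reward.copy()
    for g_c in constraint_grads:
        dot = g @ g_c
        if dot < 0:
            g -= (dot / (g_c @ g_c)) * g_c
    return g

g_reward = np.array([1.0, 0.0])               # gradient of the return
g_costs = [np.array([-1.0, 1.0]),             # gradients of two
           np.array([0.0, -1.0])]             # safety-cost surrogates
update = shape_gradient(g_reward, g_costs)    # shaped update direction
```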
- 8.2 Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations
- Authors: Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang, Yang Yu
- Reason: Tackles generalization and sample efficiency in offline meta-reinforcement learning (OMRL) under data limitations; accepted at AAAI 2024, its novel techniques are likely to have a ripple effect on future research.
- 8.1 Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning
- Authors: Md Saiful Islam, Srijita Das, Sai Krishna Gottipati, William Duguay, Clodéric Mars, Jalal Arabneydi, Antoine Fagette, Matthew Guzdial, Matthew E. Taylor
- Reason: Explores effective human-AI collaboration in complex environments and develops a new simulator, offering insights into improving RL agent learning and performance and advancing human-in-the-loop learning approaches.
- 8.0 Context-aware Communication for Multi-agent Reinforcement Learning
- Authors: Xinran Li, Jun Zhang
- Reason: Accepted at AAMAS 2024; offers an innovative approach to communication in multi-agent reinforcement learning (MARL) that is likely to influence future work in cooperative and communication-constrained scenarios. A receiver-conditioned attention sketch follows.
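One way to make communication context-aware is to let each receiver weigh incoming messages with attention scores computed from its own local context, so the same broadcast is read differently by each agent. A minimal NumPy sketch of that generic pattern; the projections, dimensions, and aggregation here are assumptions for illustration, not the paper's architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
n_agents, d = 4, 8
messages = rng.normal(size=(n_agents, d))   # one broadcast message per agent
contexts = rng.normal(size=(n_agents, d))   # each receiver's local context
W_q = rng.normal(size=(d, d)) / np.sqrt(d)  # hypothetical learned projections
W_k = rng.normal(size=(d, d)) / np.sqrt(d)

inbox = []
for i in range(n_agents):
    query = contexts[i] @ W_q                      # receiver-specific query
    scores = (messages @ W_k) @ query / np.sqrt(d)
    scores[i] = -np.inf                            # ignore own message
    inbox.append(softmax(scores) @ messages)       # context-weighted mix
inbox = np.stack(inbox)  # each agent reads the same messages differently
```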
- 7.9 Reinforcement Unlearning
- Authors: Dayong Ye, Tianqing Zhu, Congcong Zhu, Derui Wang, Jason Xue, Sheng Shen, Wanlei Zhou
- Reason: Addresses a novel and important aspect of reinforcement learning related to data privacy and unlearning, which is increasingly critical under data protection laws.
- 7.8 Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling
- Authors: Xianjie Zhang, Jiahao Sun, Chen Gong, Kai Wang, Yifei Cao, Hao Chen, Yu Liu
- Reason: Incorporating mutual information as an intrinsic reward within the reinforcement learning framework for on-demand ride pooling could improve vehicle distribution and efficiency, influencing future transportation and logistics systems. A plug-in MI estimator sketch follows.
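A common way to operationalize such a bonus is to estimate mutual information from discrete co-occurrence counts and add it, scaled, to the extrinsic reward. A minimal sketch under that assumption; the count table, coefficient `beta`, and shaping form are illustrative stand-ins, not the paper's estimator:

```python
import numpy as np

def mutual_information(joint_counts):
    """Plug-in MI estimate (in nats) from a discrete joint count table,
    e.g. co-occurrence of vehicle zones and request-origin zones."""
    p_xy = joint_counts / joint_counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

# Hypothetical shaping: total reward = extrinsic dispatch reward + beta * MI.
counts = np.array([[30.0, 5.0], [4.0, 25.0]])  # toy vehicle-vs-demand counts
beta, extrinsic = 0.1, 1.0
total_reward = extrinsic + beta * mutual_information(counts)
```

A higher MI bonus rewards dispatch policies under which the vehicle distribution carries more information about where demand actually arises.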