- 9.2 SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning
- Authors: Dohyeok Lee, Seungyub Han, Taehyun Cho, Jungwoo Lee
- Reason: Presents a novel theoretical angle on ensemble methods in Q-learning, which are fundamental to addressing overestimation bias in deep reinforcement learning, a critical area for progress on complex tasks.
- 9.0 GLIDE-RL: Grounded Language Instruction through DEmonstration in RL
- Authors: Chaitanya Kharyal, Sai Krishna Gottipati, Tanmay Kumar Sinha, Srijita Das, Matthew E. Taylor
- Reason: Introduces an innovative teacher-student curriculum framework for training RL agents to ground natural language instructions, an important and rapidly growing subfield of reinforcement learning.
- 9.0 Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence
- Authors: Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer
- Reason: Accepted at a well-regarded conference (AAMAS’24), addresses a novel and challenging aspect of reinforcement learning (decentralization and Byzantine fault tolerance), and the author list includes Roger Wattenhofer, an established authority in distributed computing.
- 8.9 Human as AI Mentor: Enhanced Human-in-the-loop Reinforcement Learning for Safe and Efficient Autonomous Driving
- Authors: Zilin Huang, Zihao Sheng, Chengyuan Ma, Sikai Chen
- Reason: Proposes a novel human-in-the-loop framework for autonomous driving that effectively integrates human expertise, with potentially high impact for safety-critical applications in mixed-traffic environments.
- 8.8 Long-term Safe Reinforcement Learning with Binary Feedback
- Authors: Akifumi Wachi, Wataru Hashimoto, Kazumune Hashimoto
- Reason: Accepted at a top conference (AAAI-24), this paper introduces a significant advance in safe reinforcement learning, which is crucial for real-world applications, and its authors are affiliated with credible institutions.
- 8.7 Decision Making in Non-Stationary Environments with Policy-Augmented Search
- Authors: Ava Pettet, Yunuo Zhang, Baiting Luo, Kyle Wray, Hendrik Baier, Aron Laszka, Abhishek Dubey, Ayan Mukhopadhyay
- Reason: Offers a new approach combining policy learning with online search in non-stationary environments, with strong theoretical backing and implications for domains where instantaneous adaptation is crucial.
- 8.6 LLMs for Robotic Object Disambiguation
- Authors: Connie Jiang, Yiqing Xu, David Hsu
- Reason: Although the paper has not yet been accepted at a top-tier conference, the integration of LLMs into robotic perception and decision-making is an emerging research area in reinforcement learning, and the authors’ institutions suggest strong expertise in robotics and AI.
- 8.5 MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning
- Authors: Rafael Rafailov, Kyle Hatch, Victor Kolev, John D. Martin, Mariano Phielipp, Chelsea Finn
- Reason: Addresses the challenge of the offline-to-online transition for model-based reinforcement learning in high-dimensional spaces, likely to influence robotics and real-world applications where simulation-to-reality transfer is key.
- 8.4 A Tensor Network Implementation of Multi Agent Reinforcement Learning
- Authors: Sunny Howard
- Reason: Presents an innovative approach using tensor networks in MARL that could alleviate computational challenges; however, as an MSc thesis, it may have less immediate impact than peer-reviewed conference papers.
- 8.2 Using reinforcement learning to improve drone-based inference of greenhouse gas fluxes
- Authors: Alouette van Hove, Kristoffer Aalstad, Norbert Pirk
- Reason: The application of RL to environmental monitoring is increasingly important and has practical implications; however, its potential influence may be somewhat limited because the field is highly specialized and the paper does not report acceptance at a leading conference.