- 9.6 Transferable Reinforcement Learning via Generalized Occupancy Models
- Authors: Chuning Zhu, Xinqi Wang, Tyler Han, Simon S. Du, Abhishek Gupta
- Reason: The concept of Generalized Occupancy Models (GOMs) has the potential to significantly impact the field of transferable RL. The inclusion of prominent authors like Simon S. Du and Abhishek Gupta, combined with a strong theoretical foundation and empirical results across simulated robotics problems, contributes to its high influence score.
- 9.4 Risk-Sensitive RL with Optimized Certainty Equivalents via Reduction to Standard RL
- Authors: Kaiwen Wang, Dawen Liang, Nathan Kallus, Wen Sun
- Reason: The paper addresses the important but less-studied area of Risk-Sensitive Reinforcement Learning (RSRL) and provides theoretical grounding along with novel frameworks applicable to policy optimization (the standard Optimized Certainty Equivalent risk measure is recalled after this list). The presence of authoritative figures in the field, such as Wen Sun, indicates its potential for substantial influence.
- 9.4 Generalising Multi-Agent Cooperation through Task-Agnostic Communication
- Authors: Dulhan Jayalath, Steven Morad, Amanda Prorok
- Reason: Significantly advances the efficiency of communication strategies in multi-agent reinforcement learning (MARL), with strong potential for impact in cooperative multi-robot systems.
- 9.2 Decentralized and Lifelong-Adaptive Multi-Agent Collaborative Learning
- Authors: Shuo Tang, Rui Ye, Chenxin Xu, Xiaowen Dong, Siheng Chen, Yanfeng Wang
- Reason: The proposed algorithm, DeLAMA, is highly relevant for supporting intelligent, decentralized, and dynamically adapting multi-agent systems. The combination of theoretical and experimental validation, along with code availability, further bolsters its potential influence in the multi-agent learning domain.
- 9.2 On the Global Convergence of Policy Gradient in Average Reward Markov Decision Processes
- Authors: Navdeep Kumar, Yashaswini Murthy, Itai Shufaro, Kfir Y. Levy, R. Srikant, Shie Mannor
- Reason: Provides the first finite-time global convergence analysis of policy gradient methods in average-reward Markov decision processes, contributing to the foundational theory of RL (the average-reward objective is recalled after this list).
- 9.0 Provable Policy Gradient Methods for Average-Reward Markov Potential Games
- Authors: Min Cheng, Ruida Zhou, P. R. Kumar, Chao Tian
- Reason: The paper presents a comprehensive mathematical treatment of Markov potential games under the average reward criterion and provides convergence proofs and time complexity analysis for policy gradient methods. The author P. R. Kumar is a recognized authority in the field.
- 8.9 Physics-informed Neural Motion Planning on Constraint Manifolds
- Authors: Ruiqi Ni, Ahmed H. Qureshi
- Reason: Addresses the challenging problem of Constrained Motion Planning with an innovative physics-informed approach, showing substantial practical improvements and accepted at a top-tier conference (ICRA).
- 8.9 RLingua: Improving Reinforcement Learning Sample Efficiency in Robotic Manipulations With Large Language Models
- Authors: Liangliang Chen, Yutian Lei, Shiyu Jin, Ying Zhang, Liangjun Zhang
- Reason: Integrating large language models to improve RL sample efficiency is an innovative angle that may influence future research, particularly in robotic manipulation tasks, and the claims are backed by substantial empirical evidence of performance gains.
- 8.9 In-context Exploration-Exploitation for Reinforcement Learning
- Authors: Zhenwen Dai, Federico Tomasi, Sina Ghiassian
- Reason: Addresses critical computational costs in RL and demonstrates a substantial reduction in the required number of episodes for learning tasks.
- 8.8 Shielded Deep Reinforcement Learning for Complex Spacecraft Tasking
- Authors: Robert Reed, Hanspeter Schaub, Morteza Lahijanian
- Reason: Tackles a highly specialized and critical application of reinforcement learning in spacecraft control with proposed safeguarding methods, featuring authors with strong backgrounds in aerospace engineering.
- 8.8 Scalable Online Exploration via Coverability
- Authors: Philip Amortila, Dylan J. Foster, Akshay Krishnamurthy
- Reason: The proposed exploration objective, $L_1$-Coverage, potentially offers a new, efficient method for online RL exploration. The influence score is slightly lower because of the work's specialized nature relative to the broader applications of some previously listed papers, but it still scores well given the relevance of the topic and the standing of the authors.
- 8.7 Extending Activation Steering to Broad Skills and Multiple Behaviours
- Authors: Teun van der Weij, Massimo Poesio, Nandi Schoots
- Reason: The paper explores the novel use of activation steering to manage and mitigate risks associated with large language models, a timely topic in deep learning and AI safety (a generic sketch of activation steering appears after this list).
- 8.7 Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning
- Authors: Junseok Park, Yoonsung Kim, Hee Bin Yoo, Min Whoo Lee, Kibeom Kim, Won-Seok Choi, Minsu Lee, Byoung-Tak Zhang
- Reason: Introduces a toddler-inspired approach to reward transitions in RL that may offer insights into human learning processes as well as potential gains in sample efficiency and success rates.
- 8.6 Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence
- Authors: Marcel Hussing, Claas Voelcker, Igor Gilitschenski, Amir-massoud Farahmand, Eric Eaton
- Reason: Offers an insightful analysis of value overestimation in deep RL under high update ratios and proposes a solution that shows promise on challenging benchmarks (a sketch of the update-ratio concept appears after this list). The authors' expertise in relevant areas lends credibility to the findings.
- 8.5 Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification
- Authors: Joar Skalse, Alessandro Abate
- Reason: Investigates the robustness of inverse reinforcement learning (IRL) to misspecification, which is crucial for real-world applications, and could influence further research on behavioral model accuracy.
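
For the risk-sensitive RL entry above, the following recalls the textbook definition of the Optimized Certainty Equivalent (OCE) risk measure due to Ben-Tal and Teboulle; it is background context, not the paper's specific reduction. For a random return $X$ and a concave, nondecreasing utility $u$ with $u(0)=0$ and $1 \in \partial u(0)$:

$$\mathrm{OCE}_u(X) \;=\; \sup_{\lambda \in \mathbb{R}} \Big\{\, \lambda + \mathbb{E}\big[\,u(X - \lambda)\,\big] \,\Big\}.$$

Special cases include conditional value-at-risk (piecewise-linear $u$) and entropic risk (exponential $u$); risk-sensitive RL then optimizes $\mathrm{OCE}_u$ of the return rather than its expectation.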
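
The two average-reward entries above (global convergence of policy gradient, and Markov potential games) both optimize the long-run average reward. As standard background, the objective for a policy $\pi$ is

$$\rho^{\pi} \;=\; \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{T-1} r(s_t, a_t)\right],$$

and policy gradient methods ascend $\nabla_\theta \rho^{\pi_\theta}$ directly, without a discount factor.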
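
For the activation-steering entry, the snippet below is a minimal, generic illustration of the basic technique: adding a steering vector to a layer's activations at inference time via a PyTorch forward hook. The layer choice, steering vector, and scale are placeholders, and this is not the paper's specific multi-behaviour method.

```python
import torch

def make_steering_hook(steering_vector: torch.Tensor, alpha: float = 1.0):
    """Build a forward hook that adds alpha * steering_vector to a module's output."""
    def hook(module, inputs, output):
        # Many transformer blocks return a tuple; steer only the hidden states.
        if isinstance(output, tuple):
            return (output[0] + alpha * steering_vector,) + output[1:]
        return output + alpha * steering_vector
    return hook

# Hypothetical usage: `model` is any PyTorch transformer, `layer` one of its
# blocks, and `v_steer` a vector (e.g., the difference of mean activations
# collected on contrasting prompts). None of these names come from the paper.
# handle = layer.register_forward_hook(make_steering_hook(v_steer, alpha=4.0))
# ...generate text with the steering applied...
# handle.remove()
```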
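
Finally, the "update ratio" in the value-overestimation entry refers to the update-to-data (UTD) ratio: how many gradient updates an off-policy agent performs per environment step. The loop below is a hypothetical, library-agnostic sketch of where that knob sits; `env`, `agent`, and `buffer` are placeholders rather than any specific framework.

```python
UPDATE_RATIO = 8  # "high update ratio" regimes take many gradient steps per env step

def train(env, agent, buffer, num_env_steps: int = 100_000):
    """Generic off-policy loop parameterized by the update-to-data ratio."""
    obs = env.reset()
    for _ in range(num_env_steps):
        action = agent.act(obs)
        next_obs, reward, done = env.step(action)
        buffer.add(obs, action, reward, next_obs, done)
        obs = env.reset() if done else next_obs

        # Reusing each transition many times speeds up learning but can amplify
        # value overestimation and divergence, the failure mode the paper studies.
        for _ in range(UPDATE_RATIO):
            agent.update(buffer.sample())
```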