- 9.4 RL in Markov Games with Independent Function Approximation: Improved Sample Complexity Bound under the Local Access Model
- Authors: Junyi Fan, Yuxuan Han, Jialin Zeng, Jian-Feng Cai, Yang Wang, Yang Xiang, Jiheng Zhang
- Reason: The paper was accepted at AISTATS 2024 (the 27th International Conference on Artificial Intelligence and Statistics), indicating peer recognition in the field. It also improves existing sample complexity bounds in reinforcement learning, a core concern of the field, and offers new insights into learning equilibria in Markov games.
- 9.2 The Value of Reward Lookahead in Reinforcement Learning
- Authors: Nadav Merlis, Dorian Baudry, Vianney Perchet
- Reason: This study quantifies the value of observing future reward information before acting, a significant and practical question in many applications. The paper contributes to theoretical understanding, bridging the gap with offline RL and reward-free exploration; a toy illustration of the lookahead advantage follows below.
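A minimal sketch of why lookahead has value, using an assumed two-armed bandit rather than anything from the paper's analysis: an agent that sees the realized rewards before acting earns the expected maximum, while a standard agent earns at most the best mean.

```python
import numpy as np

# Hypothetical two-armed bandit: lookahead lets the agent pick the
# realized best arm each round instead of the best arm on average.
rng = np.random.default_rng(0)
n_rounds = 100_000
rewards = rng.normal(loc=[0.0, 0.1], scale=1.0, size=(n_rounds, 2))

value_no_lookahead = rewards[:, rewards.mean(axis=0).argmax()].mean()
value_lookahead = rewards.max(axis=1).mean()  # E[max] >= max of means

print(f"no lookahead: {value_no_lookahead:.3f}, "
      f"lookahead: {value_lookahead:.3f}")
```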
- 9.1 Learning to Watermark LLM-generated Text via Reinforcement Learning
- Authors: Xiaojun Xu, Yuanshun Yao, Yang Liu
- Reason: Introduces a novel application of reinforcement learning for watermarking LLM outputs, a potentially influential method for content protection and misuse tracking; a reward-shaping sketch follows below.
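One way to picture the recipe, with every name and the reward shape assumed for illustration (not the paper's pipeline): train the generator with a REINFORCE-style objective that rewards detectability by a paired detector while a quality term keeps the text useful.

```python
import torch

# Hypothetical RL-watermarking loss: higher reward when the paired
# detector flags the sample AND a quality scorer still rates it well.
def rl_watermark_loss(logprobs: torch.Tensor,
                      detector_scores: torch.Tensor,
                      quality_scores: torch.Tensor,
                      lam: float = 0.5) -> torch.Tensor:
    """logprobs: (batch, seq) token log-probs of sampled continuations."""
    rewards = detector_scores + lam * quality_scores   # (batch,)
    advantages = rewards - rewards.mean()              # simple baseline
    return -(logprobs.sum(dim=-1) * advantages).mean()

# Placeholder tensors stand in for real model outputs.
loss = rl_watermark_loss(torch.randn(4, 16), torch.rand(4), torch.rand(4))
```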
- 9.1 Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization
- Authors: Sai Prasanna, Karim Farid, Raghu Rajan, André Biedenkapp
- Reason: This paper proposes a novel approach, the contextual recurrent state-space model (cRSSM), for zero-shot generalization in contextual reinforcement learning. The interdisciplinary team of authors and comprehensive experimentation point to a potentially high impact on domains requiring robust generalization; a schematic of context conditioning follows below.
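A schematic of context-conditioned latent dynamics, with all shapes and the GRU-cell transition assumed for illustration (the cRSSM's actual architecture is described in the paper): the transition sees (latent, action, context), so the learned dynamics can vary with, e.g., gravity or link mass.

```python
import torch
import torch.nn as nn

# Illustrative context-conditioned transition: the context vector c is
# concatenated with the action before the recurrent latent update.
class ContextualDynamics(nn.Module):
    def __init__(self, latent_dim: int = 32, action_dim: int = 4,
                 ctx_dim: int = 8):
        super().__init__()
        self.cell = nn.GRUCell(action_dim + ctx_dim, latent_dim)

    def forward(self, h: torch.Tensor, a: torch.Tensor,
                c: torch.Tensor) -> torch.Tensor:
        return self.cell(torch.cat([a, c], dim=-1), h)

dyn = ContextualDynamics()
h_next = dyn(torch.zeros(1, 32), torch.randn(1, 4), torch.randn(1, 8))
```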
- 9.0 State-Separated SARSA: A Practical Sequential Decision-Making Algorithm with Recovering Rewards
- Authors: Yuto Tanimoto, Kenji Fukumizu
- Reason: The paper introduces a new RL algorithm, SS-SARSA, that effectively handles the recovering-bandits setting, a common real-world problem. Additionally, it claims lower computational complexity and proven convergence to an optimal policy, which could make it influential in practice; a toy version of the setting is sketched below.
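For intuition about the setting, here is a tabular SARSA sketch under assumed recovery dynamics. It deliberately uses the naive joint wait-time state, whose combinatorial blow-up is what SS-SARSA's per-arm state separation is designed to avoid.

```python
import numpy as np

# Recovering rewards (assumed dynamics): each arm's reward grows with
# the time since it was last pulled, capped at max_wait.
rng = np.random.default_rng(0)
n_arms, max_wait, eps, alpha, gamma = 2, 5, 0.1, 0.1, 0.9

def reward(arm, wait):
    return (arm + 1) / n_arms * wait / max_wait + 0.1 * rng.standard_normal()

Q = np.zeros((max_wait + 1,) * n_arms + (n_arms,))  # joint state -> action
state = (0,) * n_arms
action = int(rng.integers(n_arms))
for _ in range(50_000):
    r = reward(action, state[action])
    next_state = tuple(0 if a == action else min(w + 1, max_wait)
                       for a, w in enumerate(state))
    next_action = (int(rng.integers(n_arms)) if rng.random() < eps
                   else int(Q[next_state].argmax()))
    # On-policy SARSA update over the joint wait-time state.
    Q[state][action] += alpha * (r + gamma * Q[next_state][next_action]
                                 - Q[state][action])
    state, action = next_state, next_action
```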
- 8.9 ViSaRL: Visual Reinforcement Learning Guided by Human Saliency
- Authors: Anthony Liang, Jesse Thomason, Erdem Bıyık
- Reason: The paper introduces ViSaRL, a method that incorporates human visual attention into reinforcement learning. Integrating human-like visual saliency could notably advance the field, and the improvements demonstrated across various tasks suggest the work's potential influence; a minimal sketch of the idea follows below.
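A minimal sketch of the underlying idea, assuming saliency is fused as an extra observation channel (how ViSaRL actually incorporates saliency is detailed in the paper):

```python
import numpy as np

# Stack a predicted human-saliency map onto the RGB frame so the policy
# encoder also sees where a person would look.
def augment_with_saliency(rgb: np.ndarray, saliency: np.ndarray) -> np.ndarray:
    """rgb: (H, W, 3) in [0, 1]; saliency: (H, W) in [0, 1] -> (H, W, 4)."""
    return np.concatenate([rgb, saliency[..., None]], axis=-1)

obs = augment_with_saliency(np.random.rand(84, 84, 3), np.random.rand(84, 84))
print(obs.shape)  # (84, 84, 4)
```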
- 8.8 A Scalable and Parallelizable Digital Twin Framework for Sustainable Sim2Real Transition of Multi-Agent Reinforcement Learning Systems
- Authors: Chinmay Vilas Samak, Tanmay Vilas Samak, Venkat Krovi
- Reason: This work outlines a scalable multi-agent deep reinforcement learning framework for Sim2Real transition, addressing a critical challenge in the field. The introduction of the AutoDRIVE Ecosystem as a digital twin framework is potentially influential for its practical implications in robotics and related areas.
- 8.8 Offline Multitask Representation Learning for Reinforcement Learning
- Authors: Haque Ishfaq, Thanh Nguyen-Tang, Songtao Feng, Raman Arora, Mengdi Wang, Ming Yin, Doina Precup
- Reason: The involvement of Doina Precup, a renowned authority in RL, adds weight to this paper. It addresses the emerging area of offline multitask representation learning, which is crucial for developing more general and robust RL models.
- 8.7 PERL: Parameter Efficient Reinforcement Learning from Human Feedback
- Authors: Hakim Sidahmed, Samrat Phatale, Alex Hutcheson, Zhuonan Lin, Zhang Chen, Zac Yu, Jarvis Jin, Roman Komarytsia, Christiane Ahlheim, Yonghao Zhu, Simral Chaudhary, Bowen Li, Saravanan Ganesh, Bill Byrne, Jessica Hoffmann, Hassan Mansoor, Wei Li, Abhinav Rastogi, Lucas Dixon
- Reason: Tackles the computational efficiency challenge in RLHF, which aligns with the current trend toward more cost-effective and resource-efficient machine learning methods; a generic LoRA adapter sketch follows below.
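PERL builds on parameter-efficient tuning such as LoRA; the module below is a generic LoRA adapter sketch (dimensions and hyperparameters assumed), not the paper's exact configuration:

```python
import torch
import torch.nn as nn

# LoRA: freeze the pretrained weight W and learn a low-rank update,
# so the effective weight is W + (alpha / r) * B @ A.
class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False               # keep base frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the low-rank factors train
```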
- 8.7 A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
- Authors: Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart
- Reason: The paper’s focus on improving sample efficiency in CVaR optimization via a new mixture policy parameterization could have significant impact, given the growing interest in risk-sensitive reinforcement learning. The empirical studies further strengthen the paper’s potential influence; generic definitions of the two ingredients are sketched below.
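Generic definitions of the two ingredients, for context only (this is not the paper's algorithm): the CVaR objective over episodic returns, and a policy that mixes two components.

```python
import numpy as np

rng = np.random.default_rng(0)

def cvar(returns: np.ndarray, alpha: float = 0.1) -> float:
    """Mean of the worst alpha-fraction of returns (the tail objective)."""
    cutoff = np.quantile(returns, alpha)
    return float(returns[returns <= cutoff].mean())

def mixture_action(probs_a: np.ndarray, probs_b: np.ndarray,
                   w: float = 0.5) -> int:
    """Sample from component policy A with probability w, else B."""
    p = probs_a if rng.random() < w else probs_b
    return int(rng.choice(len(p), p=p))

returns = rng.normal(1.0, 2.0, size=10_000)
print(cvar(returns, alpha=0.1))  # far below the mean of 1.0
```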
- 8.6 Phasic Diversity Optimization for Population-Based Reinforcement Learning
- Authors: Jingcheng Jiang, Haiyin Piao, Yu Fu, Yihang Hao, Chuanlu Jiang, Ziqi Wei, Xin Yang
- Reason: Offers an innovative optimization algorithm that decouples reward and diversity optimization, addressing a common obstacle in diversity-seeking reinforcement learning. While less tested than others on this list, the conceptual framework suggests strong potential for future influence.
- 8.6 Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data
- Authors: Danyang Wang, Chengchun Shi, Shikai Luo, Will Wei Sun
- Reason: This work tackles the significant issue of leveraging large observational datasets in RL, circumventing the common unconfoundedness and positivity assumptions. Due to its applicability to real-world datasets, as demonstrated with a leading ride-hailing platform, it has the potential for substantial impact.
- 8.5 Riemannian Flow Matching Policy for Robot Motion Learning
- Authors: Max Braun, Noémie Jaquier, Leonel Rozo, Tamim Asfour
- Reason: Proposes a flow matching model on Riemannian manifolds for learning visuomotor policies, a critical area in robotic applications of reinforcement learning; a flat-space flow matching sketch follows below.
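As background, the flat-space (Euclidean) conditional flow matching loss; the paper's contribution is, in part, carrying this kind of training signal onto Riemannian manifolds. The shapes and the toy network here are assumptions.

```python
import torch
import torch.nn as nn

# Conditional flow matching in flat space: regress a velocity field onto
# straight-line paths from noise samples x0 to target actions x1.
def cfm_loss(v_net: nn.Module, actions: torch.Tensor) -> torch.Tensor:
    x0 = torch.randn_like(actions)            # noise endpoint
    t = torch.rand(actions.shape[0], 1)       # random time in [0, 1]
    x_t = (1 - t) * x0 + t * actions          # point along the path
    target = actions - x0                     # constant path velocity
    pred = v_net(torch.cat([x_t, t], dim=-1))
    return ((pred - target) ** 2).mean()

v_net = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 2))
loss = cfm_loss(v_net, torch.randn(32, 2))    # 2-D toy action targets
```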
- 8.3 Latent Object Characteristics Recognition with Visual to Haptic-Audio Cross-modal Transfer Learning
- Authors: Namiko Saito, Joao Moura, Hiroki Uchida, Sethu Vijayakumar
- Reason: Introduces a cross-modal transfer learning approach which could enhance the capability of robots in recognizing object characteristics, contributing to advancements in interactive robotics.
- 7.9 Diffusion-Reinforcement Learning Hierarchical Motion Planning in Adversarial Multi-agent Games
- Authors: Zixuan Wu, Sean Ye, Manisha Natarajan, Matthew C. Gombolay
- Reason: Offers a hierarchical architecture combining diffusion models with RL in a multi-agent adversarial context, which could have significant impact on strategic planning and robotics.