- 9.7 RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation
- Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, Abbas Abdolmaleki, Oliver Groth, Jean-Baptiste Regli, Oleg Sushkov, Tom Rothörl, José Enrique Chen, Yusuf Aytar, Dave Barker, Joy Ortiz, Martin Riedmiller, Jost Tobias Springenberg, Raia Hadsell, Francesco Nori, Nicolas Heess
- Reason: The paper has significant potential influence due to its exploration of a new category of robotic learning. It not only involves reinforcement learning but also develops a new agent, RoboCat, capable of generalizing quickly to new tasks and robots.
- 9.6 Bootstrapped Representations in Reinforcement Learning
- Authors: Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney
- Reason: Published at ICML 2023, solid theoretical foundations
- 9.5 CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning
- Authors: Nikunj Gupta, Samira Ebrahimi Kahou
- Reason: The authors propose a new multi-agent reinforcement learning (MARL) algorithm built on the idea of accurately modeling the actions of other agents. Given its potential to elevate MARL capabilities, this paper could have a substantial impact on future research in multi-agent systems.
- 9.3 Active Policy Improvement from Multiple Black-box Oracles
- Authors: Xuefeng Liu, Takuma Yoneda, Chaoqi Wang, Matthew R. Walter, Yuxin Chen
- Reason: A unique approach to learning from multiple suboptimal oracles, with empirical results showing advantages over competing methods
- 9.2 Practical First-Order Bayesian Optimization Algorithms
- Authors: Utkarsh Prakash, Aryan Chollera, Kushagra Khatwani, Prabuchandran K.J., Tejas Bodas
- Reason: First Order Bayesian Optimization (FOBO) has become an effective approach for finding the global maxima of expensive-to-evaluate black-box objective functions. This paper presents practical FOBO algorithms that construct multi-level acquisition functions, with impressive results and successful applications in machine learning and reinforcement learning tasks.
- 9.1 Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork
- Authors: Yonggang Jin, Chenxu Wang, Liuyu Xiang, Yaodong Yang, Jie Fu, Zhaofeng He
- Reason: A novel approach to utilizing past experiences, with promising results against strong baselines in the MiniGrid environment
- 9.0 AdaStop: sequential testing for efficient and reliable comparisons of Deep RL Agents
- Authors: Timothée Mathieu, Riccardo Della Vecchia, Alena Shilova, Matheus Centa de Medeiros, Hector Kohler, Odalric-Ambrym Maillard, Philippe Preux
- Reason: Addresses the reproducibility crisis in Deep Reinforcement Learning (RL) by introducing a new statistical test to distinguish the performance of various algorithms. Could influence how RL algorithms are evaluated and compared in the future.
- 8.9 The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
- Authors: Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe
- Reason: Tackles the challenge of policy learning in high-dimensional settings and proposes a novel model for studying it
- 8.9 Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
- Authors: Dongsheng Ding, Chen-Yu Wei, Kaiqing Zhang, Alejandro Ribeiro
- Reason: A significant contribution to the study of reinforcement learning for constrained Markov Decision Processes. The paper proposes policy-based primal-dual algorithms, which may enable the development of more efficient and effective learning algorithms.
- 8.7 Maximum Entropy Heterogeneous-Agent Mirror Learning
- Authors: Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang
- Reason: Proposes an innovative theoretical framework that leverages the maximum entropy principle for multi-agent RL
- 8.7 Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management
- Authors: Marc Velay, Bich-Liên Doan, Arpad Rimmel, Fabrice Popineau, Fabrice Daniel
- Reason: The authors propose a new training and evaluation process for DRL algorithms in portfolio management. Given the increased popularity of machine learning in finance, this paper could be influential in shaping the approach to DRL in financial applications.
- 8.5 Effect-Invariant Mechanisms for Policy Generalization
- Authors: Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja, Susan Murphy, Jonas Peters
- Reason: Introduces a new relaxation approach for policy learning called effect-invariance (e-invariance) that allows for zero-shot policy generalization. Its applicability to various real-world learning systems makes it important.
- 8.3 Inter-Cell Network Slicing With Transfer Learning Empowered Multi-Agent Deep Reinforcement Learning
- Authors: Tianlun Hu, Qi Liao, Qiang Liu, Georg Carle
- Reason: This work could be influential in applying reinforcement learning to network slicing. It presents the DIRP algorithm and the TL-DIRP algorithm, which have the potential to improve service performance and reduce exploration cost.
- 8.1 Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
- Authors: Hang Wang, Sen Lin, Junshan Zhang
- Reason: Set in the context of Warm-Start reinforcement learning, this paper presents valuable insights into the impact of approximation errors. Its findings could be influential for understanding and enhancing the performance of Warm-Start RL.
- 7.6 Neural Inventory Control in Networks via Hindsight Differentiable Policy Optimization
- Authors: Matias Alvo, Daniel Russo, Yash Kanoria
- Reason: This paper proposes a novel neural network architecture for inventory management optimization using deep reinforcement learning. It could be considered influential due to its potential applications in practical inventory management solutions.