- 9.6 Gradient Informed Proximal Policy Optimization
- Authors: Sanghyun Son, Laura Yu Zheng, Ryan Sullivan, Yi-Ling Qiao, Ming C. Lin
- Reason: Integrating analytical gradients into PPO could be influential for training more efficient policies, and the paper's presentation at NeurIPS suggests strong peer recognition.
- 9.3 Improve Robustness of Reinforcement Learning against Observation Perturbations via $l_\infty$ Lipschitz Policy Networks
- Authors: Buqing Nie, Jingtian Ji, Yangqing Fu, Yue Gao
- Reason: Addresses robustness of deep RL to observation perturbations, which is key for real-world deployment, and acceptance at AAAI suggests recognition in the field (a generic bounded-Lipschitz layer is sketched after this list).
- 9.1 World Models via Policy-Guided Trajectory Diffusion
- Authors: Marc Rigter, Jun Yamada, Ingmar Posner
- Reason: Addresses a fundamental challenge in world modeling with a novel, non-autoregressive diffusion approach; its promise of competitive performance and its applicability to on-policy reinforcement learning give it potential for broad impact.
- 9.0 Global Rewards in Multi-Agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems
- Authors: Heiko Hoppe, Tobias Enders, Quentin Cappart, Maximilian Schiffer
- Reason: Proposes a novel multi-agent deep RL (MADRL) algorithm with global rewards that could significantly impact large-scale real-world systems such as AMoD, and its counterfactual approach could be widely influential (a generic counterfactual baseline is sketched after this list).
- 8.7 Markov Decision Processes with Noisy State Observation
- Authors: Amirhossein Afsharrad, Sanjay Lall
- Reason: Develops new algorithms that account for the realistic setting of noisy state observations in MDPs and demonstrates their practical effectiveness, so the paper may strongly influence robust reinforcement learning in noisy environments (a simple noisy-observation wrapper is sketched after this list).
- 8.7 Vision-Language Models as a Source of Rewards
- Authors: Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Hussain Masoom, Kay McKinney, Volodymyr Mnih, Alexander Neitz, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, Luyu Wang, Lei Zhang
- Reason: Using vision-language models (VLMs) as a source of rewards lets RL agents pursue goals specified in natural language, widening what can be learned, and the strong DeepMind author list adds to its potential influence (the general recipe is sketched after this list).
- 8.5 Personalized Decision Supports based on Theory of Mind Modeling and Explainable Reinforcement Learning
- Authors: Huao Li, Yao Fan, Keyang Zheng, Michael Lewis, Katia Sycara
- Reason: Blends personalization, based on theory-of-mind modeling, with explainability in decision support systems and is validated through experiments; acceptance at IEEE SMC 2023 further suggests potential for influence.
- 8.5 Learning Safety Constraints From Demonstration Using One-Class Decision Trees
- Authors: Mattijs Baert, Sam Leroux, Pieter Simoens
- Reason: Focuses on the crucial problem of learning safety constraints from demonstrations and offers interpretable results, which is highly relevant for real-world applications like autonomous driving, albeit with potentially less immediate impact than the papers listed above.
- 8.3 Omega-Regular Decision Processes
- Authors: Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak
- Reason: Introduces a new class of decision processes for non-Markovian systems, with a clear path to optimization and learning for such systems, signaling a significant advance in reinforcement learning formalisms.
- 8.0 Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks
- Authors: Giovanni Luca Marchetti, Christopher Hillar, Danica Kragic, Sophia Sanborn
- Reason: Provides profound theoretical insights into the emergence of Fourier features in neural networks, proposing foundational algebraic learning theory that may influence the understanding of invariance and symmetry in learning systems.
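For the $l_\infty$ Lipschitz policy-network entry above, one standard way to bound a network's $l_\infty$ Lipschitz constant (not necessarily the construction used in the paper) is to rescale each linear layer by its induced $\infty$-norm, i.e. the maximum absolute row sum of the weight matrix, and to use 1-Lipschitz activations such as tanh. A minimal PyTorch sketch; the layer sizes and the bound are arbitrary choices:

```python
import torch
import torch.nn as nn


class LInfLipschitzLinear(nn.Module):
    """Linear layer rescaled so its induced l-infinity operator norm is at most `bound`.

    For y = W x + b, ||W||_inf = max_i sum_j |W_ij| (maximum absolute row sum), which bounds
    ||y1 - y2||_inf / ||x1 - x2||_inf; the bias does not affect the Lipschitz constant.
    """

    def __init__(self, in_features, out_features, bound=1.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.bound = bound

    def forward(self, x):
        w = self.linear.weight
        op_norm = w.abs().sum(dim=1).max().clamp_min(1e-8)   # induced inf-norm of W
        scale = torch.clamp(self.bound / op_norm, max=1.0)   # shrink only, never expand
        return nn.functional.linear(x, w * scale, self.linear.bias)


class LipschitzPolicy(nn.Module):
    """Policy head whose global l-infinity Lipschitz constant is at most the product of the
    per-layer bounds (tanh is 1-Lipschitz), so bounded observation perturbations produce
    bounded changes in the action output."""

    def __init__(self, obs_dim, act_dim, bound=1.0):
        super().__init__()
        self.net = nn.Sequential(
            LInfLipschitzLinear(obs_dim, 64, bound), nn.Tanh(),
            LInfLipschitzLinear(64, 64, bound), nn.Tanh(),
            LInfLipschitzLinear(64, act_dim, bound),
        )

    def forward(self, obs):
        return self.net(obs)
```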
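The counterfactual approach mentioned for the multi-agent AMoD entry can be illustrated with a COMA-style counterfactual baseline, a common way to assign credit to individual agents under a shared global reward. This is a generic sketch, not the paper's algorithm; the tensor shapes and the centralised-critic interface are assumptions:

```python
import torch


def counterfactual_advantage(q_values, joint_action, agent_policy_probs, agent_idx):
    """COMA-style counterfactual advantage for one agent under a shared global reward.

    q_values: (num_actions,) centralised-critic values Q(s, (a_i, a_-i)) for every action a_i
              of agent `agent_idx`, with the other agents' actions held fixed.
    joint_action: (num_agents,) integer actions actually taken.
    agent_policy_probs: (num_actions,) the agent's current policy pi(a_i | s).
    """
    taken = joint_action[agent_idx]
    # Counterfactual baseline: expected return if only this agent's action were resampled
    # from its own policy, isolating its contribution to the global reward.
    baseline = (agent_policy_probs * q_values).sum()
    return q_values[taken] - baseline


# Usage sketch with dummy numbers (3 actions, 2 agents):
q = torch.tensor([1.0, 2.0, 0.5])
pi = torch.tensor([0.2, 0.5, 0.3])
adv = counterfactual_advantage(q, torch.tensor([1, 0]), pi, agent_idx=0)
```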
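The noisy-state-observation setting can be reproduced for experimentation with a simple Gymnasium observation wrapper; additive Gaussian noise is an arbitrary choice here, and the wrapper is only an evaluation aid, not one of the algorithms proposed in the paper:

```python
import numpy as np
import gymnasium as gym


class NoisyObservationWrapper(gym.ObservationWrapper):
    """Corrupts the true state before the agent sees it, so the agent effectively faces an
    MDP with noisy state observations; useful for stress-testing a policy's robustness."""

    def __init__(self, env, noise_std=0.1, seed=None):
        super().__init__(env)
        self.noise_std = noise_std
        self.rng = np.random.default_rng(seed)

    def observation(self, observation):
        noise = self.rng.normal(0.0, self.noise_std, size=np.shape(observation))
        return (np.asarray(observation, dtype=np.float64) + noise).astype(np.float32)


# Usage: train or evaluate any agent on the wrapped environment to measure robustness.
env = NoisyObservationWrapper(gym.make("CartPole-v1"), noise_std=0.05)
obs, info = env.reset(seed=0)
```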
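The VLM-as-reward recipe can be sketched at a high level: embed the current visual observation and the language goal with a CLIP-style model and turn their similarity into a (possibly thresholded) reward. The `image_encoder` / `text_encoder` names in the usage comment are assumed stand-ins, and the exact reward derivation in the paper may differ:

```python
import torch


def vlm_reward(image_emb, goal_emb, threshold=0.3, binary=True):
    """Reward from a vision-language model: cosine similarity between the embedding of the
    current visual observation and the embedding of the language goal, optionally thresholded
    into a sparse success signal."""
    image_emb = torch.nn.functional.normalize(image_emb, dim=-1)
    goal_emb = torch.nn.functional.normalize(goal_emb, dim=-1)
    similarity = (image_emb * goal_emb).sum(dim=-1)   # cosine similarity in [-1, 1]
    if binary:
        return (similarity > threshold).float()       # 1.0 when the goal looks achieved
    return similarity


# Usage sketch; image_encoder / text_encoder stand for any CLIP-style encoders (assumed,
# not a specific library API):
# r_t = vlm_reward(image_encoder(frame_t), text_encoder("stack the red block on the blue block"))
```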