- 9.2 Scaling Instructable Agents Across Many Simulated Worlds
- Authors: SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, Zhitao Gong, Lucy Gonzales, Karol Gregor, Arne Olav Hallingstad, Tim Harley, Sam Haves, Felix Hill, Ed Hirst, Drew A. Hudson, Steph Hughes-Fitt, Danilo J. Rezende, Mimi Jasarevic, Laura Kampis, Rosemary Ke, Thomas Keck, Junkyung Kim, Oscar Knagg, Kavya Kopparapu, Andrew Lampinen, Shane Legg, Alexander Lerchner, Marjorie Limont, Yulan Liu, Maria Loks-Thompson, Joseph Marino, Kathryn Martin Cussons, Loic Matthey, et al.
- Reason: The paper outlines the ambitious SIMA project for building a generalist agent that follows natural-language instructions across a broad range of complex 3D environments. The strong team of researchers from Google DeepMind and the extensive scope suggest high potential influence in the field of reinforcement learning.
- 9.0 Model-based Offline Quantum Reinforcement Learning
- Authors: Simon Eisenmann, Daniel Hein, Steffen Udluft, Thomas A. Runkler
- Reason: This paper presents a novel approach to quantum reinforcement learning, an emerging field that may offer significant computational advantages. The authors have expertise in quantum machine learning, and this work sits at the intersection of cutting-edge reinforcement learning and quantum computing.
- 9.0 Continuous Control Reinforcement Learning: Distributed Distributional DrQ Algorithms
- Authors: Zehao Zhou
- Reason: Introduces an off-policy RL algorithm that combines distributed training with distributional value estimation for high-dimensional continuous control, a critical area of RL research.
- 8.8 Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay
- Authors: Jinmei Liu, Wenbin Li, Xiangyu Yue, Shilin Zhang, Chunlin Chen, Zhi Wang
- Reason: Proposes a novel approach to continual learning in RL that mitigates catastrophic forgetting and facilitates transfer learning across offline tasks (a generic generative-replay loop is sketched after this list).
- 8.7 Offline Trajectory Generalization for Offline Reinforcement Learning
- Authors: Ziqi Zhao, Zhaochun Ren, Liu Yang, Fajie Yuan, Pengjie Ren, Zhumin Chen, Jun Ma, Xin Xin
- Reason: The proposed OTTO method addresses key challenges in offline RL, notably generalization beyond the logged data, through trajectory-level augmentation. Its strong experimental results could shape the development of more efficient offline RL algorithms.
- 8.6 Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning
- Authors: Hao-Lun Hsu, Weixin Wang, Miroslav Pajic, Pan Xu
- Reason: Provides a first-of-its-kind study of provably efficient randomized exploration in cooperative MARL, backed by theoretical guarantees; a significant contribution to multi-agent systems.
- 8.5 Warm-Start Variational Quantum Policy Iteration
- Authors: Nico Meyer, Jakob Murauer, Alexander Popov, Christian Ufrecht, Axel Plinge, Christopher Mutschler, Daniel D. Scherer
- Reason: Variational quantum policy iteration is an innovative approach linking quantum computing with reinforcement learning to solve complex decision-making problems. The prospect of quantum advantage underscores the significance of this research.
- 8.4 Settling Constant Regrets in Linear Markov Decision Processes
- Authors: Weitong Zhang, Zhiyuan Fan, Jiafan He, Quanquan Gu
- Reason: Introduces a novel algorithm that achieves constant regret in RL with linear function approximation while remaining robust to model misspecification (the regret notion is recalled after this list).
- 8.3 EyeFormer: Predicting Personalized Scanpaths with Transformer-Guided Reinforcement Learning
- Authors: Yue Jiang, Zixin Guo, Hamed Rezazadegan Tavakoli, Luis A. Leiva, Antti Oulasvirta
- Reason: EyeFormer's novel use of transformers within a reinforcement learning framework to predict individual visual scanpaths could have a significant practical impact on user interface design and personalized content delivery.
- 8.2 TENG: Time-Evolving Natural Gradient for Solving PDEs with Deep Neural Net
- Authors: Zhuo Chen, Jacob McCarran, Esteban Vizcaino, Marin Soljačić, Di Luo
- Reason: Develops an advanced optimization method for neural-network-based PDE solvers, which could indirectly influence RL applications in control tasks involving physical simulations.
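
As a rough illustration of the generative-replay pattern behind the continual offline RL entry above (not the paper's diffusion-based method), the sketch below mixes synthetic samples from a generator fitted to past-task data with real data from the current task. The `GaussianGenerator`, the packed `(features, target)` data layout, and the least-squares learner are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import numpy as np

class GaussianGenerator:
    """Toy stand-in for a generative model (the paper uses diffusion models):
    fits a diagonal Gaussian to past-task data and samples pseudo-data from it."""
    def fit(self, x):
        self.mu, self.sigma = x.mean(axis=0), x.std(axis=0) + 1e-8
        return self

    def sample(self, n, rng):
        return rng.normal(self.mu, self.sigma, size=(n, self.mu.shape[0]))

def continual_update(weights, new_task_data, generator, rng,
                     replay_ratio=0.5, lr=1e-2):
    """One continual-learning step: train on a mix of real current-task data
    and generated replay of earlier tasks to mitigate catastrophic forgetting."""
    if generator is not None:
        replay = generator.sample(int(replay_ratio * len(new_task_data)), rng)
    else:
        replay = np.empty((0, new_task_data.shape[1]))
    batch = np.vstack([new_task_data, replay])
    # Toy objective: linear least-squares; each row packs 3 features + 1 target.
    x, y = batch[:, :-1], batch[:, -1]
    grad = x.T @ (x @ weights - y) / len(batch)
    return weights - lr * grad

rng = np.random.default_rng(0)
weights = np.zeros(3)
task1 = rng.normal(size=(256, 4))
for _ in range(100):
    weights = continual_update(weights, task1, None, rng)
gen = GaussianGenerator().fit(task1)            # remember task 1 generatively
task2 = rng.normal(loc=1.0, size=(256, 4))
for _ in range(100):
    weights = continual_update(weights, task2, gen, rng)  # replay task 1 while learning task 2
```

The design point is that no raw data from task 1 is stored; only the generator persists, and its samples stand in for the old experience during later updates.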
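For the linear-MDP entry above, the standard notion of cumulative regret that "constant regret" refers to can be recalled as follows; this is the textbook definition, not a claim about the paper's exact setting.

```latex
% Cumulative regret over K episodes starting from state s_1, where V^* is the
% optimal value function and pi_k is the policy played in episode k:
\[
\mathrm{Regret}(K) \;=\; \sum_{k=1}^{K} \Bigl( V^{*}(s_1) - V^{\pi_k}(s_1) \Bigr).
\]
% "Constant regret" means Regret(K) = O(1): bounded by a constant independent
% of K. In a linear MDP, transitions and rewards are assumed linear in a known
% feature map phi(s, a):
\[
P(s' \mid s, a) = \bigl\langle \phi(s,a), \mu(s') \bigr\rangle,
\qquad
r(s,a) = \bigl\langle \phi(s,a), \theta \bigr\rangle .
\]
```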