- 9.4 Stable Online and Offline Reinforcement Learning for Antibody CDRH3 Design
- Authors: Yannick Vogt, Mehdi Naouar, Maria Kalweit, Christoph Cornelius Miething, Justus Duyster, Roland Mertelsmann, Gabriel Kalweit, Joschka Boedecker
- Reason: The paper introduces a novel reinforcement learning method in the critical domain of antibody-based therapeutics, with direct relevance to personalized cancer therapies. This intersection of RL with biotechnology, a high-impact field, combined with claims of outperforming existing methods, suggests high potential influence.
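  As a rough illustration (not the paper's method), CDRH3 design can be framed as a sequential decision problem: a policy appends one amino acid at a time and a learned affinity predictor supplies the terminal reward. Everything below, including the stand-in oracle and loop length, is a hypothetical placeholder:

  ```python
  import random

  AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

  class CDRH3Env:
      """Toy sequence-design environment: build a CDRH3 loop one residue
      at a time; a stand-in oracle scores the finished sequence."""

      def __init__(self, length=12, oracle=None):
          self.length = length
          # placeholder for a learned binding-affinity predictor
          self.oracle = oracle or (lambda seq: -abs(seq.count("Y") - 3))
          self.seq = ""

      def reset(self):
          self.seq = ""
          return self.seq

      def step(self, action):
          self.seq += AMINO_ACIDS[action]
          done = len(self.seq) == self.length
          reward = self.oracle(self.seq) if done else 0.0  # terminal reward only
          return self.seq, reward, done

  env = CDRH3Env()
  state, done = env.reset(), False
  while not done:
      state, reward, done = env.step(random.randrange(len(AMINO_ACIDS)))
  print(state, reward)
  ```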
- 9.1 Safe reinforcement learning in uncertain contexts
- Authors: Dominik Baumann, Thomas B. Schön
- Reason: The accepted final version is to appear in IEEE Transactions on Robotics, a high-impact journal, which signals influential work. The paper addresses safety in machine learning applications, which is fundamental to deploying RL in real-world scenarios.
- 9.0 Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments
- Authors: Antoine Dedieu, Wolfgang Lehrach, Guangyao Zhou, Dileep George, Miguel Lázaro-Gredilla
- Reason: Includes authors with a strong track record in the field (Dileep George, co-founder of the AI company Vicarious) and addresses the integration of transformer representations with cognitive maps for planning in RL, a novel and potentially high-impact approach.
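  As a loose sketch of the general idea (the paper's architecture is learned end-to-end, which this is not), one can quantize a transformer's per-step representations into discrete map nodes, record action-labeled transitions between them, and plan over the resulting graph. The embeddings and quantizer below are random placeholders:

  ```python
  import numpy as np
  from collections import defaultdict, deque

  rng = np.random.default_rng(0)
  embeddings = rng.normal(size=(100, 16))  # stand-in for per-step transformer states
  actions = rng.integers(0, 4, size=99)    # action taken between steps

  # Crude nearest-centroid quantizer mapping representations to map nodes.
  centroids = embeddings[rng.choice(100, size=8, replace=False)]
  nodes = np.argmin(((embeddings[:, None] - centroids) ** 2).sum(-1), axis=1)

  # Build a node --action--> node transition graph from the trajectory.
  graph = defaultdict(set)
  for t in range(99):
      graph[nodes[t]].add((actions[t], nodes[t + 1]))

  def plan(start, goal):
      """Breadth-first search over the learned map for an action plan."""
      frontier, seen = deque([(start, [])]), {start}
      while frontier:
          node, path = frontier.popleft()
          if node == goal:
              return path
          for action, nxt in graph[node]:
              if nxt not in seen:
                  seen.add(nxt)
                  frontier.append((nxt, path + [action]))
      return None

  print(plan(nodes[0], nodes[-1]))
  ```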
- 8.9 RFRL Gym: A Reinforcement Learning Testbed for Cognitive Radio Applications
- Authors: Daniel Rosen, Illa Rochez, Caleb McIrvin, Joshua Lee, Kevin D’Alessandro, Max Wiecek, Nhan Hoang, Ramzy Saffarini, Sam Philips, Vanessa Jones, Will Ivey, Zavier Harris-Smart, Zavion Harris-Smart, Zayden Chin, Amos Johnson, Alyse M. Jones, William C. Headley
- Reason: Provides a simulation environment that could significantly accelerate research on reinforcement learning techniques for wireless communication, which is crucial for emerging technologies like 6G. The decision to open-source the codebase will likely amplify its impact in the research community.
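  A usage sketch in the standard Gym interaction pattern; the actual RFRL Gym environment IDs, observation contents, and reward definitions may differ from the placeholders below, so consult the open-sourced repository for the real API:

  ```python
  import gym  # or gymnasium, depending on the testbed's dependency

  env = gym.make("rfrl-gym-v0")  # hypothetical environment ID
  obs = env.reset()
  done, total_reward = False, 0.0
  while not done:
      action = env.action_space.sample()  # e.g., pick a frequency channel
      obs, reward, done, info = env.step(action)
      total_reward += reward  # might reward interference-free spectrum use
  print(total_reward)
  ```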
- 8.9 Optimistic Model Rollouts for Pessimistic Offline Policy Optimization
- Authors: Yuanzhao Zhai, Yiying Li, Zijian Gao, Xudong Gong, Kele Xu, Dawei Feng, Ding Bo, Huaimin Wang
- Reason: Proposes ORPO, a new method that pairs optimistic model rollouts with pessimistic offline policy optimization and outperforms P-MDP (pessimistic MDP) baselines by a significant margin, suggesting high potential impact on model-based offline RL.
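  A toy sketch of the rollout contrast, assuming ensemble disagreement as the uncertainty signal; the dynamics models, reward, and policy are fabricated for illustration, and the algorithm's actual pessimistic policy-optimization step is omitted:

  ```python
  import numpy as np

  rng = np.random.default_rng(1)

  # Toy ensemble of linear dynamics models; disagreement across members
  # serves as an uncertainty estimate for model-generated transitions.
  models = [np.eye(2, 4) + 0.1 * rng.normal(size=(2, 4)) for _ in range(5)]

  def rollout(state, policy, optimistic, horizon=5, beta=1.0):
      """Optimistic rollouts *add* the uncertainty bonus (steering the
      rollout policy toward informative states); pessimistic training
      *subtracts* it, as in penalty-based offline RL."""
      transitions = []
      for _ in range(horizon):
          action = policy(state)
          preds = [m @ np.concatenate([state, action]) for m in models]
          next_state = np.mean(preds, axis=0)
          u = np.std(preds, axis=0).sum()       # ensemble disagreement
          reward = -np.linalg.norm(next_state)  # toy reward model
          shaped = reward + (beta * u if optimistic else -beta * u)
          transitions.append((state, action, shaped, next_state))
          state = next_state
      return transitions

  policy = lambda s: -0.1 * s  # toy deterministic policy
  start = rng.normal(size=2)
  optimistic_data = rollout(start, policy, optimistic=True)
  pessimistic_data = rollout(start, policy, optimistic=False)
  ```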
- 8.7 Spatial-Aware Deep Reinforcement Learning for the Traveling Officer Problem
- Authors: Niklas Strauß, Matthias Schubert
- Reason: Presented at the SIAM SDM conference, this work tackles the traveling officer problem (TOP) with a spatial-aware approach and shows considerable improvements over existing methods, indicating its potential impact on spatially structured RL tasks.
- 8.6 Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning
- Authors: Ding Chen, Peixi Peng, Tiejun Huang, Yonghong Tian
- Reason: Addresses the challenge of building energy-efficient artificial intelligence through spiking neural networks, which is critical for sustainable AI development. The approach has the potential to reduce the energy footprint of AI, with broader implications for real-world control tasks that demand energy efficiency.
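  A minimal rate-coded sketch of a leaky integrate-and-fire layer with intra-layer (lateral) connections; the weights here are random, whereas the paper trains a full actor network for control:

  ```python
  import numpy as np

  def lif_layer(inputs, w_in, w_lat, steps=16, tau=0.9, threshold=1.0):
      """Leaky integrate-and-fire layer: membrane potentials leak,
      integrate feedforward plus lateral (intra-layer) currents, and
      emit a spike on crossing the threshold. Returns firing rates."""
      v = np.zeros(w_in.shape[0])  # membrane potentials
      prev_spikes = np.zeros_like(v)
      counts = np.zeros_like(v)
      for _ in range(steps):
          v = tau * v + w_in @ inputs + w_lat @ prev_spikes
          fired = (v >= threshold).astype(float)
          v = np.where(fired > 0, 0.0, v)  # reset neurons that spiked
          prev_spikes = fired
          counts += fired
      return counts / steps  # rate code in [0, 1]

  rng = np.random.default_rng(0)
  obs = rng.random(8)                      # e.g., a control observation
  w_in = 0.5 * rng.normal(size=(32, 8))    # feedforward weights
  w_lat = 0.1 * rng.normal(size=(32, 32))  # intra-layer connections
  print(lif_layer(obs, w_in, w_lat))
  ```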
- 8.6 Bounds on the price of feedback for mistake-bounded online learning
- Authors: Jesse Geneson, Linus Tang
- Reason: Although the primary focus is mistake-bounded online learning, the paper also improves bounds for delayed reinforcement learning and might indirectly influence future RL research. The authors have contributed to the theoretical understanding of learning algorithms.
- 8.3 Towards Safe Load Balancing based on Control Barrier Functions and Deep Reinforcement Learning
- Authors: Lam Dinh, Pham Tran Anh Quang, Jérémie Leguay
- Reason: Combines DRL with Control Barrier Functions (CBFs) to ensure safe and reliable network performance in SD-WAN. Integrating safety guarantees into DRL widens its applicability to commercial solutions where reliability is critical, suggesting strong industrial relevance.
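  A one-dimensional sketch of the safety-filter idea, with a barrier h(load) = capacity - load and a minimal-modification rule for the RL action; the dynamics and constants are illustrative, and a real deployment would typically solve a QP over several constraints:

  ```python
  import numpy as np

  def cbf_filter(load, action, capacity=1.0, alpha=0.5):
      """Enforce the discrete-time barrier condition
      h(load') >= (1 - alpha) * h(load) with h = capacity - load.
      Since load' = load + action, the action is capped at alpha * h."""
      h = capacity - load
      return min(action, alpha * h)  # minimally modify the RL action

  rng = np.random.default_rng(0)
  load = 0.7
  for _ in range(5):
      rl_action = rng.normal(0.0, 0.3)  # raw action from a DRL policy
      load = max(0.0, load + cbf_filter(load, rl_action))
      print(f"load = {load:.3f}")       # stays below capacity by construction
  ```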
- 8.1 Graph Q-Learning for Combinatorial Optimization
- Authors: Victoria M. Dax, Jiachen Li, Kevin Leahy, Mykel J. Kochenderfer
- Reason: Introduces a GNN-based approach to combinatorial optimization, a field with broad applications. Promising preliminary results indicate that this method might compete with state-of-the-art heuristic-based solvers. If scalable, it could impact numerous industries that rely on optimization.
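  A bare-bones sketch of the idea: one round of message passing produces node embeddings, and a linear readout scores each node as the Q-value of a "pick this node" action. The graph, features, and weights below are random placeholders, not the paper's architecture:

  ```python
  import numpy as np

  rng = np.random.default_rng(0)

  # Toy problem instance: adjacency matrix and initial node features.
  A = np.array([[0, 1, 1, 0],
                [1, 0, 1, 1],
                [1, 1, 0, 1],
                [0, 1, 1, 0]], dtype=float)
  X = rng.normal(size=(4, 8))

  # One round of mean-aggregation message passing plus a linear readout.
  W_msg = 0.1 * rng.normal(size=(8, 8))
  w_out = 0.1 * rng.normal(size=8)

  deg = A.sum(axis=1, keepdims=True)        # node degrees
  H = np.tanh(((A @ X) / deg) @ W_msg + X)  # aggregate neighbors + self
  q_values = H @ w_out                      # one Q-value per node/action

  # A greedy step of graph Q-learning: pick the highest-scoring node.
  print("chosen node:", int(np.argmax(q_values)))
  ```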