- 9.5 Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation
- Authors: Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang
- Reason: Proposes a novel algorithm achieving horizon-free and instance-dependent regret bounds, with strong theoretical grounding in minimax optimality and computational efficiency demonstrated through comprehensive experiments.
- 9.1 Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization
- Authors: Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters
- Reason: Addresses uncertainty quantification in model-based RL with a novel uncertainty Bellman equation (UBE), contributing to more efficient exploration and improved performance, backed by solid experiments and coming from established authors in the field.
- 8.9 Pearl: A Production-ready Reinforcement Learning Agent
- Authors: Zheqing Zhu, Rodrigo de Salvo Braz, Jalaj Bhandari, Daniel Jiang, Yi Wan, Yonathan Efroni, Liyuan Wang, Ruiyang Xu, Hongbo Guo, Alex Nikulkov, Dmytro Korenkevych, Urun Dogan, Frank Cheng, Zheng Wu, Wanqiao Xu
- Reason: Contributes a complete, production-ready RL system with demonstrated industry adoption, from an authoritative author pool with real-world application experience.
- 8.7 Generalization to New Sequential Decision Making Tasks with In-Context Learning
- Authors: Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu
- Reason: Tackles the key challenge of task generalization in RL with a novel transformer-based approach, providing insights into the design choices that most influence in-context learning.
- 8.7 A Scalable Network-Aware Multi-Agent Reinforcement Learning Framework for Decentralized Inverter-based Voltage Control
- Authors: Han Xu, Jialin Zheng, Guannan Qu
- Reason: Addresses scalability and communication issues for voltage control in power grids using MARL, presenting a framework with practical implications and a provable approximation guarantee.
- 8.5 Similarity-based Knowledge Transfer for Cross-Domain Reinforcement Learning
- Authors: Sergio A. Serrano, Jose Martinez-Carranza, L. Enrique Sucar
- Reason: Addresses the challenging problem of knowledge transfer across different domains, which can significantly speed up RL training, and presents a novel method that does not require aligned datasets.
- 8.3 MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator
- Authors: Xiao-Yin Liu, Xiao-Hu Zhou, Guo-Tao Li, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou
- Reason: Proposes a novel algorithm that balances robustness and performance in offline RL scenarios; likely to influence further research on making RL algorithms more practical for real-world applications.
- 8.3 Using Large Language Models for Hyperparameter Optimization
- Authors: Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba
- Reason: Innovative use of large language models for HPO, demonstrating performance comparable or superior to traditional methods in empirical evaluations, along with a novel approach that treats code itself as a hyperparameter.
- 8.1 FoMo Rewards: Can we cast foundation models as reward functions?
- Authors: Ekdeep Singh Lubana, Johann Brehmer, Pim de Haan, Taco Cohen
- Reason: Investigates an innovative approach to using foundation models as generic reward functions in RL, bridging the gap between high-capacity models and interactive task learning.
- 7.9 Deep Dynamics: Vehicle Dynamics Modeling with a Physics-Informed Neural Network for Autonomous Racing
- Authors: John Chrosniak, Jingyun Ning, Madhur Behl
- Reason: Introduces a physics-informed neural network for vehicle dynamics modeling in the specialized yet challenging domain of autonomous racing, with potential for significant impact on the field.