9.8 Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning
- Authors: Jiayu Chen, Zelai Xu, Yunfei Li, Chao Yu, Jiaming Song, Huazhong Yang, Fei Fang, Yu Wang, Yi Wu
- Reason: The authors are well regarded in reinforcement learning, and the topic is highly relevant to current research trends. The paper provides a novel framework for multi-agent reinforcement learning with promising applications in complex games.
9.7 On Double-Descent in Reinforcement Learning with LSTD and Random Features
- Authors: David Brellmann, Eloïse Berthier, David Filliat, Goran Frehse
- Reason: The authors are reputable, and they present a theoretical analysis of how network size and $l_2$-regularization influence performance in reinforcement learning, identifying crucial factors that can affect it.
9.6 Global Convergence of Policy Gradient Methods in Reinforcement Learning, Games and Control
- Authors: Shicong Cen, Yuejie Chi
- Reason: The paper provides significant theoretical insights on policy gradient methods and introduces a way to reach global convergence efficiently, opening new possibilities for improving the performance of reinforcement learning.
9.5 Model-based Robotic Manipulation Skill Transfer via Differentiable Physics Simulation
- Authors: Yuqi Xiang, Feitong Chen, Qinsi Wang, Yang Gang, Xiang Zhang, Xinghao Zhu, Xingyu Liu, Lin Shao
- Reason: The authors propose a novel framework that has the potential to be highly influential. The framework leverages differentiable physics simulation for efficient transfer of robotic skills.
9.5 Multi-timestep models for Model-based Reinforcement Learning
- Authors: Abdelhakim Benechehab, Giuseppe Paolo, Albert Thomas, Maurizio Filippone, Balázs Kégl
- Reason: The paper proposes a multi-timestep objective for training reinforcement learning models, significantly improving the long-horizon $R^2$ score and the performance of models in real-world applications.
9.4 Surgical Gym: A high-performance GPU-based platform for reinforcement learning with surgical robots
- Authors: Samuel Schmidgall, Axel Krieger, Jason Eshraghian
- Reason: The authors introduce an open-source, high-performance platform for surgical robot learning. Beyond the importance of the application, the authors have a history of relevant contributions.
9.4 Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
- Authors: Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu
- Reason: This paper introduces the perspective of reward consistency to improve the generalizability of offline reinforcement learning models, showing promising results across various benchmarks.
9.3 Beyond Text: A Deep Dive into Large Language Models’ Ability on Understanding Graph Data
- Authors: Yuntong Hu, Zheng Zhang, Liang Zhao
- Reason: The paper features well-respected authors and takes an in-depth look at a cutting-edge topic. It could be highly influential in the fields of large language models and graph-based data analysis.
9.3 Hierarchical Reinforcement Learning for Temporal Pattern Prediction
- Authors: Faith Johnson, Kristin Dana
- Reason: The authors develop a novel model combining deep learning and hierarchical reinforcement learning, demonstrating significant improvements in training speed, stability, and prediction accuracy.
9.2 Self-Confirming Transformer for Locally Consistent Online Adaptation in Multi-Agent Reinforcement Learning
- Authors: Tao Li, Juan Guevara, Xinghong Xie, Quanyan Zhu
- Reason: This paper tackles a critical issue in offline reinforcement learning (RL): the distribution shift between the offline dataset and the online environment. The authors are reputable in the field, and the paper’s topic is at the forefront of current RL research.
9.2 DeepQTest: Testing Autonomous Driving Systems with Reinforcement Learning and Real-world Weather Data
- Authors: Chengjie Lu, Tao Yue, Man Zhang, Shaukat Ali
- Reason: This paper presents the DeepQTest framework, which leverages reinforcement learning to generate realistic test scenarios for autonomous driving systems and demonstrates improved effectiveness over established baselines.
9.1 FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility
- Authors: Lang Feng, Dong Xing, Junru Zhang, Gang Pan
- Reason: FP3O is a versatile multi-agent PPO algorithm for cooperative MARL, and the authors provide a solid theoretical foundation for policy improvement. This paper could have a significant impact on the field of multi-agent cooperation and policy optimization.
9.1 Distributional Reinforcement Learning with Online Risk-awareness Adaption
- Authors: Yupeng Wu, Wenjie Huang
- Reason: This work puts forward a novel framework, DRL-ORA, which adapts the risk level dynamically and demonstrates superior performance over methods that use a static or manually predetermined risk level.
9.1 Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning
- Authors: Trevor McInroe, Stefano V. Albrecht, Amos Storkey
- Reason: The authors introduce a novel algorithm that makes the most of a limited interaction budget in reinforcement learning and demonstrate their approach on complex control tasks.
9.0 Deep Model Predictive Optimization
- Authors: Jacob Sacks, Rwik Rana, Kevin Huang, Alex Spitzer, Guanya Shi, Byron Boots
- Reason: This work aims to design robust policies for complex and agile behaviors in robotics. The authors are credible, and the innovative methods and resulting improvements make this paper potentially influential in robotics and machine learning.
9.0 TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting
- Authors: Defu Cao, Furong Jia, Sercan O Arik, Tomas Pfister, Yixiang Zheng, Wen Ye, Yan Liu
- Reason: The authors propose a new framework for time series modeling that leverages Generative Pre-trained Transformer (GPT) models. This could be a significant step forward in time series forecasting.
9.0 DSAC-T: Distributional Soft Actor-Critic with Three Refinements
- Authors: Jingliang Duan, Wenxuan Wang, Liming Xiao, Jiaxin Gao, Shengbo Eben Li
- Reason: The authors propose three refinements to Distributional Soft Actor-Critic (DSAC) that deliver superior performance across varying reward scales and ensure a highly stable learning process.
8.9 Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks
- Authors: Andrew Starnes, Anton Dereventsov, Clayton Webster
- Reason: The paper proposes entropy regularization to improve the diversity of actions taken in reinforcement learning tasks and shows significantly improved performance across various personalization tasks.
8.8 Offline Imitation Learning with Variational Counterfactual Reasoning
- Authors: Bowei He, Zexu Sun, Jinxin Liu, Shuai Zhang, Xu Chen, Chen Ma
- Reason: The ability to learn from offline data is important in reinforcement learning applications. This paper tackles an important issue in offline imitation learning and provides a viable solution.
8.6 Optimal Sequential Decision-Making in Geosteering: A Reinforcement Learning Approach
- Authors: Ressi Bonti Muhammad, Sergey Alyaev, Reidar Brumer Bratvold
- Reason: This work brings a novel perspective by applying reinforcement learning to geosteering and shows relevant results. Given its novelty and relevance to the industry, this paper may prove important in future studies.