8.9 Policy Optimization finds Nash Equilibrium in Regularized General-Sum LQ Games
- Authors: Muhammad Aneeq uz Zaman, Shubham Aggarwal, Melih Bastopcu, Tamer Başar
- Reason: The authors propose a novel policy optimization method with proven convergence to the Nash equilibrium in regularized games, which could influence reinforcement learning strategies in multi-agent systems.
8.9 RL-MUL: Multiplier Design Optimization with Deep Reinforcement Learning
- Authors: Dongsheng Zuo, Jiadong Zhu, Yikang Ouyang, Yuzhe Ma
- Reason: The authors appear to address a problem of practical and broad impact in computational hardware design, using RL for optimization. The paper extends work presented at a reputable conference (DAC 2023).
8.7 Efficient Automatic Tuning for Data-driven Model Predictive Control via Meta-Learning
- Authors: Baoyu Li, William Edwards, Kris Hauser
- Reason: The paper presents an innovative application of meta-learning to improve the efficiency of automatic tuning in model predictive control, which is likely to have significant impact on real-world system optimization and control tasks.
8.7 Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
- Authors: Srinjoy Roy, Swagatam Das
- Reason: The paper tackles the fundamental issue of uncertainty in RL, which is closely tied to the exploration-exploitation tradeoff, and demonstrates competitive performance on Atari games. Because these benchmarks are well known within the RL community, the results suggest potential for significant influence.
8.6 Solving the QAP by Two-Stage Graph Pointer Networks and Reinforcement Learning
- Authors: Satoko Iida, Ryota Yasudo
- Reason: The paper addresses the NP-hard QAP, a core challenge in optimization, and adapts a model from the Euclidean TSP domain, showing versatility and potential influence on solving other combinatorial optimization problems with RL.
8.5 Multiple-policy Evaluation via Density Estimation
- Authors: Yilei Chen, Aldo Pacchiano, Ioannis Ch. Paschalidis
- Reason: This work addresses the complex problem of evaluating multiple policies in reinforcement learning and proposes a novel algorithm, which could prove important for multi-policy decision-making.
8.4 Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance
- Authors: Thomas Nakken Larsen, Eirik Runde Barlaug, Adil Rasheed
- Reason: The paper integrates deep RL with variational autoencoders for a specific application in maritime control systems, which suggests direct influence on autonomous marine applications.
8.3 Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods
- Authors: Yuji Cao, Huan Zhao, Yuheng Cheng, Ting Shu, Guolong Liu, Gaoqi Liang, Junhua Zhao, Yun Li
- Reason: This comprehensive survey could guide future research in the promising area of integrating large language models with reinforcement learning, impacting various domains from robotics to multi-task learning.
8.2 Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration
- Authors: Yibo Wang, Jiang Zhao
- Reason: Exploration is a challenging aspect of RL, and the authors present a method that addresses it while improving sample efficiency. The paper shows potential because it deals with online planning and model uncertainty for continuous control, both pivotal areas in modern RL research.
8.1 Exploring Adaptive MCTS with TD Learning in miniXCOM
- Authors: Kimiya Saadat, Richard Zhao
- Reason: The paper presents an adaptive approach to Monte Carlo Tree Search augmented with temporal difference learning and applied to the miniXCOM game environment, which may influence game AI and general reinforcement learning methodologies.