- 9.5 Fictitious Cross-Play: Learning Global Nash Equilibrium in Mixed Cooperative-Competitive Games
- Authors: Zelai Xu, Yancheng Liang, Chao Yu, Yu Wang, Yi Wu
- Reason: This paper proposes a novel algorithm for mixed cooperative-competitive games, including a new framework for learning Nash equilibria, which has significant implications for multi-agent reinforcement learning.
- 9.4 RUSOpt: Robotic UltraSound Probe Normalization with Bayesian Optimization for In-plane and Out-plane Scanning
- Authors: Deepak Raina, Abhishek Mathur, Richard M. Voyles, Juan Wachs, SH Chandrashekhara, Subir Kumar Saha
- Reason: This paper introduces a method for improving the orientation of robotic ultrasound probes to enhance image-acquisition quality, which is highly significant in the field of medical robotics.
- 9.3 Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms
- Authors: Akifumi Wachi, Wataru Hashimoto, Xun Shen, Kazumune Hashimoto
- Reason: This paper tackles a critical issue in real-world RL applications: safe exploration. It addresses the problem through a generalized formulation and proposes a meta-algorithm that explores safely while optimizing the policy, supported by test cases and comparisons with existing algorithms. Given the breadth of RL applications and the importance of safety, this paper could have a significant impact.
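The core idea behind safe exploration can be illustrated with a minimal, generic sketch (not the paper's meta-algorithm): restrict action selection to actions whose estimated safety clears a threshold, and fall back to a known-safe action when nothing qualifies. The function name and threshold below are illustrative assumptions.

```python
import random

def safe_action(q_values, safety_estimates, safe_fallback, threshold=0.9, epsilon=0.1):
    """Epsilon-greedy action selection restricted to the estimated-safe set.

    q_values:         dict mapping action -> estimated value
    safety_estimates: dict mapping action -> estimated probability of staying safe
    safe_fallback:    a known-safe action used when no candidate clears the threshold
    """
    # Keep only actions believed safe enough.
    candidates = [a for a, s in safety_estimates.items() if s >= threshold]
    if not candidates:
        return safe_fallback
    if random.random() < epsilon:
        return random.choice(candidates)  # explore, but only within the safe set
    return max(candidates, key=lambda a: q_values[a])  # exploit within the safe set
```

Real safe-exploration methods learn the safety estimates online (e.g. with confidence bounds); this sketch assumes they are given.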
- 9.3 LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework
- Authors: Woojun Kim, Jeonghye Kim, Youngchul Sung
- Reason: This paper offers an innovative framework for reinforcement learning that can adaptively select the most effective exploration strategy, which is particularly useful for tackling the exploration-exploitation dilemma in RL.
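To give a flavor of adaptive strategy selection, here is a simplified sketch in which a higher-level UCB bandit chooses which exploration strategy to run each episode based on returns observed so far. This is an illustrative assumption, not the paper's method: LESSON learns the selection within an option framework rather than with a bandit.

```python
import math

class StrategySelector:
    """UCB bandit over a fixed set of exploration strategies (illustrative sketch)."""

    def __init__(self, strategies):
        self.strategies = list(strategies)
        self.counts = {s: 0 for s in self.strategies}   # times each strategy was run
        self.means = {s: 0.0 for s in self.strategies}  # running-average return
        self.total = 0

    def select(self):
        # Try each strategy once, then pick by UCB score:
        # mean return + exploration bonus that shrinks with use.
        for s in self.strategies:
            if self.counts[s] == 0:
                return s
        return max(
            self.strategies,
            key=lambda s: self.means[s]
            + math.sqrt(2 * math.log(self.total) / self.counts[s]),
        )

    def update(self, strategy, episode_return):
        # Incremental running-mean update after an episode finishes.
        self.total += 1
        self.counts[strategy] += 1
        n = self.counts[strategy]
        self.means[strategy] += (episode_return - self.means[strategy]) / n
```

In the paper's setting the chosen strategy would drive action selection for the duration of an option, and the selection policy itself is learned with RL.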
- 9.2 How the level sampling process impacts zero-shot generalisation in deep reinforcement learning
- Authors: Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht
- Reason: This paper involves a detailed investigation into how level sampling influences the generalization ability of RL agents, addressing a key issue in the operability of RL models.
- 9.1 Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
- Authors: Yihang Yao, Zuxin Liu, Zhepeng Cen, Jiacheng Zhu, Wenhao Yu, Tingnan Zhang, Ding Zhao
- Reason: By introducing an approach for versatile safe reinforcement learning, this paper makes a significant contribution, improving the safety and performance of RL agents across varying constraint thresholds in a data-efficient manner.
- 8.9 Deep reinforcement learning for machine scheduling: Methodology, the state-of-the-art, and future directions
- Authors: Maziyar Khadivi, Todd Charter, Marjan Yaghoubi, Masoud Jalayer, Maryam Ahang, Ardeshir Shojaeinasab, Homayoun Najjaran
- Reason: The paper covers an important combinatorial problem, machine scheduling, which is crucial in manufacturing. Applying DRL to this problem has significant potential benefits, and this comprehensive review, which compares DRL-based approaches and discusses future directions, could be highly influential for researchers and practitioners in the field.
- 8.7 A Deep Reinforcement Learning Approach for Interactive Search with Sentence-level Feedback
- Authors: Jianghong Zhou, Joyce C. Ho, Chen Lin, Eugene Agichtein
- Reason: The authors propose a deep Q-learning approach to interactive search, considering sentence-level feedback. As searches often fail to capture user intent accurately, this paper’s focus on fine-grained feedback and natural language processing could have a considerable influence on interactive search systems and RL applications.
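For intuition, the Q-learning backbone of such an approach can be shown with a single tabular update; this is a toy analogue, and the hypothetical states/actions below stand in for the query states and refinement actions the paper would represent with a neural network over query and feedback features.

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    Q is a dict keyed by (state, action); missing entries default to 0.
    """
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q
```

In the interactive-search setting, the reward would come from fine-grained user feedback (e.g. on individual sentences) rather than a scalar hand-coded signal.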
- 8.5 Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
- Authors: Sara Klein, Simon Weissmann, Leif Döring
- Reason: The paper introduces a dynamic policy gradient for Markov Decision Processes, which could advance the field by offering an alternative to the commonly used stationary policies and improving convergence bounds.
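A generic stochastic softmax policy gradient (plain REINFORCE on a two-armed Bernoulli bandit) gives a sense of the object being analyzed; the paper studies a dynamic variant for finite-horizon MDPs, which this single-state sketch does not capture.

```python
import math
import random

def softmax(prefs):
    # Numerically stable softmax over a list of preferences.
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_bandit(rewards=(0.2, 0.8), steps=2000, lr=0.1, seed=0):
    """REINFORCE with a softmax policy on a two-armed Bernoulli bandit.

    rewards: success probability of each arm. Returns final action probabilities.
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # preference parameters, one per arm
    for _ in range(steps):
        probs = softmax(theta)
        a = 0 if rng.random() < probs[0] else 1
        r = 1.0 if rng.random() < rewards[a] else 0.0
        # grad of log pi(a) w.r.t. theta_k is (1[k == a] - probs[k])
        for k in range(2):
            theta[k] += lr * r * ((1.0 if k == a else 0.0) - probs[k])
    return softmax(theta)
```

With these settings the policy concentrates on the higher-reward arm; the convergence rate of exactly this kind of iteration is what the paper's analysis bounds.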
- 8.1 Neural architecture impact on identifying temporally extended Reinforcement Learning tasks
- Authors: Victor Vadakechirayath George
- Reason: The exploration of different attention-based architectures for reinforcement learning on the Atari-2600 game suite could spark further research and development in this area.