- 9.4 Probabilistic Counterexample Guidance for Safer Reinforcement Learning
- Authors: Xiaotong Ji, Antonio Filieri
- Reason: This paper introduces a novel approach to safer exploration in reinforcement learning, guiding training with probabilistic counterexamples to the safety requirement. Initial experiments demonstrate that the method reduces safety violations during training. The paper has also been peer-reviewed and accepted at an international conference, indicating the relevance of the research. A rough sketch of the counterexample-guided idea follows below.
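
The following is a minimal, hypothetical sketch of counterexample-guided exploration: trajectories that end in a violation are stored, and the exploration policy avoids actions that would extend a known violating prefix. All names (`CounterexampleStore`, `guided_action`) and the set-based prefix matching are illustrative assumptions, not the paper's actual algorithm, which works with probabilistic counterexamples.

```python
import random

class CounterexampleStore:
    """Stores (state, action) prefixes observed to end in safety violations.
    Illustrative only: the paper's probabilistic counterexamples carry
    likelihood information that this simple set-based store does not."""
    def __init__(self):
        self.unsafe_prefixes = set()

    def add_violation(self, trajectory):
        # Record every prefix of a violating trajectory as evidence.
        for i in range(1, len(trajectory) + 1):
            self.unsafe_prefixes.add(tuple(trajectory[:i]))

    def extends_counterexample(self, prefix):
        return tuple(prefix) in self.unsafe_prefixes

def guided_action(Q, state, prefix, store, actions, eps=0.1):
    """Epsilon-greedy choice over a dict Q[(state, action)], avoiding
    actions that would extend a stored counterexample prefix."""
    safe = [a for a in actions
            if not store.extends_counterexample(prefix + [(state, a)])]
    candidates = safe or actions  # fall back if every action looks unsafe
    if random.random() < eps:
        return random.choice(candidates)
    return max(candidates, key=lambda a: Q[(state, a)])
```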
- 9.2 Empowering recommender systems using automatically generated Knowledge Graphs and Reinforcement Learning
- Authors: Ghanshyam Verma, Shovon Sengupta, Simon Simanta, Huan Chen, Janos A. Perge, Devishree Pillai, John P. McCrae, Paul Buitelaar
- Reason: The paper combines automatically generated knowledge graphs with reinforcement learning to personalize the delivery of financial articles to customers. Pairing the two techniques suggests greater sophistication and potential impact than either alone. Moreover, the paper was accepted at KDD, a major data mining conference. A simplified sketch of such a recommendation loop follows below.
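
As a loose illustration of RL-driven recommendation over graph-derived features, here is a standard LinUCB contextual bandit choosing among articles represented by knowledge-graph embeddings. The embedding pipeline and the interface are assumptions; the paper's actual method may differ substantially.

```python
import numpy as np

class LinUCB:
    """Standard linear UCB contextual bandit (a generic algorithm, not the
    paper's method). Articles are represented by KG-derived feature vectors."""
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)     # regularized Gram matrix of observed features
        self.b = np.zeros(dim)   # reward-weighted sum of observed features
        self.alpha = alpha       # exploration strength

    def select(self, article_features):
        """article_features: dict article_id -> embedding vector (np.ndarray)."""
        theta = np.linalg.solve(self.A, self.b)  # ridge-regression estimate
        def ucb(x):
            bonus = self.alpha * np.sqrt(x @ np.linalg.solve(self.A, x))
            return x @ theta + bonus
        return max(article_features, key=lambda aid: ucb(article_features[aid]))

    def update(self, x, reward):
        # Incorporate observed feedback (e.g., a click) for the shown article.
        self.A += np.outer(x, x)
        self.b += reward * x
```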
- 9.0 Measuring and Mitigating Interference in Reinforcement Learning
- Authors: Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White
- Reason: The paper introduces a novel measure of interference in value-based reinforcement learning and proposes a procedure to mitigate it. This measure can help researchers navigate the prevalent problem of catastrophic interference in neural-network-based learning systems. One generic way to formulate such a measure is sketched below.
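
For intuition, here is one generic way to quantify interference, assuming a PyTorch Q-network: measure how a gradient step on one batch changes the squared TD errors on held-out transitions. This is a plausible formulation for illustration, not necessarily the paper's exact measure.

```python
import torch

def interference(q_net, optimizer, update_batch, heldout_batch, gamma=0.99):
    """Batches are tuples of tensors (s, a, r, s2, done), with `a` a
    LongTensor of action indices and `done` a 0/1 float mask."""
    def sq_td_errors(batch):
        s, a, r, s2, done = batch
        with torch.no_grad():
            target = r + gamma * (1.0 - done) * q_net(s2).max(dim=1).values
        pred = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        return (pred - target) ** 2

    with torch.no_grad():
        before = sq_td_errors(heldout_batch)
    loss = sq_td_errors(update_batch).mean()  # one gradient step elsewhere
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        after = sq_td_errors(heldout_batch)
    # Positive mean change: the update degraded estimates on held-out data.
    return (after - before).mean().item()
```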
- 8.8 Dynamics of Temporal Difference Reinforcement Learning
- Authors: Blake Bordelon, Paul Masset, Henry Kuo, Cengiz Pehlevan
- Reason: This paper provides theoretical insights into the learning dynamics of temporal difference methods and the parameters that control them. Its emphasis on theory could deepen the field's understanding of why and when these algorithms converge. The standard mean-dynamics starting point for analyses of this kind is sketched below.
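
As background (standard material on linear TD(0), not the paper's specific derivation), the mean dynamics of temporal difference learning can be written as a linear ODE whose eigenvalues govern learning speed and stability:

```latex
% Linear TD(0) with features \phi(s) and estimate V_\theta(s) = \theta^\top \phi(s):
%   \theta \leftarrow \theta + \alpha\,\delta\,\phi(s), \qquad
%   \delta = r + \gamma\,\theta^\top \phi(s') - \theta^\top \phi(s).
% Averaging the update over the stationary state distribution gives
\dot{\theta} \;=\; \mathbb{E}\big[\, r\,\phi(s) \,\big]
  \;-\; \mathbb{E}\big[\, \phi(s)\,\bigl(\phi(s) - \gamma\,\phi(s')\bigr)^{\top} \,\big]\,\theta
```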
- 8.6 Reinforcement Learning with Non-Cumulative Objective
- Authors: Wei Cui, Wei Yu
- Reason: This paper addresses reinforcement learning problems whose objectives are not naturally expressed as a sum of rewards, such as maximizing the single best reward along a trajectory. By modifying existing algorithms to handle such objectives and demonstrating their effectiveness, the paper could encourage a more nuanced treatment of these problems. A toy version of one such modified backup is sketched below.
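
To make the idea concrete, here is a toy tabular Q-learning variant where the backup composes immediate reward and future value with max() instead of a discounted sum, matching a maximum-reward-along-the-trajectory objective. The environment interface (`env.reset`, `env.step`, `env.actions`) is hypothetical, and this is a generic illustration of swapping the cumulative operator, not the paper's exact algorithm.

```python
import random
from collections import defaultdict

def q_learning_max_objective(env, episodes=500, alpha=0.1, eps=0.1):
    """Q[(s, a)] estimates the best single reward reachable from (s, a)."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = (random.choice(env.actions) if random.random() < eps
                 else max(env.actions, key=lambda a: Q[(s, a)]))
            s2, r, done = env.step(a)
            best_next = 0.0 if done else max(Q[(s2, a2)] for a2 in env.actions)
            target = max(r, best_next)  # max-composition, not r + gamma * next
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q
```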