- 9.7 Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning
- Authors: Jinyi Liu, Yi Ma, Jianye Hao, Yujing Hu, Yan Zheng, Tangjie Lv, Changjie Fan
- Reason: The paper presents a distinctive approach to offline reinforcement learning (RL), highlighting how data sampling strategies can improve learning efficiency. This is especially relevant given the fast pace of development in RL. A rough sketch of the general idea follows below.
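  As an illustration of the general idea (a minimal sketch, not the paper's exact algorithm), trajectory-level prioritized sampling can be written as follows; the scalar priority passed to `add` is a hypothetical placeholder for whatever metric is used, e.g., trajectory return or accumulated TD error.

  ```python
  import numpy as np

  class PrioritizedTrajectoryReplay:
      """Minimal sketch: store whole trajectories and sample them with
      probability proportional to a scalar priority."""

      def __init__(self, alpha=0.6):
          self.trajectories = []  # each entry is a list of transitions
          self.priorities = []    # one scalar priority per trajectory
          self.alpha = alpha      # how strongly priorities skew sampling

      def add(self, trajectory, priority):
          self.trajectories.append(trajectory)
          self.priorities.append(priority)

      def sample(self, batch_size, rng=np.random):
          p = np.asarray(self.priorities, dtype=float) ** self.alpha
          p /= p.sum()
          idx = rng.choice(len(self.trajectories), size=batch_size, p=p)
          return [self.trajectories[i] for i in idx]
  ```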
- 9.5 Beyond dynamic programming
- Authors: Abhinav Muraleedharan
- Reason: The paper makes significant strides in reinforcement learning by introducing a new theoretical framework, Score-life programming, which offers a novel way to search over non-stationary policy functions, optimize action sequences, and improve the performance of RL algorithms.
- 9.4 Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning
- Authors: Sherly Alfonso-Sánchez, Jesús Solano, Alejandro Correa-Bahnsen, Kristina P. Sendova, Cristián Bravo
- Reason: This work extends the RL framework to the banking sector, a domain that remains largely uncharted for RL. The novelty of the research, combined with its practical applications, could make it significantly influential.
- 9.3 BatchGFN: Generative Flow Networks for Batch Active Learning
- Authors: Shreshth A. Malik, Salem Lahlou, Andrew Jesson, Moksh Jain, Nikolay Malkin, Tristan Deleu, Yoshua Bengio, Yarin Gal
- Reason: A breakthrough in active learning from a strong research team that includes Yoshua Bengio suggests high influence. The proposed method, BatchGFN, can efficiently construct highly informative batches for active learning.
- 9.2 Learning to Sail Dynamic Networks: The MARLIN Reinforcement Learning Framework for Congestion Control in Tactical Environments
- Authors: Raffaele Galliera, Mattia Zaccarini, Alessandro Morelli, Roberto Fronteddu, Filippo Poltronieri, Niranjan Suri, Mauro Tortonesi
- Reason: The authors propose an interesting RL framework that addresses the challenging problem of congestion control in tactical environments. Despite its narrow focus, the results are promising and may prove influential in related areas.
- 9.1 Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression
- Authors: Allan Raventós, Mansheej Paul, Feng Chen, Surya Ganguli
- Reason: The paper provides a significant analysis of in-context learning on linear regression. Its findings on how transformers can solve entirely new tasks, and on the importance of pretraining task diversity, deepen our understanding of transformer models in machine learning. A toy sketch of the regression setting follows below.
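  For context on the setting (a toy sketch under assumed Gaussian task and noise distributions, not the paper's experimental code): in-context linear regression is typically benchmarked against the Bayes-optimal ridge predictor, which a transformer pretrained on a sufficiently diverse task distribution can approach or depart from.

  ```python
  import numpy as np

  def ridge_icl_baseline(xs, ys, x_query, noise_var=0.25, prior_var=1.0):
      """Posterior-mean (ridge) predictor for in-context linear regression.

      Assuming y = w @ x + Gaussian noise with a Gaussian prior on w, the
      Bayes-optimal prediction at x_query is ridge regression with
      lam = noise_var / prior_var. The hyper-parameters here are
      illustrative, not taken from the paper.
      """
      d = xs.shape[1]
      lam = noise_var / prior_var
      w_hat = np.linalg.solve(xs.T @ xs + lam * np.eye(d), xs.T @ ys)
      return x_query @ w_hat
  ```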
- 9.0 Value-aware Importance Weighting for Off-policy Reinforcement Learning
- Authors: Kristopher De Asis, Eric Graves, Richard S. Sutton
- Reason: Focusing on a technical but integral aspect of RL, importance sampling, this paper offers an approach aimed at reducing variance while keeping estimates unbiased. Although more abstract, it is an important contribution to RL's foundational theory. A sketch of the standard correction the paper builds on follows below.
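  For background (a minimal sketch of standard per-decision importance sampling, not the paper's value-aware weighting): off-policy TD updates are reweighted by the ratio of target- to behavior-policy action probabilities.

  ```python
  def off_policy_td_error(value, reward, next_value, pi_prob, b_prob, gamma=0.99):
      """One common form of the importance-sampled TD(0) error.

      rho = pi(a|s) / b(a|s) corrects for sampling actions from the behavior
      policy b instead of the target policy pi. The correction is unbiased
      but can have high variance, which is what value-aware weightings
      aim to reduce.
      """
      rho = pi_prob / b_prob  # importance sampling ratio
      return rho * (reward + gamma * next_value - value)
  ```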
- 8.9 Off-Policy Evaluation of Ranking Policies under Diverse User Behavior
- Authors: Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito
- Reason: The paper tackles the challenge of evaluating ranking policies from logged data. Its newly proposed Adaptive IPS estimator remains unbiased under complex user behavior, a significant advancement in off-policy evaluation. A sketch of the vanilla estimator it builds on follows below.
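  For reference (a sketch of the vanilla inverse propensity scoring estimator that Adaptive IPS generalizes; the function and its arguments are illustrative, not the paper's API):

  ```python
  import numpy as np

  def ips_estimate(rewards, pi_probs, b_probs):
      """Vanilla inverse propensity scoring (IPS) off-policy estimator.

      Reweights logged rewards by pi(a|x) / b(a|x) to estimate the value of
      a target policy pi from data logged under a behavior policy b.
      Adaptive IPS extends this idea by adapting the weights to different
      user behavior models over ranked lists; only the basic estimator
      is shown here.
      """
      weights = np.asarray(pi_probs, dtype=float) / np.asarray(b_probs, dtype=float)
      return float(np.mean(weights * np.asarray(rewards, dtype=float)))
  ```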
- 8.8 Learning non-Markovian Decision-Making from State-only Sequences
- Authors: Aoyang Qin, Feng Gao, Qing Li, Song-Chun Zhu, Sirui Xie
- Reason: This research is noteworthy for bringing a new perspective to imitation learning: handling state-only sequences with a non-Markovian decision process. It introduces a model that can more faithfully capture real-world decision processes.
- 8.6 Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition
- Authors: Tianzi Wang, Shoukang Hu, Jiajun Deng, Zengrui Jin, Mengzhe Geng, Yi Wang, Helen Meng, Xunying Liu
- Reason: Although not focused on RL, this research provides interesting insights into applying machine learning to disordered and elderly speech recognition, a useful and impactful area of study. It may, however, be less influential within the strict domain of RL.