- 9.9 Contrastive Initial State Buffer for Reinforcement Learning
- Authors: Nico Messikommer, Yunlong Song, Davide Scaramuzza
- Reason: This paper presents an innovative approach to reinforcement learning, introducing a Contrastive Initial State Buffer that reuses states from past experiences to initialize the agent for data collection, potentially offering considerable improvements across reinforcement learning applications.
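- Sketch: To make the idea more concrete, below is a minimal, hypothetical sketch of an initial state buffer that re-seeds episodes from informative past states. The scoring rule and the `env.reset_to(state)` API are illustrative assumptions, not the paper's actual contrastive criterion or interface.

  ```python
  # Hypothetical initial-state buffer: stores visited states and samples them
  # as episode start states, biased toward under-visited regions.
  import numpy as np


  class InitialStateBuffer:
      def __init__(self, capacity=10_000):
          self.capacity = capacity
          self.states = []        # raw simulator states usable for resets
          self.embeddings = []    # feature vectors used to score states

      def add(self, state, embedding):
          if len(self.states) >= self.capacity:
              self.states.pop(0)
              self.embeddings.pop(0)
          self.states.append(state)
          self.embeddings.append(np.asarray(embedding, dtype=np.float32))

      def sample_reset_state(self):
          # Illustrative scoring: prefer states whose embedding is far from the
          # buffer mean, i.e. states in less frequently visited regions.
          emb = np.stack(self.embeddings)
          scores = np.linalg.norm(emb - emb.mean(axis=0), axis=1)
          total = scores.sum()
          probs = scores / total if total > 0 else np.full(len(scores), 1.0 / len(scores))
          idx = np.random.choice(len(self.states), p=probs)
          return self.states[idx]


  # Hypothetical usage, assuming the simulator can reset to an arbitrary state:
  # start = buffer.sample_reset_state() if buffer.states else None
  # obs = env.reset_to(start) if start is not None else env.reset()
  ```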
- 9.5 Deep Reinforcement Learning for the Joint Control of Traffic Light Signaling and Vehicle Speed Advice
- Authors: Johannes V. S. Busch, Robert Voelckner, Peter Sossalla, Christian L. Vielhaus, Roberto Calandra, Frank H. P. Fitzek
- Reason: This research focuses on the joint control of traffic light signaling and vehicle speed advice, an innovative topic at the intersection of urban planning and machine learning, with the potential to improve urban traffic systems.
- 9.3 Learning Optimal Contracts: How to Exploit Small Action Spaces
- Authors: Francesco Bacchiocchi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti
- Reason: This paper addresses the practical problem of principal-agent interactions, proposing an algorithm for learning nearly optimal contracts, which may make significant contributions to the application of machine learning in economics and decision theory.
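- Sketch: For context, below is a small sketch of the standard hidden-action principal-agent model that contract-learning work typically builds on: a contract pays the agent per observed outcome, the agent best-responds given its action costs, and the principal collects reward minus payment. The tabular representation and function names are illustrative; the paper's exact setting and learning algorithm are not reproduced here.

  ```python
  # Standard hidden-action principal-agent model (illustrative, tabular form).
  import numpy as np


  def agent_best_response(contract, outcome_probs, action_costs):
      # contract: payment per outcome, shape (n_outcomes,)
      # outcome_probs: P(outcome | action), shape (n_actions, n_outcomes)
      # action_costs: cost of each action, shape (n_actions,)
      expected_pay = outcome_probs @ contract
      return int(np.argmax(expected_pay - action_costs))


  def principal_utility(contract, outcome_probs, action_costs, outcome_rewards):
      # The principal keeps the outcome reward minus the payment promised by the contract.
      a = agent_best_response(contract, outcome_probs, action_costs)
      return float(outcome_probs[a] @ (outcome_rewards - contract))
  ```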
- 9.1 Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles
- Authors: Noah Golowich, Dhruv Rohatgi, Ankur Moitra
- Reason: The paper makes a significant contribution to reinforcement learning, especially for linear Markov Decision Processes (MDPs). It introduces the novel concept of an emulator, a succinct approximate representation of the transitions, and proposes an algorithm built around it, also advancing computational learning theory.
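- Sketch: As context, and only as the standard formulation (which may differ in detail from the paper's exact assumptions), a linear MDP posits transitions and rewards that are linear in a known d-dimensional feature map, and the sparse variant assumes only a small number k of coordinates are relevant:

  ```latex
  % Standard linear MDP structure (context only; the paper's sparsity assumptions
  % may be stated differently).
  \[
    P(s' \mid s, a) = \langle \phi(s, a), \mu(s') \rangle,
    \qquad
    r(s, a) = \langle \phi(s, a), \theta \rangle,
    \qquad
    \phi : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^{d}.
  \]
  % Sparse setting (informal): only k << d coordinates of the representation matter,
  % and the goal is to explore and learn without computationally intractable oracles.
  ```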
- 9.0 Mechanic Maker 2.0: Reinforcement Learning for Evaluating Generated Rules
- Authors: Johor Jara Gonzalez, Seth Cooper, Mathew Guzdial
- Reason: A valuable contribution to research in Automated Game Design, applying Reinforcement Learning as an approximator for human play when evaluating generated rules.
- 8.8 Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration
- Authors: Jinning Li, Xinyi Liu, Banghua Zhu, Jiantao Jiao, Masayoshi Tomizuka, Chen Tang, Wei Zhan
- Reason: This paper proposes a reinforcement learning method that learns safely by extracting an expert policy from offline data to guide online exploration.
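- Sketch: Below is a hedged PyTorch sketch of the general pattern of guiding online updates with an offline-extracted policy: a guide is first obtained from offline data (here via simple behavior cloning), then a distillation term pulls the online policy toward the frozen guide. The BC objective and the MSE distillation term are simplifying assumptions, not necessarily the paper's exact method.

  ```python
  # Illustrative offline-guide + online-distillation pattern (not the paper's exact objective).
  import torch
  import torch.nn as nn
  import torch.nn.functional as F


  def bc_guide_loss(guide: nn.Module, obs: torch.Tensor, expert_actions: torch.Tensor) -> torch.Tensor:
      """Behavior-cloning loss for extracting a guide policy from offline data."""
      return F.mse_loss(guide(obs), expert_actions)


  def guided_policy_loss(policy: nn.Module, guide: nn.Module, obs: torch.Tensor,
                         rl_loss: torch.Tensor, distill_coef: float = 0.1) -> torch.Tensor:
      """Online RL loss augmented with a distillation term toward the frozen guide."""
      with torch.no_grad():
          guide_actions = guide(obs)
      distill = F.mse_loss(policy(obs), guide_actions)
      return rl_loss + distill_coef * distill
  ```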
- 8.6 Wait, That Feels Familiar: Learning to Extrapolate Human Preferences for Preference Aligned Path Planning
- Authors: Haresh Karnan, Elvin Yang, Garrett Warnell, Joydeep Biswas, Peter Stone
- Reason: The work introduces a novel framework that extrapolates operator terrain preferences, addressing a challenging problem for robot navigation across diverse terrains and varying lighting conditions.
- 8.5 Projected Task-Specific Layers for Multi-Task Reinforcement Learning
- Authors: Josselin Somerville Roberts, Julia Di
- Reason: The authors introduce Projected Task-Specific Layers, an architecture that lets robots, for example, generalize across tasks while mitigating negative task interference in multi-task reinforcement learning, with potentially significant implications for AI and robotics.
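- Sketch: One plausible reading of the idea, sketched in PyTorch below: a shared trunk plus a small per-task projection whose output is added back to the shared features, which limits how much any single task can interfere with the others. The low-rank projection and the chosen dimensions are illustrative assumptions, not the paper's exact architecture.

  ```python
  # Hypothetical shared-trunk + per-task projected-layer policy.
  import torch.nn as nn


  class ProjectedTaskSpecificPolicy(nn.Module):
      def __init__(self, obs_dim, act_dim, num_tasks, hidden=256, proj_dim=32):
          super().__init__()
          self.trunk = nn.Sequential(
              nn.Linear(obs_dim, hidden), nn.ReLU(),
              nn.Linear(hidden, hidden), nn.ReLU(),
          )
          # Each task gets a small projection added to the shared representation.
          self.task_proj = nn.ModuleList([
              nn.Sequential(nn.Linear(hidden, proj_dim), nn.ReLU(),
                            nn.Linear(proj_dim, hidden))
              for _ in range(num_tasks)
          ])
          self.head = nn.Linear(hidden, act_dim)

      def forward(self, obs, task_id):
          shared = self.trunk(obs)
          features = shared + self.task_proj[task_id](shared)
          return self.head(features)
  ```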
- 8.2 DOMAIN: Mildly Conservative Model-Based Offline Reinforcement Learning
- Authors: Xiao-Yin Liu, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou
- Reason: The paper addresses an essential problem in offline reinforcement learning: distribution shift. It proposes a novel solution called DOMAIN, which comes with a theoretical guarantee of safe policy improvement and outperforms other RL algorithms.
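- Sketch: For intuition, the snippet below shows the generic "mildly conservative" pattern used in model-based offline RL: rewards on model-generated transitions are discounted by the dynamics ensemble's disagreement, steering the policy away from regions where the model is unreliable. How DOMAIN actually adapts this penalty is not reproduced here; the function and its arguments are illustrative.

  ```python
  # Generic conservatism pattern for model rollouts (illustrative, not DOMAIN's exact rule).
  import numpy as np


  def penalized_reward(model_reward, ensemble_next_states, penalty_coef=1.0):
      """Reduce a model-predicted reward by the ensemble's disagreement on the next state."""
      # ensemble_next_states: predictions from each dynamics model, shape (n_models, state_dim)
      disagreement = np.std(ensemble_next_states, axis=0).max()
      return model_reward - penalty_coef * disagreement
  ```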
- 7.9 Interactively Teaching an Inverse Reinforcement Learner with Limited Feedback
- Authors: Rustam Zayanov, Francisco S. Melo, Manuel Lopes
- Reason: Although it scores lower, potentially because it addresses the interactive-teaching aspect of reinforcement learning, it still makes a valuable contribution by providing a teaching method that works even when the learner's feedback is limited.