- 9.9 Combining Behaviors with the Successor Features Keyboard
- Authors: Wilka Carvalho, Andre Saraiva, Angelos Filos, Andrew Kyle Lampinen, Loic Matthey, Richard L. Lewis, Honglak Lee, Satinder Singh, Danilo J. Rezende, Daniel Zoran
- Reason: The paper proposes a novel method for transferring behavioral knowledge across tasks using discovered state features and task encodings, and shows that the method outperforms several transfer-learning baselines on long-horizon tasks. A sketch of the underlying successor-features idea follows below.
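For context, the paper builds on the successor-features decomposition, a standard RL idea in which the action-value factors as Q_w(s, a) = ψ(s, a) · w: ψ accumulates expected discounted state features and w encodes the task, so transfer to a new task only requires a new w. The sketch below is illustrative only, not the authors' architecture; all names and the tabular setting are assumptions.

```python
import numpy as np

def successor_features(phi, policy, transitions, gamma=0.99, iters=200):
    """Tabular evaluation of psi(s, a) = E[ sum_t gamma^t * phi(s_t) | s, a, pi ].

    phi         : (S, d) array of state features
    policy      : (S,)   array giving pi's action in each state
    transitions : (S, A, S) array of transition probabilities
    """
    S, A, _ = transitions.shape
    psi = np.zeros((S, A, phi.shape[1]))
    for _ in range(iters):
        # Bellman update: psi(s, a) <- phi(s) + gamma * E_{s'}[ psi(s', pi(s')) ]
        next_psi = psi[np.arange(S), policy]              # (S, d)
        psi = phi[:, None, :] + gamma * transitions @ next_psi
    return psi

# A new task is just a weight vector w; known behaviors recombine via
# generalized policy improvement: Q_w(s, a) = max over policies of psi_pi(s, a) @ w.
```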
- 9.8 Human-in-the-Loop Task and Motion Planning for Imitation Learning
- Authors: Ajay Mandlekar, Caelan Garrett, Danfei Xu, Dieter Fox
- Reason: The approach proposed in this paper combines the strengths of imitation learning and Task and Motion Planning. The results indicate that proficient agents can be trained from just 10 minutes of non-expert teleoperation data.
- 9.7 Solving large flexible job shop scheduling instances by generating a diverse set of scheduling policies with deep reinforcement learning
- Authors: Imanol Echeverria, Maialen Murua, Roberto Santana
- Reason: This paper proposes a novel way of modeling flexible job shop scheduling problems as a Markov Decision Process, which could make learned scheduling policies applicable to large real-world instances.
- 9.5 Recurrent Linear Transformers
- Authors: Subhojeet Pramanik, Esraa Elelimy, Marlos C. Machado, Adam White
- Reason: The paper introduces recurrent alternatives to the transformer self-attention mechanism that offer a context-independent inference cost, leverage long-range dependencies effectively, and perform well in practice. These alternatives could have a large influence on reinforcement learning problems where computational limitations make transformers nearly infeasible; the general recurrence is sketched below.
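The paper's recurrent alternatives belong to the linear-attention family, where attention is rewritten as a running sum so each new token costs O(d^2) regardless of context length. The sketch below shows that general recurrence, not the paper's specific architecture; the feature map and class names are illustrative choices.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, one common positive feature map for linear attention
    return np.where(x > 0.0, x + 1.0, np.exp(x))

class LinearAttentionCell:
    """Processes one token at a time; per-step cost is independent of history length."""

    def __init__(self, d):
        self.S = np.zeros((d, d))  # running sum of phi(k) v^T
        self.z = np.zeros(d)       # running sum of phi(k), for normalization

    def step(self, q, k, v):
        phi_k = feature_map(k)
        self.S += np.outer(phi_k, v)
        self.z += phi_k
        phi_q = feature_map(q)
        return (phi_q @ self.S) / (phi_q @ self.z + 1e-6)

# Usage: per-token cost stays constant no matter how long the sequence grows.
cell = LinearAttentionCell(d=8)
rng = np.random.default_rng(0)
for _ in range(1000):
    q, k, v = rng.normal(size=(3, 8))
    out = cell.step(q, k, v)
```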
- 9.1 A Doubly Robust Approach to Sparse Reinforcement Learning
- Authors: Wonyoung Kim, Garud Iyengar, Assaf Zeevi
- Reason: The novelty of the proposed algorithm and the comprehensive regret analysis suggest strong potential influence, and the authors overcome existing limitations in the sparse linear Markov decision process setting. The work is backed by both theoretical results and numerical experiments; the doubly robust principle itself is illustrated below.
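The paper's estimator for sparse linear MDPs is considerably more involved, but for grounding the term, here is the standard one-step doubly robust off-policy value estimate: a regression model plus an importance-weighted correction, unbiased if either the model or the propensities are correct. All names and the bandit-style setting are assumptions for illustration.

```python
import numpy as np

def doubly_robust_value(rewards, actions, propensities, target_probs, q_hat):
    """Doubly robust estimate of a target policy's value from logged data.

    rewards[i]          : observed reward for the logged action actions[i]
    propensities[i]     : behavior policy's probability of the logged action
    target_probs[i, a]  : target policy's probability of each action a
    q_hat[i, a]         : regression model's reward estimate for each action a
    """
    n = q_hat.shape[0]
    # Direct-method term: model's expected reward under the target policy.
    dm = (target_probs * q_hat).sum(axis=1)
    # Importance-weighted correction of the model's error on logged actions.
    rho = target_probs[np.arange(n), actions] / propensities
    correction = rho * (rewards - q_hat[np.arange(n), actions])
    return float(np.mean(dm + correction))
```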
- 9.1 COPF: Continual Learning Human Preference through Optimal Policy Fitting
- Authors: Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu
- Reason: This paper proposes a new method for continually learning human preferences in Reinforcement Learning from Human Feedback, addressing the significant time and computational costs of retraining in real-world deployments as well as data-privacy concerns.
- 8.9 Active teacher selection for reinforcement learning from human feedback
- Authors: Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell
- Reason: The novel framework and algorithm proposed could strongly impact the field of reinforcement learning from human feedback. The authors apply their approach to real-world domains, demonstrating its effectiveness and relevance.
- 8.7 Neural Multi-Objective Combinatorial Optimization with Diversity Enhancement
- Authors: Jinbiao Chen, Zizhen Zhang, Zhiguang Cao, Yaoxin Wu, Yining Ma, Te Ye, Jiahai Wang
- Reason: Multi-objective combinatorial optimization problems are of great practical importance, and the authors propose a novel neural heuristic that places greater emphasis on diversity. Its acceptance at NeurIPS 2023 further suggests influence on the machine learning community.
- 8.5 Efficient Meta Neural Heuristic for Multi-Objective Combinatorial Optimization
- Authors: Jinbiao Chen, Jiahai Wang, Zizhen Zhang, Zhiguang Cao, Te Ye, Siyuan Chen
- Reason: The authors address the challenge of achieving both high learning efficiency and high solution quality in solving multi-objective combinatorial optimization problems, and demonstrate that their proposed meta neural heuristic improves on both fronts.
- 8.2 Application of deep and reinforcement learning to boundary control problems
- Authors: Zenin Easa Panthakkalakath, Juraj Kardoš, Olaf Schenk
- Reason: The authors explore the potential of deep learning and reinforcement learning for solving boundary control problems. The application domains discussed in the paper suggest a high potential influence should these methods prove effective in practice.