9.7 Iterative Option Discovery for Planning, by Planning
- Authors: Kenny Young, Richard S. Sutton
- Reason: The paper presents an innovative approach to option discovery in reinforcement learning, drawing from the successful application of Expert Iteration in AlphaZero. It offers a significant improvement in challenging planning environments, indicating its substantial potential influence in machine learning and AI.
9.7 Learning quantum Hamiltonians at any temperature in polynomial time
- Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Ewin Tang
- Reason: The paper offers a potentially ground-breaking advance in quantum computing. The authors develop a method to learn a local quantum Hamiltonian at any constant temperature, previously a significant challenge in the field. They provide a polynomial-time algorithm for learning the Hamiltonian, which may significantly impact the speed and efficiency of quantum computations. Moreover, the authors are recognized authorities in the field, which adds further credibility to their work.
9.5 Solving the Quadratic Assignment Problem using Deep Reinforcement Learning
- Authors: Puneet S. Bagga, Arthur Delarue
- Reason: The paper addresses the challenge of solving the Quadratic Assignment Problem (a notoriously difficult NP-hard problem) using deep reinforcement learning. Its practical applications and its outperformance of a high-quality local search baseline make it potentially influential in the field.
9.5 Generalizable Long-Horizon Manipulations with Large Language Models
- Authors: Haoyu Zhou, Mingyu Ding, Weikun Peng, Masayoshi Tomizuka, Lin Shao, Chuang Gan
- Reason: This paper presents a framework for using Large Language Models (LLMs) to generate task conditions, useful for generalizable long-horizon manipulations. The method is tested in simulated and real-world environments, making it applicable to practical use cases. The framework could have broad implications for the AI community.
9.3 Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
- Authors: Emmeran Johnson, Ciara Pike-Burke, Patrick Rebeschini
- Reason: This paper investigates the interplay between sample-efficiency and adaptivity in reinforcement learning. Its focus on dimension-dependent adaptivity and the implications for policy evaluation could have a significant influence on RL research.
9.2 Imitation Learning from Observation through Optimal Transport
- Authors: Wei-Di Chang, Scott Fujimoto, David Meger, Gregory Dudek
- Reason: The paper presents a novel approach to imitation learning, utilizing optimal transport and demonstrating superior performance in continuous control tasks. Its potential to revolutionise imitation learning from observation makes it potentially influential in machine learning.
9.2 Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances
- Authors: Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar
- Reason: This paper addresses an important question of efficiency in scientific computing, specifically how to set solver parameters when solving multiple related linear systems. The authors present an online bandit learning algorithm that provably selects parameters efficiently and can significantly reduce computation time. This could potentially improve numerous scientific computing operations.
9.1 Think before you speak: Training Language Models With Pause Tokens
- Authors: Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan
- Reason: This paper proposes an intriguing idea in language model training: pause-training, a mechanism that in principle could improve the quality and correctness of model outputs. This approach could prompt research into improving language model training processes, particularly the balance between computation speed and accuracy.
9.0 On Representation Complexity of Model-based and Model-free Reinforcement Learning
- Authors: Hanlin Zhu, Baihe Huang, Stuart Russell
- Reason: This work examines the representation complexity of model-based and model-free reinforcement learning through the lens of circuit complexity. It provides a unique perspective on why model-based algorithms enjoy better sample complexity, offering new insights for machine learning research.
9.0 A Neural Scaling Law from Lottery Ticket Ensembling
- Authors: Ziming Liu, Max Tegmark
- Reason: This work explores the phenomenon of neural scaling laws, specifically in relation to the performance of lottery ticket ensembles within a network. The authors provide an alternative explanation to established scaling law theory. The research challenges traditional understanding of scaling in neural networks and may prompt a re-evaluation of current scaling law theories.