- 9.3 Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge
- Authors: Meshal Alharbi, Mardavij Roozbehani, Munther Dahleh
- Reason: Accepted at AAAI, a prestigious conference, and tackles the foundational problem of sample efficiency in RL by exploiting partial knowledge of the dynamics, a critical aspect of RL research; a toy sketch of the idea follows below.
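To make the idea concrete, here is a minimal sketch assuming a toy tabular MDP in which the transition kernel for one action is known a priori and only the other action's dynamics are estimated from samples. This is not the paper's algorithm (which comes with formal sample-complexity guarantees); every quantity below is hypothetical, and the point is only that samples are spent where knowledge is missing.

```python
import numpy as np

# Toy sketch: tabular RL where the transition kernel for action 0 is known a
# priori and only action 1's kernel is estimated from data, so samples are
# spent only where dynamics knowledge is missing. Not the paper's algorithm.

n_states, n_actions, gamma = 5, 2, 0.9
rng = np.random.default_rng(0)

# Known part of the dynamics: a deterministic cyclic shift (e.g., known physics).
P_known = np.zeros((n_states, n_states))
for s in range(n_states):
    P_known[s, (s + 1) % n_states] = 1.0

# Unknown part: estimate action 1's kernel from 500 sampled transitions.
P_true = rng.dirichlet(np.ones(n_states), size=n_states)
counts = np.zeros((n_states, n_states))
for _ in range(500):
    s = rng.integers(n_states)
    counts[s, rng.choice(n_states, p=P_true[s])] += 1
P_est = (counts + 1) / (counts + 1).sum(axis=1, keepdims=True)  # smoothed MLE

R = rng.random((n_states, n_actions))       # toy reward table, assumed known
P = np.stack([P_known, P_est], axis=1)      # hybrid (s, a, s') model

V = np.zeros(n_states)
for _ in range(200):                        # value iteration on the hybrid model
    Q = R + gamma * P @ V
    V = Q.max(axis=1)
print("greedy policy:", Q.argmax(axis=1))
```

Planning quality degrades only in the learned slice of the model; the known slice contributes no estimation error regardless of sample size.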
- 9.2 Leading the Pack: N-player Opponent Shaping
- Authors: Alexandra Souly, Timon Willi, Akbir Khan, Robert Kirk, Chris Lu, Edward Grefenstette, Tim Rocktäschel
- Reason: Extends Opponent Shaping (OS) beyond the usual two-player setting to N-player games, and its authors are affiliated with reputable institutions, suggesting high-quality insights into multi-agent learning; a minimal shaping sketch follows below.
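Opponent shaping is easiest to see as a LOLA-style look-ahead. The sketch below is a hypothetical N-player toy, not the paper's method: player 0 differentiates (here, by finite differences) through one naive gradient step taken by every other player. The coordination game and all constants are invented for illustration.

```python
import numpy as np

# Toy sketch of opponent shaping (LOLA-style look-ahead), not the paper's
# N-player method. Player 0, the "shaper", optimizes its payoff as evaluated
# AFTER simulating one naive gradient step by every other player, so its
# update accounts for (and steers) the opponents' learning.

N, lr, eps = 4, 0.1, 1e-4
rng = np.random.default_rng(1)
theta = rng.normal(size=N)            # one scalar policy parameter per player

def payoff(i, th):
    # Hypothetical smooth coordination game: each player is rewarded for
    # tracking half the mean of the other players' parameters.
    others_mean = (th.sum() - th[i]) / (N - 1)
    return -(th[i] - 0.5 * others_mean) ** 2

def grad(i, th):
    # Finite-difference gradient of player i's payoff w.r.t. its own parameter.
    up, down = th.copy(), th.copy()
    up[i] += eps; down[i] -= eps
    return (payoff(i, up) - payoff(i, down)) / (2 * eps)

def naive_step(th):
    # Players 1..N-1 are naive learners: simultaneous gradient ascent.
    new = th.copy()
    for j in range(1, N):
        new[j] += lr * grad(j, th)
    return new

for _ in range(200):
    # Shaping gradient: how payoff_0 (after opponents learn) varies with theta_0.
    up, down = theta.copy(), theta.copy()
    up[0] += eps; down[0] -= eps
    g0 = (payoff(0, naive_step(up)) - payoff(0, naive_step(down))) / (2 * eps)
    theta = naive_step(theta)         # opponents take their actual step
    theta[0] += lr * g0               # shaper takes the look-ahead step

print("final parameters:", np.round(theta, 3))
```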
- 9.0 Model-Based Control with Sparse Neural Dynamics
- Authors: Ziang Liu, Genggeng Zhou, Jeff He, Tobia Marcucci, Li Fei-Fei, Jiajun Wu, Yunzhu Li
- Reason: Accepted at NeurIPS, a top conference; introduces a novel framework that integrates model learning and predictive control via sparsified neural dynamics; and its authors are highly regarded in AI and machine learning. A toy model-plus-MPC sketch follows below.
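As a rough illustration of coupling model learning with predictive control, the sketch below fits a dynamics model from data, sparsifies it by zeroing near-zero coefficients (a crude stand-in for the paper's neural-network sparsification), and plans with random-shooting MPC. The point-mass system, threshold, and cost are all hypothetical.

```python
import numpy as np

# Toy sketch (not the paper's method): fit a dynamics model from data, prune
# small coefficients to sparsify it, then plan with random-shooting MPC.

rng = np.random.default_rng(2)

# Unknown true dynamics of a 1-D point mass: state = [position, velocity].
def true_step(x, u):
    pos, vel = x
    return np.array([pos + 0.1 * vel, vel + 0.1 * u])

# 1) Model learning: least-squares fit of x' = A @ [x, u] from random rollouts.
X, Y = [], []
x = np.zeros(2)
for _ in range(300):
    u = rng.uniform(-1, 1)
    x_next = true_step(x, u)
    X.append([*x, u]); Y.append(x_next)
    x = x_next if np.all(np.abs(x_next) < 5) else np.zeros(2)
A = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)[0].T

# 2) Sparsification: zero out near-zero entries (stand-in for network pruning).
A[np.abs(A) < 1e-3] = 0.0

# 3) Predictive control: sample action sequences, roll out the sparse model,
#    execute only the first action of the cheapest sequence.
def mpc(x0, horizon=10, n_candidates=256):
    seqs = rng.uniform(-1, 1, size=(n_candidates, horizon))
    costs = np.zeros(n_candidates)
    for k, seq in enumerate(seqs):
        x = x0.copy()
        for u in seq:
            x = A @ np.array([*x, u])
            costs[k] += (x[0] - 1.0) ** 2 + 0.1 * x[1] ** 2  # reach position 1.0
    return seqs[costs.argmin(), 0]

x = np.zeros(2)
for t in range(50):
    x = true_step(x, mpc(x))
print("final state:", np.round(x, 3))
```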
- 8.9 BadRL: Sparse Targeted Backdoor Attack Against Reinforcement Learning
- Authors: Jing Cui, Yufei Han, Yuzhe Ma, Jianbin Jiao, Junge Zhang
- Reason: An extended version was accepted at AAAI; the paper addresses security in RL, an increasingly relevant concern as RL is deployed in critical domains. A toy illustration of the threat model follows below.
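A toy picture of the threat model, not BadRL itself: the sketch below poisons the rewards of a tabular learner only at one rare trigger state (a few percent of all experience), so the learned policy looks normal everywhere except under the trigger. The trigger state, target action, and poisoning rate are invented for illustration.

```python
import numpy as np

# Toy illustration of a sparse targeted backdoor (not BadRL itself): rewards
# are corrupted only at one rare trigger state, so the learned policy deviates
# from the clean optimum only when the trigger appears.

rng = np.random.default_rng(3)
n_states, n_actions = 10, 3
TRIGGER, TARGET = 7, 2                        # hypothetical attacker choices

Q = np.zeros((n_states, n_actions))
clean_reward = lambda s, a: 1.0 if a == s % n_actions else 0.0

poisoned = 0
for t in range(20000):
    s = rng.integers(n_states)
    a = rng.integers(n_actions) if rng.random() < 0.1 else int(Q[s].argmax())
    r = clean_reward(s, a)
    if s == TRIGGER and rng.random() < 0.3:   # sparse: ~3% of all samples
        r = 2.0 if a == TARGET else -1.0      # targeted reward flip
        poisoned += 1
    Q[s, a] += 0.1 * (r - Q[s, a])            # tabular update (bandit-style)

print(f"poisoned {poisoned} of 20000 samples ({poisoned / 200:.1f}%)")
print("learned policy:", Q.argmax(axis=1))    # deviates only at the trigger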
- 8.7 Robustly Improving Bandit Algorithms with Confounded and Selection Biased Offline Data: A Causal Approach
- Authors: Wen Huang, Xintao Wu
- Reason: Provides a novel causal perspective on bandit problems that use confounded and selection-biased offline data, a significant and practical challenge for real-world applications that rely on historical data for decision-making; a toy bound-clamping sketch follows below.
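One way to see the idea: biased offline data cannot point-identify an arm's mean reward, but it can bound it, and those bounds can clamp a UCB index. The sketch below uses crude Manski-style bounds on a Bernoulli bandit; the logging policy, bound form, and constants are illustrative assumptions, not the paper's estimators.

```python
import numpy as np

# Toy sketch (not the paper's estimators): confounded offline data only
# partially identifies each arm's mean, but for a Bernoulli outcome the bound
# E[Y | do(a)] in [P(Y=1, A=a), P(Y=1, A=a) + P(A != a)] always holds, and
# clamping UCB indices into these bounds cuts wasted exploration.

rng = np.random.default_rng(4)
true_means = np.array([0.3, 0.5, 0.7])
K = len(true_means)

# Biased logging policy: the offline dataset over-samples arm 0.
n_off = 5000
a_off = rng.choice(K, size=n_off, p=[0.8, 0.1, 0.1])
y_off = rng.random(n_off) < true_means[a_off]
p_joint = np.array([np.mean((a_off == a) & y_off) for a in range(K)])
p_arm = np.array([np.mean(a_off == a) for a in range(K)])
lower, upper = p_joint, p_joint + (1 - p_arm)      # per-arm causal bounds

# Online UCB whose indices are clamped into the causal bounds.
counts = np.ones(K)
sums = (rng.random(K) < true_means).astype(float)  # one initial pull per arm
for t in range(K, 2000):
    ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
    index = np.clip(ucb, lower, upper)             # offline knowledge as clamp
    a = int(index.argmax())
    counts[a] += 1
    sums[a] += float(rng.random() < true_means[a])

print("pulls per arm:", counts.astype(int))
```

Here the heavily logged suboptimal arm gets a tight upper bound from the offline data, so its clamped index can never compete with the best arm's estimate and it is abandoned after a handful of pulls.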