- 8.7 Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games
- Authors: Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, Adam Wierman
- Reason: The paper presents a key algorithmic innovation, is theoretically grounded with established finite-sample bounds, and is authored by well-known researchers in the field. A minimal sketch of the two-timescale update pattern follows.
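The fast/slow iterate structure can be made concrete with a toy sketch. This is a generic two-timescale pattern, not the authors' exact algorithm or rates: the random toy game, the linear features `phi`, the stepsize exponents 0.6/0.9, and the softmax smoothing temperature `tau` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy zero-sum stochastic game: nS states, nA actions per player.
nS, nA, d = 5, 3, 8
P = rng.dirichlet(np.ones(nS), size=(nS, nA, nA))    # P[s, a1, a2] -> dist over next state
R = rng.uniform(-1, 1, size=(nS, nA, nA))            # reward to player 1 (player 2 gets -R)
phi = rng.normal(size=(nS, nA, nA, d)) / np.sqrt(d)  # linear features for Q(s, a1, a2)

theta = np.zeros(d)                # Q-function parameters (fast iterate)
pi1 = np.full((nS, nA), 1.0 / nA)  # maximizing player's policy (slow iterate)
pi2 = np.full((nS, nA), 1.0 / nA)  # minimizing player's policy (slow iterate)
gamma, tau = 0.9, 0.1              # discount factor, softmax temperature (assumed)

def softmax(z):
    z = z - z.max()                # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

s = 0
for t in range(1, 20001):
    alpha = 1.0 / t ** 0.6         # fast stepsize (Q-update)
    beta = 1.0 / t ** 0.9          # slow stepsize (policy update), beta << alpha

    a1 = rng.choice(nA, p=pi1[s])
    a2 = rng.choice(nA, p=pi2[s])
    s2 = rng.choice(nS, p=P[s, a1, a2])

    # Fast timescale: TD update of the linear Q-function under the current policies.
    v_s2 = pi1[s2] @ (phi[s2] @ theta) @ pi2[s2]
    td = R[s, a1, a2] + gamma * v_s2 - phi[s, a1, a2] @ theta
    theta += alpha * td * phi[s, a1, a2]

    # Slow timescale: move each policy toward its smoothed (softmax) best response.
    q_s = phi[s] @ theta
    pi1[s] = (1 - beta) * pi1[s] + beta * softmax((q_s @ pi2[s]) / tau)
    pi2[s] = (1 - beta) * pi2[s] + beta * softmax(-(pi1[s] @ q_s) / tau)
    s = s2
```

The essential design choice is that `beta` decays faster than `alpha`: the Q-estimate tracks the slowly moving policies as if they were quasi-static, which is what typically enables finite-sample analyses of this kind.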
- 8.3 Optimizing Distributed Reinforcement Learning with Reactor Model and Lingua Franca
- Authors: Jacky Kwok, Marten Lohstroh, Edward A. Lee
- Reason: Offers a practical enhancement to distributed RL with significant empirical performance improvements, but it is slightly less theoretically novel than the first paper. A toy illustration of the reactor-model idea follows.
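For context, the reactor model's core idea (deterministic, timestamp-ordered reactions) can be hinted at in a few lines of plain Python. This is a toy illustration, not Lingua Franca code; the `Event` class and the worker/reward payloads are invented for the example.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Event:
    time: int                             # logical timestamp, the only ordering key
    worker: int = field(compare=False)
    reward: float = field(compare=False)

queue: list[Event] = []

# Workers report rollout results; physical arrival order is scrambled.
for t, w, r in [(2, 1, 0.5), (1, 0, 1.0), (3, 0, 0.2)]:
    heapq.heappush(queue, Event(t, w, r))

# Learner "reaction": consume events in logical-time order, so the
# learner's view of the rollouts is deterministic regardless of which
# worker finished first.
total = 0.0
while queue:
    ev = heapq.heappop(queue)
    total += ev.reward
    print(f"logical t={ev.time}: worker {ev.worker} reward {ev.reward}")
```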
- 8.1 Pruning Convolutional Filters via Reinforcement Learning with Entropy Minimization
- Authors: Bogdan Musat, Razvan Andonie
- Reason: Introduces a novel entropy-minimization reward function for RL-based neural network pruning with significant computational savings; the authors are experts, though the domain may be slightly niche. An illustrative reward sketch follows.
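As a rough illustration of an entropy-based pruning signal: this is one plausible form, not the paper's exact reward, and `acc_pruned`, `acc_full`, the histogram entropy estimator, and the trade-off weight `lam` are all assumptions.

```python
import numpy as np

def feature_map_entropy(fmap: np.ndarray, bins: int = 32) -> float:
    """Shannon entropy of a layer's activation distribution (histogram estimate)."""
    hist, _ = np.histogram(fmap, bins=bins)
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def pruning_reward(acc_pruned: float, acc_full: float,
                   fmap: np.ndarray, lam: float = 0.1) -> float:
    # Reward retaining accuracy while driving activation entropy down,
    # which is the intuition behind an entropy-minimization objective.
    return (acc_pruned - acc_full) - lam * feature_map_entropy(fmap)

# Example: random activations standing in for a conv layer's output.
fmap = np.random.default_rng(0).normal(size=(64, 16, 16))
print(pruning_reward(acc_pruned=0.91, acc_full=0.93, fmap=fmap))
```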
- 7.9 Canaries and Whistles: Resilient Drone Communication Networks with (or without) Deep Reinforcement Learning
- Authors: Chris Hicks, Vasilios Mavroudis, Myles Foley, Thomas Davies, Kate Highnam, Tim Watson
- Reason: Explores RL for drone network resilience with clear practical implications, yet its influence may be limited by its highly specific application focus.
- 7.6 Reinforcement Learning-Based Bionic Reflex Control for Anthropomorphic Robotic Grasping exploiting Domain Randomization
- Authors: Hirakjyoti Basumatary, Daksh Adhar, Atharva Shrawge, Prathamesh Kanbaskar, Shyamanta M. Hazarika
- Reason: Applies RL to robotic grasping with a novel approach, but as a single study rather than a broad algorithmic contribution, its impact may be more confined to robotics. A sketch of per-episode domain randomization follows.
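To make the domain-randomization ingredient concrete, here is a minimal sketch of resampling physical parameters each episode so a learned policy is exposed to dynamics variation; the parameter names, ranges, and the `env.reset` hook are assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_episode_params() -> dict:
    """Draw fresh simulator parameters for one training episode."""
    return {
        "object_mass": rng.uniform(0.05, 0.5),     # kg
        "friction": rng.uniform(0.3, 1.2),         # contact friction coefficient
        "object_scale": rng.uniform(0.8, 1.2),     # relative object size
        "sensor_noise_std": rng.uniform(0.0, 0.02) # tactile/proprioceptive noise
    }

for episode in range(3):
    params = randomized_episode_params()
    # env.reset(**params)  # hypothetical environment hook for a grasping simulator
    print(f"episode {episode}: {params}")
```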