- 8.7 Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games
- Authors: Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, Adam Wierman
- Reason: The paper presents a key algorithmic innovation, is theoretically grounded with established finite-sample bounds, and is authored by well-known researchers in the field. A minimal sketch of the two-timescale update pattern follows.
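The fast/slow iterate structure can be made concrete with a toy sketch. This is a generic two-timescale pattern, not the authors' exact algorithm or rates: the random toy game, the linear features `phi`, the stepsize exponents 0.6/0.9, and the softmax smoothing temperature `tau` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy zero-sum stochastic game: nS states, nA actions per player.
nS, nA, d = 5, 3, 8
P = rng.dirichlet(np.ones(nS), size=(nS, nA, nA))    # P[s, a1, a2] -> dist over next state
R = rng.uniform(-1, 1, size=(nS, nA, nA))            # reward to player 1 (player 2 gets -R)
phi = rng.normal(size=(nS, nA, nA, d)) / np.sqrt(d)  # linear features for Q(s, a1, a2)

theta = np.zeros(d)                # Q-function parameters (fast iterate)
pi1 = np.full((nS, nA), 1.0 / nA)  # maximizing player's policy (slow iterate)
pi2 = np.full((nS, nA), 1.0 / nA)  # minimizing player's policy (slow iterate)
gamma, tau = 0.9, 0.1              # discount factor, softmax temperature (assumed)

def softmax(z):
    z = z - z.max()                # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

s = 0
for t in range(1, 20001):
    alpha = 1.0 / t ** 0.6         # fast stepsize (Q-update)
    beta = 1.0 / t ** 0.9          # slow stepsize (policy update), beta << alpha

    a1 = rng.choice(nA, p=pi1[s])
    a2 = rng.choice(nA, p=pi2[s])
    s2 = rng.choice(nS, p=P[s, a1, a2])

    # Fast timescale: TD update of the linear Q-function under the current policies.
    v_s2 = pi1[s2] @ (phi[s2] @ theta) @ pi2[s2]
    td = R[s, a1, a2] + gamma * v_s2 - phi[s, a1, a2] @ theta
    theta += alpha * td * phi[s, a1, a2]

    # Slow timescale: move each policy toward its smoothed (softmax) best response.
    q_s = phi[s] @ theta
    pi1[s] = (1 - beta) * pi1[s] + beta * softmax((q_s @ pi2[s]) / tau)
    pi2[s] = (1 - beta) * pi2[s] + beta * softmax(-(pi1[s] @ q_s) / tau)
    s = s2
```

The essential design choice is that `beta` decays faster than `alpha`: the Q-estimate tracks the slowly moving policies as if they were quasi-static, which is what typically enables finite-sample analyses of this kind.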
- 8.3 Optimizing Distributed Reinforcement Learning with Reactor Model and Lingua Franca
- Authors: Jacky Kwok, Marten Lohstroh, Edward A. Lee
- Reason: Offers a practical enhancement to distributed RL with significant empirical performance improvements, but it is slightly less theoretically novel than the first paper. A toy illustration of the reactor-model idea follows.
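For context, the reactor model's core idea (deterministic, timestamp-ordered reactions) can be hinted at in a few lines of plain Python. This is a toy illustration, not Lingua Franca code; the `Event` class and the worker/reward payloads are invented for the example.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Event:
    time: int                             # logical timestamp, the only ordering key
    worker: int = field(compare=False)
    reward: float = field(compare=False)

queue: list[Event] = []

# Workers report rollout results; physical arrival order is scrambled.
for t, w, r in [(2, 1, 0.5), (1, 0, 1.0), (3, 0, 0.2)]:
    heapq.heappush(queue, Event(t, w, r))

# Learner "reaction": consume events in logical-time order, so the
# learner's view of the rollouts is deterministic regardless of which
# worker finished first.
total = 0.0
while queue:
    ev = heapq.heappop(queue)
    total += ev.reward
    print(f"logical t={ev.time}: worker {ev.worker} reward {ev.reward}")
```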
- 8.1 Pruning Convolutional Filters via Reinforcement Learning with Entropy Minimization
- Authors: Bogdan Musat, Razvan Andonie
- Reason: Introduces a novel entropy-minimization reward function for RL-based neural network pruning with significant computational savings; the authors are experts, though the domain may be slightly niche. An illustrative reward sketch follows.
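As a rough illustration of an entropy-based pruning signal: this is one plausible form, not the paper's exact reward, and `acc_pruned`, `acc_full`, the histogram entropy estimator, and the trade-off weight `lam` are all assumptions.

```python
import numpy as np

def feature_map_entropy(fmap: np.ndarray, bins: int = 32) -> float:
    """Shannon entropy of a layer's activation distribution (histogram estimate)."""
    hist, _ = np.histogram(fmap, bins=bins)
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def pruning_reward(acc_pruned: float, acc_full: float,
                   fmap: np.ndarray, lam: float = 0.1) -> float:
    # Reward retaining accuracy while driving activation entropy down,
    # which is the intuition behind an entropy-minimization objective.
    return (acc_pruned - acc_full) - lam * feature_map_entropy(fmap)

# Example: random activations standing in for a conv layer's output.
fmap = np.random.default_rng(0).normal(size=(64, 16, 16))
print(pruning_reward(acc_pruned=0.91, acc_full=0.93, fmap=fmap))
```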
- 7.9 Canaries and Whistles: Resilient Drone Communication Networks with (or without) Deep Reinforcement Learning
- Authors: Chris Hicks, Vasilios Mavroudis, Myles Foley, Thomas Davies, Kate Highnam, Tim Watson
- Reason: Explores RL for drone network resilience with clear practical implications, yet its influence may be limited by its highly specific application focus.
- 7.6 Reinforcement Learning-Based Bionic Reflex Control for Anthropomorphic Robotic Grasping exploiting Domain Randomization
- Authors: Hirakjyoti Basumatary, Daksh Adhar, Atharva Shrawge, Prathamesh Kanbaskar, Shyamanta M. Hazarika
- Reason: Applies RL to robotic grasping with a novel approach, but as a single study rather than a broad algorithmic contribution, its impact may be more confined to robotics. A sketch of per-episode domain randomization follows.
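To make the domain-randomization ingredient concrete, here is a minimal sketch of resampling physical parameters each episode so a learned policy is exposed to dynamics variation; the parameter names, ranges, and the `env.reset` hook are assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_episode_params() -> dict:
    """Draw fresh simulator parameters for one training episode."""
    return {
        "object_mass": rng.uniform(0.05, 0.5),     # kg
        "friction": rng.uniform(0.3, 1.2),         # contact friction coefficient
        "object_scale": rng.uniform(0.8, 1.2),     # relative object size
        "sensor_noise_std": rng.uniform(0.0, 0.02) # tactile/proprioceptive noise
    }

for episode in range(3):
    params = randomized_episode_params()
    # env.reset(**params)  # hypothetical environment hook for a grasping simulator
    print(f"episode {episode}: {params}")
```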