8.9 Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
- Authors: Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang
- Reason: Initiates the theoretical study of multi-party RLHF, a novel field; potential for high impact given its theoretical foundations and practical applications in aggregating diverse human preferences.
8.7 Overcoming Negative Transfer in Continual Reinforcement Learning
- Authors: Hongjoon Ahn, Jinu Hyeon, Youngmin Oh, Bosun Hwang, Taesup Moon
- Reason: Addresses a crucial challenge in continual reinforcement learning (CRL) with a new, effective method, validated through comprehensive experiments.
8.4 Simulating Battery-Powered TinyML Systems Optimised using Reinforcement Learning in Image-Based Anomaly Detection
- Authors: Jared M. Ping, Ken J. Nixon
- Reason: Contributes to the practical implications of reinforcement learning in optimizing energy consumption on IoT systems, an area of growing interest for smart industry solutions.
8.2 Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
- Authors: Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu
- Reason: Introduces a novel solution to a pervasive issue in RLHF, validated with experiments showing improved performance.
8.0 Switching the Loss Reduces the Cost in Batch Reinforcement Learning
- Authors: Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvári
- Reason: Proposes a novel approach to batch RL with potential for practical impact in cost-sensitive environments.