9.5 Aligning Agent Policy with Externalities: Reward Design via Bilevel RL
- Authors: Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Dinesh Manocha, Huazheng Wang, Furong Huang, Mengdi Wang
- Reason: Presents a new approach to aligning reinforcement learning policies with broader system objectives by formulating reward design as a bilevel optimization problem. The approach is supported by mathematical proofs, and the analysis is backed by several practical examples.
9.5 Provably Efficient Learning in Partially Observable Contextual Bandit
- Authors: Xueping Gong, Jiheng Zhang
- Reason: Delivers a novel approach to transfer learning in partially observable contextual bandits by combining causal bounds with classical bandit algorithms. This has the potential to improve performance in real-world applications.
9.2 SMARLA: A Safety Monitoring Approach for Deep Reinforcement Learning Agents
- Authors: Amirhossein Zolfagharian, Manel Abdellatif, Lionel C. Briand, Ramesh S
- Reason: Proposes a machine learning-based safety monitoring approach for DRL agents, which matters for safety-critical applications. The approach shows promising empirical results: accurate prediction of safety violations and early warning.
9.2 Generalized Early Stopping in Evolutionary Direct Policy Search
- Authors: Etor Arza, Leni K. Le Goff, Emma Hart
- Reason: Presents an early stopping method for direct policy search that saves computation time while keeping performance competitive, and that applies across a broad range of domains.
9.0 Doubly Robust Estimator for Off-Policy Evaluation with Large Action Spaces
- Authors: Tatsuhiro Shimizu
- Reason: Addresses a significant hurdle in off-policy evaluation: the bias-variance trade-off of estimators under large action spaces. The proposed doubly robust estimator aims to mitigate this trade-off.
8.9 qgym: A Gym for Training and Benchmarking RL-Based Quantum Compilation
- Authors: Stan van der Linde, Willem de Kok, Tariq Bontekoe, Sebastian Feld
- Reason: Presents a software framework for training and benchmarking reinforcement learning agents on quantum compilation tasks. It aims to bridge the gap between AI and quantum compilation research.
8.9 Scaling may be all you need for achieving human-level object recognition capacity with human-like visual experience
- Authors: A. Emin Orhan
- Reason: Advances the understanding of self-supervised learning by proposing that data size, model size, and image resolution be scaled up simultaneously, which could lead to human-level object recognition from human-like visual experience.
8.7 Vehicles Control: Collision Avoidance using Federated Deep Reinforcement Learning
- Authors: Badr Ben Elallid, Amine Abouaomar, Nabil Benamar, Abdellatif Kobbane
- Reason: Offers a comprehensive study of Federated Deep Reinforcement Learning for vehicle control, with collision avoidance as the goal. A comparative analysis shows the effectiveness of the proposed method and suggests promise for real-world applications.
8.7 When Federated Learning meets Watermarking: A Comprehensive Overview of Techniques for Intellectual Property Protection
- Authors: Mohammed Lansari, Reda Bellafqira, Katarzyna Kapusta, Vincent Thouvenot, Olivier Bettan, Gouenou Coatrieux
- Reason: A comprehensive overview of watermarking techniques in federated learning, contributing to the development of an important aspect of AI – intellectual property protection.
8.3 Reinforcement Learning for Financial Index Tracking
- Authors: Xianhua Peng, Chenyin Gong, Xue Dong He
- Reason: Proposes a novel discrete-time, infinite-horizon dynamic formulation of the financial index tracking problem. The authors solve it with an extension of a deep reinforcement learning method and demonstrate the effectiveness of their approach in an empirical study.