- 9.7 Adaptive reinforcement learning of multi-agent ethically-aligned behaviours: the QSOM and QDSOM algorithms
- Authors: Rémy Chaput, Olivier Boissier, Mathieu Guillermin
- Reason: The paper introduces novel algorithms that adapt to changes in the environment and in ethical considerations that affect AI systems. The proposed methods can handle multi-dimensional and continuous state and action spaces, and offer higher performance compared to baseline Reinforcement Learning algorithms.
- 9.5 Is Risk-Sensitive Reinforcement Learning Properly Resolved?
- Authors: Ruiwen Zhou, Minghuan Liu, Kan Ren, Xufang Luo, Weinan Zhang, Dongsheng Li
- Reason: A significant study that identifies a potential issue with risk-sensitive reinforcement learning methods. The authors propose a novel algorithm that addresses it and shows promising results in risk-sensitive policy learning.
- 9.3 GA-DRL: Graph Neural Network-Augmented Deep Reinforcement Learning for DAG Task Scheduling over Dynamic Vehicular Clouds
- Authors: Zhang Liu, Lianfen Huang, Zhibin Gao, Manman Luo, Seyyedali Hosseinalipour, Huaiyu Dai
- Reason: This paper presents a novel solution to DAG task scheduling over dynamic vehicular clouds. The authors combine a graph attention network (GAT) with deep reinforcement learning, and the resulting method outperforms existing benchmarks.
- 9.2 Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control
- Authors: Vivek Myers, Andre He, Kuan Fang, Homer Walke, Philippe Hansen-Estruch, Ching-An Cheng, Mihai Jalobeanu, Andrey Kolobov, Anca Dragan, Sergey Levine
- Reason: A strong research team with expertise in the field and a groundbreaking approach to teaching robots to follow natural language instructions. The method needs only a small amount of language data to let robots perform complex real-world tasks, which makes this paper very influential in the field of machine learning.
- 8.9 How Do Human Users Teach a Continual Learning Robot in Repeated Interactions?
- Authors: Ali Ayub, Jainish Mehta, Zachary De Francesco, Patrick Holthaus, Kerstin Dautenhahn, Chrystopher L. Nehaniv
- Reason: An in-depth study of continual learning robots and their interactions with human users. It identifies differences in teaching styles among users and shows how those differences affect the robot's performance, which provides invaluable insight into machine learning algorithms.
- 8.9 Collaborative Policy Learning for Dynamic Scheduling Tasks in Cloud-Edge-Terminal IoT Networks Using Federated Reinforcement Learning
- Authors: Do-Yup Kim, Da-Eun Lee, Ji-Wan Kim, Hyun-Suk Lee
- Reason: Proposes a novel collaborative policy learning framework for dynamic scheduling tasks that adaptively selects tasks for collaborative learning in each round, which speeds up the policy learning process.
- 8.7 Monte Carlo Policy Gradient Method for Binary Optimization
- Authors: Cheng Chen, Ruitao Chen, Tianyou Li, Ruichen Ao, Zaiwen Wen
- Reason: The paper introduces a unique probabilistic model for binary optimization problems and shows promising results for these NP-hard problems with a novel method derived from reinforcement learning.
- 8.6 RObotic MAnipulation Network (ROMAN) – Hybrid Hierarchical Learning for Solving Complex Sequential Tasks
- Authors: Eleftherios Triantafyllidis, Fernando Acero, Zhaocheng Liu, Zhibin Li
- Reason: A groundbreaking hybrid hierarchical learning framework for long, complex sequential tasks in robotic manipulation. The Robotic Manipulation Network (ROMAN) shows potential for various autonomous manipulation tasks that demand adaptive motor skills, which makes this work distinctly influential.
- 8.3 Risk-sensitive Actor-free Policy via Convex Optimization
- Authors: Ruoqi Zhang, Jens Sjölund
- Reason: Presents an optimal actor-free policy that optimizes a risk-sensitive criterion, a novel development in reinforcement learning. Experimental results demonstrate the efficacy of this approach in maintaining risk control, which has significant implications for automated decision-making.
- 8.1 Hiding in Plain Sight: Differential Privacy Noise Exploitation for Evasion-resilient Localized Poisoning Attacks in Multiagent Reinforcement Learning
- Authors: Md Tamjid Hossain, Hung La
- Reason: Novel and groundbreaking work on the potential vulnerability of differential privacy mechanisms in multiagent reinforcement learning. The adaptive attack, which exploits the inherent privacy noise, opens up a new area of research at the intersection of cybersecurity and machine learning.