9.5 Approximate Model-Based Shielding for Safe Reinforcement Learning
- Authors: Alexander W. Goodall, Francesco Belardinelli
- Reason: The paper proposes a model-based shielding algorithm that helps verify the performance of learned RL policies with respect to given safety constraints. Given its acceptance at ECAI 2023 and the importance of safety in RL, this work is expected to influence the field significantly.
9.3 PeRP: Personalized Residual Policies For Congestion Mitigation Through Co-operative Advisory Systems
- Authors: Aamir Hasan, Neeloy Chakraborty, Haonan Chen, Jung-Hoon Cho, Cathy Wu, Katherine Driggs-Campbell
- Reason: This work introduces PeRP, a co-operative advisory system that aims to mitigate traffic congestion while accounting for differences in driver behavior. Given its applicability to real-world scenarios, it is expected to have a high impact on the field.
9.2 Wasserstein Diversity-Enriched Regularizer for Hierarchical Reinforcement Learning
- Authors: Haorui Li, Jiaqi Liang, Linjing Li, Daniel Zeng
- Reason: The proposed regularizer enhances the diversity of subpolicies in hierarchical RL, tackling a known challenge in the domain. Experimental results indicate high performance and sample efficiency, suggesting potential for high influence.
9.0 Direct Gradient Temporal Difference Learning
- Authors: Xiaochi Qian, Shangtong Zhang
- Reason: The paper proposes a direct method for solving the double sampling issue in gradient temporal difference learning, addressing a notorious instability problem in RL. With strong theoretical backing, this paper could have a lasting impact on the field.
8.7 BiERL: A Meta Evolutionary Reinforcement Learning Framework via Bilevel Optimization
- Authors: Junyi Wang, Yuanyang Zhu, Zhi Wang, Yan Zheng, Jianye Hao, Chunlin Chen
- Reason: By proposing a bilevel optimization framework that tunes hyperparameters in parallel with the training of the main ERL model, this work provides an elegant solution to a known ERL problem and shows promise for high influence in the ERL community.