- 9.6 Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization
- Authors: Jinxin Liu, Hongyin Zhang, Zifeng Zhuang, Yachen Kang, Donglin Wang, Bin Wang
- Reason: This paper tackles offline policy optimization, an essential challenge in reinforcement learning. It introduces a non-iterative bi-level paradigm that avoids iterative error propagation across the two levels. The proposed method, DROP, shows promising empirical results, outperforming prior non-iterative offline RL counterparts.
- 9.5 Taming the Exponential Action Set: Sublinear Regret and Fast Convergence to Nash Equilibrium in Online Congestion Games
- Authors: Jing Dong, Jingyu Wu, Siwei Wang, Baoxiang Wang, Wei Chen
- Reason: The paper makes significant contributions to the congestion game model, a central model in resource allocation and distribution. The authors are reputable in the fields of game theory and machine learning. The regret notion at stake is sketched below.
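  For context, a hedged sketch of the standard objective in this setting (notation mine, not necessarily the paper's): each player $i$ repeatedly picks an action $a_i^{(t)}$ from a set $\mathcal{A}_i$ that can be exponentially large in the number of resources, and aims for sublinear external regret:

  $$
  R_i(T) = \sum_{t=1}^{T} c_i\left(a_i^{(t)}, a_{-i}^{(t)}\right) - \min_{a \in \mathcal{A}_i} \sum_{t=1}^{T} c_i\left(a, a_{-i}^{(t)}\right) = o(T).
  $$

  When every player runs a no-regret algorithm, time-averaged play converges to a coarse correlated equilibrium; the title's stronger claim is fast convergence to Nash equilibrium despite the exponential action set.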
- 9.3 Curvature-enhanced Graph Convolutional Network for Biomolecular Interaction Prediction
- Authors: Cong Shen, Pingjian Ding, Junjie Wee, Jialin Bi, Jiawei Luo, Kelin Xia
- Reason: This paper presents a novel method that uses curvature-enhanced graph convolutional networks for predicting biomolecular interactions. The authors' established track record in geometric deep learning lends the work additional weight. A toy curvature computation is sketched below.
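  To make "curvature-enhanced" concrete, a minimal sketch assuming a combinatorial (Forman-style) curvature is used to reweight edges before GCN-style aggregation; the paper may well use a different curvature (e.g., Ollivier-Ricci), so treat everything here as illustrative:

  ```python
  import numpy as np

  def forman_curvature(adj: np.ndarray) -> np.ndarray:
      """Simplest Forman-Ricci curvature of each edge in an unweighted
      graph: F(u, v) = 4 - deg(u) - deg(v), ignoring triangle terms."""
      deg = adj.sum(axis=1)
      curv = np.zeros_like(adj, dtype=float)
      for u, v in zip(*np.nonzero(adj)):
          curv[u, v] = 4.0 - deg[u] - deg[v]
      return curv

  def curvature_weighted_propagation(adj, features):
      """One GCN-style propagation step whose edge weights come from
      curvature, squashed to (0, 1) with a sigmoid and row-normalized."""
      curv = forman_curvature(adj)
      weights = np.where(adj > 0, 1.0 / (1.0 + np.exp(-curv)), 0.0)
      row_sum = weights.sum(axis=1, keepdims=True) + 1e-8
      return (weights / row_sum) @ features

  adj = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]])
  print(curvature_weighted_propagation(adj, np.eye(4)))
  ```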
- 9.3 A General Framework for Sequential Decision-Making under Adaptivity Constraints
- Authors: Nuoya Xiong, Zhuoran Yang, Zhaoran Wang
- Reason: This paper provides a novel and broad framework for reinforcement learning under adaptivity constraints. The theoretical results and the wide range of models covered indicate a potentially significant contribution to the field. One common formalization of such constraints is sketched below.
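  For readers unfamiliar with the term, a hedged sketch of what an adaptivity constraint typically means (the paper's exact formulation may differ): the learner may only switch policies rarely, e.g. its switching cost over $K$ episodes,

  $$
  N_{\mathrm{switch}} = \sum_{k=1}^{K-1} \mathbb{1}\{\pi_{k+1} \neq \pi_k\},
  $$

  must stay under a budget (the rare-policy-switch model), or policy updates may only happen at the ends of a small number of pre-scheduled batches (the batch-learning model).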
- 9.1 A First Order Meta Stackelberg Method for Robust Federated Learning
- Authors: Yunian Pan, Tao Li, Henger Li, Tianyi Xu, Zizhan Zheng, Quanyan Zhu
- Reason: This work studies adversarial federated learning and proposes a novel meta-Stackelberg method to defend against a range of potential threats. The authors have substantial experience in federated learning and adversarial learning. The underlying game structure is sketched below.
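  As background, a generic Stackelberg formulation of defense (not necessarily the paper's exact setup) casts the defender as a leader committing to a strategy $x$ and the attacker as a follower best-responding to it:

  $$
  x^{\star} \in \arg\min_{x} U\left(x, y^{\star}(x)\right), \qquad y^{\star}(x) \in \arg\max_{y} V(x, y),
  $$

  where $U$ is the defender's loss and $V$ the attacker's utility. A first-order method, as in the title, approximates this bi-level problem with gradient estimates instead of solving the inner best-response exactly.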
- 9.1 PolicyClusterGCN: Identifying Efficient Clusters for Training Graph Convolutional Networks
- Authors: Saket Gurukar, Shaileshh Bojja Venkatakrishnan, Balaraman Ravindran, Srinivasan Parthasarathy
- Reason: PolicyClusterGCN proposes a novel method for optimizing the training of graph convolutional networks (GCNs) using reinforcement learning. It casts the search for good training clusters as a Markov decision process (MDP) and has shown superior results on real-world and synthetic datasets. A toy version of such an MDP appears below.
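  A minimal, self-contained sketch of what "clustering as an MDP" can look like. Assumptions to flag loudly: the reward here is graph modularity, a cheap stand-in for the validation accuracy of a GCN trained on the clusters, and the greedy acceptance rule is a placeholder for a learned policy; neither choice is taken from the paper.

  ```python
  import random
  import networkx as nx
  from networkx.algorithms.community import modularity

  def reassign(partition, node, target):
      """Action: move `node` into cluster `target`; returns a new partition."""
      new = [set(c) - {node} for c in partition]
      new[target].add(node)
      return [c for c in new if c]  # drop clusters that became empty

  def episode(graph, n_clusters=3, steps=200, seed=0):
      rng = random.Random(seed)
      nodes = list(graph.nodes)
      # State: a random initial partition of the nodes.
      partition = [set() for _ in range(n_clusters)]
      for v in nodes:
          partition[rng.randrange(n_clusters)].add(v)
      partition = [c for c in partition if c]
      score = modularity(graph, partition)
      for _ in range(steps):
          node, target = rng.choice(nodes), rng.randrange(len(partition))
          candidate = reassign(partition, node, target)
          reward = modularity(graph, candidate) - score  # quality improvement
          if reward > 0:  # placeholder policy: accept improving moves
              partition, score = candidate, score + reward
      return partition, score

  clusters, q = episode(nx.karate_club_graph())
  print(f"{len(clusters)} clusters, modularity={q:.3f}")
  ```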
- 9.1 Estimating player completion rate in mobile puzzle games using reinforcement learning
- Authors: Jeppe Theiss Kristensen, Arturo Valdivia, Paolo Burelli
- Reason: The authors' work offers practical insights into applying reinforcement learning in mobile gaming. The combination of player data analysis and reinforcement learning is broadly applicable across the games industry, for instance in level design and difficulty tuning.
- 9.0 Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks
- Authors: Maxime Chevalier-Boisvert, Bolun Dai, Mark Towers, Rodrigo de Lazcano, Lucas Willems, Salem Lahlou, Suman Pal, Pablo Samuel Castro, Jordan Terry
- Reason: This paper presents libraries of modular 2D and 3D environments for goal-oriented reinforcement learning. Given the increasing importance of custom environments in the RL community, and the prominence of the authors' affiliations, this work is expected to have an extensive impact. A basic usage example follows below.
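  Minigrid exposes the standard Gymnasium API. A minimal usage sketch (the environment ID is a standard one from the Minigrid suite; adjust if your installed version differs):

  ```python
  import gymnasium as gym
  import minigrid  # noqa: F401  (importing registers the MiniGrid-* environments)

  env = gym.make("MiniGrid-Empty-8x8-v0")
  obs, info = env.reset(seed=42)
  # Observations are dicts: a partial egocentric 'image', the agent's
  # 'direction', and a natural-language 'mission' string.
  print(obs["mission"], obs["image"].shape)

  done = False
  while not done:
      action = env.action_space.sample()  # random-policy placeholder
      obs, reward, terminated, truncated, info = env.step(action)
      done = terminated or truncated
  env.close()
  ```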
- 8.9 Large Sequence Models for Sequential Decision-Making: A Survey
- Authors: Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang
- Reason: This survey presents an exhaustive review of the use of large sequence models, such as Transformers, in sequential decision-making tasks. Given the authors' authoritative background and the detailed, comprehensive nature of the survey, the paper is likely to be influential.
- 8.9 STEF-DHNet: Spatiotemporal External Factors Based Deep Hybrid Network for Enhanced Long-Term Taxi Demand Prediction
- Authors: Sheraz Hassan, Muhammad Tahir, Momin Uppal, Zubair Khalid, Ivan Gorban, Selim Turki
- Reason: STEF-DHNet introduces a new model for predicting ride-hailing demand using a hybrid of a convolutional neural network (CNN) and long short-term memory (LSTM). The method provides a way to incorporate external factors into the prediction, and it performs well over long periods without retraining. A skeletal hybrid of this kind is sketched below.
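  A minimal sketch of a CNN+LSTM hybrid with an external-factor branch, assuming demand arrives as a grid over the city and external factors as a flat vector per time step; the layer sizes and fusion-by-concatenation design are illustrative, not taken from the paper.

  ```python
  import torch
  import torch.nn as nn

  class DemandNet(nn.Module):
      def __init__(self, grid=16, ext_dim=8, hidden=64):
          super().__init__()
          # Spatial branch: CNN encodes each time step's demand grid.
          self.cnn = nn.Sequential(
              nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
              nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
              nn.AdaptiveAvgPool2d(4), nn.Flatten(),  # -> 32*4*4 = 512
          )
          # Temporal branch: LSTM over per-step CNN embeddings concatenated
          # with external factors (weather, events, ...).
          self.lstm = nn.LSTM(512 + ext_dim, hidden, batch_first=True)
          self.head = nn.Linear(hidden, grid * grid)
          self.grid = grid

      def forward(self, demand, ext):
          # demand: (B, T, grid, grid); ext: (B, T, ext_dim)
          B, T, H, W = demand.shape
          z = self.cnn(demand.reshape(B * T, 1, H, W)).reshape(B, T, -1)
          out, _ = self.lstm(torch.cat([z, ext], dim=-1))
          return self.head(out[:, -1]).reshape(B, self.grid, self.grid)

  model = DemandNet()
  pred = model(torch.randn(2, 12, 16, 16), torch.randn(2, 12, 8))
  print(pred.shape)  # torch.Size([2, 16, 16])
  ```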
- 8.8 Estimating the Value of Evidence-Based Decision Making
- Authors: Alberto Abadie, Anish Agarwal, Guido Imbens, Siwei Jia, James McQueen, Serguei Stepaniants
- Reason: This paper examines the value of evidence-based decision making, a topic of significant importance to both industry and academia. The authors are well respected in the fields of statistics and machine learning.
- 8.7 Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning with Encoder-Decoder Model using Action Query
- Authors: Hidenori Itaya, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Komei Sugiura
- Reason: This paper presents innovative work on applying Transformer models to reinforcement learning, aiming to provide more insight into the decision-making of RL agents, which is especially relevant given the black-box nature of deep learning systems. A generic sketch of attention-based explanation follows below.
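  To illustrate the general idea of visual explanation from an encoder-decoder with action queries, a hedged sketch: learnable per-action query vectors cross-attend to CNN feature-map tokens, and the attention weights double as a spatial saliency map per action. All module names and sizes here are mine; the paper's architecture will differ in detail.

  ```python
  import torch
  import torch.nn as nn

  class ActionQueryAttention(nn.Module):
      def __init__(self, n_actions=6, dim=64, fmap=7):
          super().__init__()
          self.queries = nn.Parameter(torch.randn(n_actions, dim))  # one query per action
          self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
          self.q_head = nn.Linear(dim, 1)  # per-action value/logit head
          self.fmap = fmap

      def forward(self, feat_tokens):
          # feat_tokens: (B, fmap*fmap, dim) tokens from a CNN encoder.
          B = feat_tokens.shape[0]
          q = self.queries.unsqueeze(0).expand(B, -1, -1)
          out, weights = self.attn(q, feat_tokens, feat_tokens,
                                   need_weights=True, average_attn_weights=True)
          q_values = self.q_head(out).squeeze(-1)                  # (B, n_actions)
          saliency = weights.reshape(B, -1, self.fmap, self.fmap)  # per-action map
          return q_values, saliency

  m = ActionQueryAttention()
  qv, sal = m(torch.randn(2, 49, 64))
  print(qv.shape, sal.shape)  # torch.Size([2, 6]) torch.Size([2, 6, 7, 7])
  ```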
- 8.6 Near Optimal Heteroscedastic Regression with Symbiotic Learning
- Authors: Dheeraj Baby, Aniket Das, Dheeraj Nagaraj, Praneeth Netrapalli
- Reason: This paper makes significant theoretical contributions to heteroscedastic linear regression, a crucial problem in fields such as statistics, econometrics, and machine learning. The authors propose an effective solution that improves on the previous best-known upper bound. Note, however, that this paper does not strictly fall within the scope of reinforcement learning. The problem setup is sketched below.
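  For reference, the classical heteroscedastic linear regression setup (stated generically; the paper's exact noise model and estimator may differ): observations follow

  $$
  y_i = \langle w^{\star}, x_i \rangle + \sigma(x_i)\,\epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, 1),
  $$

  where the noise scale $\sigma(x_i)$ varies with the covariates instead of being constant. The difficulty is circular: estimating $w^{\star}$ well requires a good noise model and vice versa, hence the appeal of estimating both jointly.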
- 8.6 On Imitation in Mean-field Games
- Authors: Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist
- Reason: This paper introduces a new concept in the challenging field of imitation learning applied to mean-field games. By proposing a novel solution concept and considering conditions beyond conventional imitation learning, it could inspire further studies and advancements.
- 8.5 Safe Reinforcement Learning with Dead-Ends Avoidance and Recovery
- Authors: Xiao Zhang, Hai Zhang, Hongtu Zhou, Chang Huang, Di Zhang, Chen Ye, Junqiao Zhao
- Reason: This safe reinforcement learning method addresses the challenge of balancing safety, both during and after training, against the need for exploration, which could prove pivotal as RL is applied to realistic scenarios. One common formalization of a dead-end is given below.
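  A hedged formalization of the dead-end notion (a common definition in the safe-RL literature; the paper may use a variant): a state $s$ is a dead-end if failure is unavoidable from it under every policy,

  $$
  \min_{\pi} \Pr\left(\text{reach an unsafe terminal state} \mid s_0 = s, \pi\right) = 1,
  $$

  so a recovery-style method tries to detect and steer away from states whose successors are likely dead-ends, while leaving exploration unconstrained elsewhere.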
- 8.3 Decision-Dependent Distributionally Robust Markov Decision Process Method in Dynamic Epidemic Control
- Authors: Jun Song, William Yang, Chaoyue Zhao
- Reason: This paper proposes a novel approach to the dynamic epidemic control problem that leverages decision-dependent distributionally robust Markov decision processes, which is particularly relevant given the ongoing COVID-19 pandemic. The robust Bellman recursion at the core of such methods is sketched below.
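  For orientation, the generic distributionally robust Bellman equation (the decision-dependent twist, stated loosely, is that the ambiguity set may depend on the chosen action):

  $$
  V(s) = \max_{a \in \mathcal{A}} \min_{P \in \mathcal{U}(s, a)} \left[ r(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s') \right],
  $$

  where $\mathcal{U}(s, a)$ is an ambiguity set of plausible transition models; optimizing against the worst case in $\mathcal{U}$ hedges against misspecified epidemic dynamics.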
- 8.3 Multi-Agent Deep Reinforcement Learning for Dynamic Avatar Migration in AIoT-enabled Vehicular Metaverses with Trajectory Prediction
- Authors: Junlong Chen, Jiawen Kang, Minrui Xu, Zehui Xiong, Dusit Niyato, Chuan Chen, Abbas Jamalipour, Shengli Xie
- Reason: This study explores multi-agent deep reinforcement learning in the emerging field of AIoT-enabled vehicular metaverses, coupling avatar migration with trajectory prediction. It addresses a complex systems problem and is likely to influence researchers in the Internet of Things and virtual reality.
- 8.1 Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching
- Authors: H.J. Terry Suh, Glen Chou, Hongkai Dai, Lujie Yang, Abhishek Gupta, Russ Tedrake
- Reason: The paper proposes a novel planning algorithm for offline RL that uses score matching to enable first-order planning in high-dimensional problems where previous methods fell short; this could pave the way for improved performance in this realm of RL. A sketch of the score-matching ingredient follows below.
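  The score-matching building block, in its standard denoising form (a generic sketch of the technique named in the title, not the paper's full planner): train $s_\theta$ so that $s_\theta(x) \approx \nabla_x \log p(x)$ by regressing the noise direction on perturbed data.

  ```python
  import torch
  import torch.nn as nn

  def dsm_loss(score_net: nn.Module, x: torch.Tensor, sigma: float = 0.1):
      """Denoising score matching: perturb data with Gaussian noise; the
      score of the perturbed conditional at x_tilde is -(noise)/sigma^2."""
      noise = torch.randn_like(x) * sigma
      x_tilde = x + noise
      target = -noise / sigma**2
      return ((score_net(x_tilde) - target) ** 2).sum(dim=-1).mean()

  score_net = nn.Sequential(nn.Linear(4, 128), nn.SiLU(), nn.Linear(128, 4))
  opt = torch.optim.Adam(score_net.parameters(), lr=1e-3)
  data = torch.randn(256, 4)  # stand-in for offline (state, action) batches
  for _ in range(100):
      opt.zero_grad()
      loss = dsm_loss(score_net, data)
      loss.backward()
      opt.step()
  # At convergence, score_net(x) estimates the gradient of the log data
  # density, which a planner can use as a first-order uncertainty signal.
  print(loss.item())
  ```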
- 7.8 Learning to Modulate pre-trained Models in RL
- Authors: Thomas Schmied, Markus Hofmarcher, Fabian Paischer, Razvan Pascanu, Sepp Hochreiter
- Reason: The authors present a new method to alleviate catastrophic forgetting when fine-tuning pretrained models in deep reinforcement learning, which could have a significant impact on multi-task learning. A generic modulation sketch follows below.
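  To illustrate the general flavor of modulating a frozen pretrained model rather than fine-tuning it, a minimal FiLM-style sketch (the paper's specific modulation mechanism may differ; all names here are illustrative): small learnable scale-and-shift parameters adapt frozen features per task.

  ```python
  import torch
  import torch.nn as nn

  class ModulatedBackbone(nn.Module):
      def __init__(self, backbone: nn.Module, feat_dim: int, n_tasks: int):
          super().__init__()
          self.backbone = backbone
          for p in self.backbone.parameters():
              p.requires_grad = False  # pretrained weights stay frozen
          # Only these per-task modulation parameters are trained.
          self.scale = nn.Parameter(torch.ones(n_tasks, feat_dim))
          self.shift = nn.Parameter(torch.zeros(n_tasks, feat_dim))

      def forward(self, x, task_id: int):
          h = self.backbone(x)
          return self.scale[task_id] * h + self.shift[task_id]

  backbone = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 32))
  model = ModulatedBackbone(backbone, feat_dim=32, n_tasks=4)
  trainable = [p for p in model.parameters() if p.requires_grad]
  print(sum(p.numel() for p in trainable))  # 2 * 4 * 32 = 256 parameters
  print(model(torch.randn(5, 8), task_id=2).shape)  # torch.Size([5, 32])
  ```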
- 7.3 Maximum State Entropy Exploration using Predecessor and Successor Representations
- Authors: Arnav Kumar Jain, Lucas Lehnert, Irina Rish, Glen Berseth
- Reason: The authors propose a method to learn exploration policies more efficiently through a state-entropy objective. This work could serve as a reference for researchers working on exploration in reinforcement learning. A standard entropy-estimation ingredient is sketched below.
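  For background, maximum state entropy methods need an estimate of the entropy of the visited-state distribution. A common nonparametric ingredient in this literature (the paper's own estimator is built on predecessor and successor representations instead) is a k-nearest-neighbor intrinsic reward:

  ```python
  import numpy as np

  def knn_entropy_reward(states: np.ndarray, k: int = 5) -> np.ndarray:
      """Particle-based intrinsic reward: r(s_i) = log(1 + distance to the
      k-th nearest neighbor). States far from previously visited ones get
      larger rewards, so maximizing the sum pushes toward high state entropy."""
      diff = states[:, None, :] - states[None, :, :]
      dists = np.sqrt((diff ** 2).sum(-1))       # pairwise Euclidean distances
      kth = np.sort(dists, axis=1)[:, k]         # k-th NN (index 0 is self, dist 0)
      return np.log1p(kth)

  rng = np.random.default_rng(0)
  visited = rng.normal(size=(100, 2))            # toy buffer of visited states
  novel = np.vstack([visited, [[6.0, 6.0]]])     # one clearly novel state appended
  r = knn_entropy_reward(novel)
  print(r[-1] > r[:-1].mean())                   # the novel state earns more: True
  ```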