- 9.9 Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
- Authors: Nate Rahn, Pierluca D’Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare
- Reason: Accepted at NeurIPS 2023 by a team from Google Brain Montreal and Google Research Montreal. Introduces a distribution-aware optimization procedure for reinforcement learning (RL). Given the conference's high visibility and Google Brain's reputation, it could have significant influence.
- 9.8 An AI Chatbot for Explaining Deep Reinforcement Learning Decisions of Service-oriented Systems
- Authors: Andreas Metzger, Jone Bartel, Jan Laufer
- Reason: The authors are known for their work on AI chatbot technology. This paper tackles a significant challenge in reinforcement learning: explainability. The proposed system could make these complex algorithms more accessible to non-technical users, and its scheduled publication at a leading conference also suggests likely influence.
- 9.8 CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss
- Authors: Rakshith Sharma Srinivasa, Jaejin Cho, Chouchang Yang, Yashas Malur Saidutta, Ching-Hua Lee, Yilin Shen, Hongxia Jin
- Reason: Accepted to NeurIPS 2023, this paper from the Samsung AI team contributes a novel loss function for cross-modal contrastive learning. It further develops cross-modal learning, an important ingredient for many RL systems. Publication at NeurIPS increases its likely influence.
- 9.7 Recurrent Hypernetworks are Surprisingly Strong in Meta-RL
- Authors: Jacob Beck, Risto Vuorio, Zheng Xiong, Shimon Whiteson
- Reason: Published at NeurIPS 2023, this paper investigates the use of hypernetworks in deep meta-RL. The surprisingly strong results achieved by a simple recurrent baseline could influence future research directions.
- 9.6 Age Minimization in Massive IoT via UAV Swarm: A Multi-agent Reinforcement Learning Approach
- Authors: Eslam Eldeeb, Mohammad Shehab, Hirley Alves
- Reason: Applies multi-agent deep RL to the high-dimensional problem that arises when a swarm of UAVs collects fresh information from IoT devices. The real-world application may spur further research building on this methodology.
- 9.5 Implicit Sensing in Traffic Optimization: Advanced Deep Reinforcement Learning Techniques
- Authors: Emanuel Figetakis, Yahuza Bello, Ahmed Refaey, Lei Lei, Medhat Moussa
- Reason: Proposes a novel solution to traffic congestion, a major global problem. Applying advanced reinforcement learning methods to real-life problems like this tends to be highly influential in the field of AI.
- 9.5 Effective Multi-Agent Deep Reinforcement Learning Control with Relative Entropy Regularization
- Authors: Chenyang Miao, Yunduan Cui, Huiyun Li, Xinyu Wu
- Reason: Their new MARL approach, MACDPP, demonstrates superior performance over several baselines across a variety of control scenarios. This could influence the adoption and further development of MARL.
- 9.2 Adapting Double Q-Learning for Continuous Reinforcement Learning
- Authors: Arsenii Kuznetsov
- Reason: Proposes a way to address estimation bias in reinforcement learning algorithms, one of the known stumbling blocks in the field. Such improvements may significantly enhance the practicality of these algorithms.
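The bias-correction idea behind this line of work can be illustrated with the clipped double-Q target popularized by methods like TD3. This is a minimal sketch of the general technique, not the specific algorithm proposed in the paper; the function name and scalar single-transition signature are illustrative.

```python
def clipped_double_q_target(r, gamma, q1_next, q2_next, done):
    """Illustrative clipped double-Q target: taking the minimum of two
    critics' next-state value estimates counteracts the overestimation
    bias that plagues single-critic continuous-control methods.
    Arguments are plain floats for a single transition."""
    return r + gamma * (1.0 - done) * min(q1_next, q2_next)

# When the two critics disagree, the more pessimistic estimate is used:
target = clipped_double_q_target(r=1.0, gamma=0.99, q1_next=5.0, q2_next=4.0, done=0.0)
# target == 1.0 + 0.99 * 4.0 == 4.96
```

In practice the same min-of-two-critics trick is applied batch-wise over tensors, but the scalar form above captures why maintaining two independent value estimates reduces overestimation.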
- 9.1 Self-Recovery Prompting: Promptable General Purpose Service Robot System with Foundation Models and Self-Recovery
- Authors: Mimo Shirasaka, Tatsuya Matsushima, Soshi Tsunashima, Yuya Ikeda, Aoi Horo, So Ikoma, Chikaha Tsuji, Hikaru Wada, Tsunekazu Omija, Dai Komukai, Yutaka Matsuo, Yusuke Iwasawa
- Reason: Presents a novel system addressing a widely recognized need in robotics: high generalizability and adaptability. It also investigates three failure types and proposes solutions for them.
- 8.7 DefGoalNet: Contextual Goal Learning from Demonstrations For Deformable Object Manipulation
- Authors: Bao Thach, Tanner Watts, Shing-Hei Ho, Tucker Hermans, Alan Kuntz
- Reason: Offers an innovative solution to a key problem in robotic object manipulation. It presents a neural network that learns goal shapes from human demonstrations. This work could enable advancements in practical robotics applications.