9.5 Federated Fine-tuning of Billion-Sized Language Models across Mobile Devices
- Authors: Mengwei Xu, Yaozong Wu, Dongqi Cai, Xiang Li, Shangguang Wang
- Reason: This paper presents a novel method for federated fine-tuning of large language models on mobile devices, addressing significant problems such as high memory consumption and slow convergence.
9.3 Universal Graph Continual Learning
- Authors: Thanh Duc Hoang, Do Viet Tung, Duy-Hung Nguyen, Bao-Sinh Nguyen, Huy Hoang Nguyen, Hung Le
- Reason: This study introduces a new approach to catastrophic forgetting in graph learning that applies to a wide variety of tasks. It uses a novel rehearsal mechanism for knowledge preservation and shows significant improvement over conventional methods.
9.1 Go Beyond Imagination: Maximizing Episodic Reachability with World Models
- Authors: Yao Fu, Run Peng, Honglak Lee
- Reason: The paper introduces a new intrinsic reward design for reinforcement learning that significantly improves performance on challenging navigation tasks and sample efficiency on locomotion tasks.
9.1 Traffic Light Control with Reinforcement Learning
- Authors: Taoyu Pan
- Reason: This paper introduces a novel use of deep Q-learning for real-time traffic light management. The proposed approach significantly reduces vehicle waiting time, queue length, and total travel time. Because traffic congestion is a common, worldwide problem, the potential impact of this research is broad.
9.0 Policy Diversity for Cooperative Agents
- Authors: Mingxi Tan, Andong Tian, Ludovic Denoyer
- Reason: This work proposes a method for promoting policy diversity in cooperative multi-agent reinforcement learning. By generating diverse team policies, it could make collaboration between artificial agents more efficient, which is crucial in a wide range of applications.
8.9 Large Language Models in Analyzing Crash Narratives – A Comparative Study of ChatGPT, BARD and GPT-4
- Authors: Maroa Mumtarin, Md Samiullah Chowdhury, Jonathan Wood
- Reason: This paper investigates the use of large language models for text analysis in traffic safety research, exploring their limits and usefulness, which could support future safety work.
8.9 Reinforcement Learning for Generative AI: A Survey
- Authors: Yuanjiang Cao, Lina Yao, Julian McAuley, Quan Z. Sheng
- Reason: This comprehensive survey reviews the use of reinforcement learning techniques in generative AI across many application areas. It is a useful reference for the current state and potential future directions of this intersection of machine learning disciplines.
8.7 Shielded Reinforcement Learning for Hybrid Systems
- Authors: Asger Horn Brorholt, Peter Gjøl Jensen, Kim Guldstrand Larsen, Florian Lorber, Christian Schilling
- Reason: The paper proposes an innovative strategy that uses a ‘shield’ to enforce safety in complex hybrid systems. The methodology could have important implications for challenging applications where safety is a critical concern.
8.6 Optimal Transport-inspired Deep Learning Framework for Slow-Decaying Problems: Exploiting Sinkhorn Loss and Wasserstein Kernel
- Authors: Moaad Khamlich, Federico Pichi, Gianluigi Rozza
- Reason: This study combines optimal transport theory and neural networks to improve reduced order models in scientific computing, demonstrating gains in accuracy and computational efficiency.
8.6 Recent Progress in Energy Management of Connected Hybrid Electric Vehicles Using Reinforcement Learning
- Authors: Min Hua, Bin Shuai, Quan Zhou, Jinhai Wang, Yinglong He, Hongming Xu
- Reason: This review covers recent developments in applying reinforcement learning to energy management for connected hybrid electric vehicles. Given rising interest in electric vehicles driven by environmental concerns, this work could inform the development of more efficient energy management systems.