- 9.9 AlphaZero Gomoku
- Authors: Wen Liang, Chao Yu, Brian Whiteaker, Inyoung Huh, Hua Shao, Youzhi Liang
- Reason: This paper offers an innovative application of the AlphaZero technique to Gomoku, an ancient tactical board game, demonstrating AlphaZero’s adaptability to games beyond Go. The research combines deep learning with Monte Carlo tree search, strengthening its relevance to decision-making in complex settings such as board games.
- 9.5 Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance
- Authors: Qisen Yang, Shenzhi Wang, Qihang Zhang, Gao Huang, Shiji Song
- Reason: The research presents an analytical approach to reinforcement learning, introducing a novel plug-in method named Guided Offline RL (GORL). Its practicality is backed by its adaptability across numerous offline RL algorithms, where it yields substantial performance improvements.
- 9.2 An Ensemble Method of Deep Reinforcement Learning for Automated Cryptocurrency Trading
- Authors: Shuyang Wang, Diego Klabjan
- Reason: The paper proposes an ensemble method to enhance the performance of trading strategies trained by deep reinforcement learning algorithms in the highly stochastic environment of intraday cryptocurrency trading. It also includes steps to tackle the non-stationarity of financial data, a novel approach that may significantly impact the finance and trading industry.
- 9.2 Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning
- Authors: Qisen Yang, Huanqian Wang, Mukun Tong, Wenjie Shi, Gao Huang, Shiji Song
- Reason: This paper proposes an innovative framework for interpretability in reinforcement learning. The authors’ focus on rewards as an essential principle for interpreting RL agents promises to enhance the understanding and application of RL models.
- 9.1 Physics Informed Reinforcement Learning: Review and Open Problems
- Authors: Chayan Banerjee, Kien Nguyen, Clinton Fookes, Maziar Raissi
- Reason: The paper reviews a novel approach in the reinforcement learning field, physics-informed reinforcement learning (PIRL), highlighting new perspectives and gaps for future research. The authors’ experience and unique taxonomy make it a valuable resource for both new researchers and advanced investigations in the field.
- 9.0 Learning-Aware Safety for Interactive Autonomy
- Authors: Haimin Hu, Zixu Zhang, Kensuke Nakamura, Andrea Bajcsy, Jaime F. Fisac
- Reason: The paper demonstrates important applications in the safety of autonomous systems. It helps narrow the gap between the theoretical safety analysis of robotic systems and practical learning dynamics, which is instrumental for real-world deployment of artificial intelligence.
- 9.0 RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking
- Authors: Homanga Bharadhwaj, Jay Vakil, Mohit Sharma, Abhinav Gupta, Shubham Tulsiani, Vikash Kumar
- Reason: The paper uses semantic augmentations and action representations to build a framework for multi-task manipulation skills, addressing the paucity of robotics datasets. Real-world experiments demonstrate its usability and support the validity of the research.
- 8.9 Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network
- Authors: Peer Nagy, Sascha Frey, Silvia Sapora, Kang Li, Anisoara Calinescu, Stefan Zohren, Jakob Foerster
- Reason: This paper tackles the significant problem of developing a generative model of realistic order flow in financial markets. The approach the authors suggest, aimed at uses such as high-frequency reinforcement learning, could lead to new applications that go beyond forecasting and is likely to influence future research.
- 8.9 Explaining grokking through circuit efficiency
- Authors: Vikrant Varma, Rohin Shah, Zachary Kenton, János Kramár, Ramana Kumar
- Reason: The paper gives valuable insights into grokking, a vital aspect of how neural networks generalize. By confirming four novel predictions about grokking, the authors have significantly added to the understanding of this peculiar phenomenon.
- 8.8 Hawkeye: Change-targeted Testing for Android Apps based on Deep Reinforcement Learning
- Authors: Chao Peng, Zhengwei Lv, Jiarong Fu, Jiayuan Liang, Zhao Zhang, Ajitha Rajan, Ping Yang
- Reason: The paper introduces an innovative application of reinforcement learning to improve the testing of mobile apps, a popular and important technology in today’s digital age. However, the reinforcement learning field is broad, and later publications might overtake its influence.
- 8.7 Neurosymbolic Reinforcement Learning and Planning: A Survey
- Authors: K. Acharya, W. Raza, C. M. J. M. Dourado Jr, A. Velasquez, H. Song
- Reason: As a literature survey of the emerging field of Neurosymbolic Reinforcement Learning, this paper analyzes the constituent elements of this approach and categorizes different works based on their applications, providing a useful overview of the field.
- 8.7 Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach
- Authors: Vimal K B, Saketh Bachu, Tanmay Garg, Niveditha Lakshmi Narasimhan, Raghavan Konuru, Vineeth N Balasubramanian
- Reason: The paper makes a useful contribution to estimating the transferability of ensemble models for downstream tasks. By benchmarking against state-of-the-art metrics, the authors demonstrate significant improvements in transfer learning tasks.
- 8.6 Efficient RL via Disentangled Environment and Agent Representations
- Authors: Kevin Gmelin, Shikhar Bahl, Russell Mendonca, Deepak Pathak
- Reason: The paper proposes a novel method for learning structured representations for reinforcement learning algorithms. Reported to outperform other state-of-the-art model-free approaches, it brings significant value to the RL community.
- 8.5 Stabilize to Act: Learning to Coordinate for Bimanual Manipulation
- Authors: Jennifer Grannen, Yilin Wu, Brandon Vu, Dorsa Sadigh
- Reason: With a focus on bimanual robotic systems, this paper leverages the idea of role assignment to simplify the environment and facilitate task accomplishment. The success achieved across the tested tasks holds promise for the field of robotics, particularly for more complex tasks.
- 8.1 Multimodal Contrastive Learning with Hard Negative Sampling for Human Activity Recognition
- Authors: Hyeongju Choi, Apoorva Beedu, Irfan Essa
- Reason: This paper proposes an innovative approach to hard negative sampling for multimodal Human Activity Recognition (HAR). The methodology proves highly robust in learning strong feature representations for HAR tasks and outperforms other methods on benchmark datasets, suggesting potential future application and influence in this area.