- 9.1 Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
- Authors: Yevgen Chebotar, Quan Vuong, Alex Irpan, Karol Hausman, Fei Xia, Yao Lu, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz, Ofir Nachum, Sumedh Sontakke, Grecia Salazar, Huong T Tran, Jodilyn Peralta, Clayton Tan, Deeksha Manjunath, Jaspiar Singh, Brianna Zitkovich, Tomas Jackson, Kanishka Rao, Chelsea Finn, Sergey Levine
- Reason: Authored by a large team of experienced researchers in the field, this paper presents a reinforcement learning method for training multi-task policies from large offline datasets. The authors report that the method significantly outperforms previous techniques, making it a potentially influential paper for the field.
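To make the core idea concrete, here is a minimal sketch (not the authors' implementation: the dimension and bin counts are placeholders, and a deterministic random stub stands in for the trained Transformer) of how an autoregressive Q-function turns the intractable max over a multi-dimensional action into a sequence of per-dimension argmaxes:

```python
import numpy as np

NUM_DIMS, NUM_BINS = 4, 256  # placeholder action dimensionality / bins per dimension

def q_values(state, prefix):
    """Stand-in for the learned model: Q-values over the NUM_BINS choices
    for the next action dimension, given the state and the action
    dimensions already selected."""
    seed = hash((state, tuple(prefix))) % (2**32)
    return np.random.default_rng(seed).normal(size=NUM_BINS)

def greedy_action(state):
    """Autoregressive maximization: the max over the joint action space
    factorizes into NUM_DIMS sequential per-dimension argmaxes, costing
    NUM_DIMS * NUM_BINS evaluations instead of NUM_BINS ** NUM_DIMS."""
    prefix = []
    for _ in range(NUM_DIMS):
        prefix.append(int(np.argmax(q_values(state, prefix))))
    return prefix  # one bin index per action dimension

print(greedy_action("some-state"))
```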
- 9.0 Differentiable Quantum Architecture Search for Quantum Reinforcement Learning
- Authors: Yize Sun, Yunpu Ma, Volker Tresp
- Reason: This paper brings together two cutting-edge fields, quantum computing and reinforcement learning. The authors introduce a differentiable architecture search framework for quantum deep Q-learning, reporting significant improvements over manually designed circuits. This promising combination could have a substantial impact on the field.
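As a rough illustration of differentiable architecture search in general (a DARTS-style relaxation, not the paper's exact framework; the gate set and slot count below are invented for the example), each circuit slot holds a softmax mixture over candidate gates whose weights are trained by gradient descent alongside the gate parameters, then discretized:

```python
import numpy as np

# Candidate gates for each circuit slot (placeholder set).
CANDIDATES = ["RX", "RY", "RZ", "CZ"]

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# One architecture parameter per candidate per slot.
num_slots = 3
alpha = np.zeros((num_slots, len(CANDIDATES)))

# ... gradient updates to `alpha` would happen here during training ...
alpha[0, 1] += 2.0  # pretend training favored RY in slot 0

# Discretize: keep the highest-weight candidate in each slot.
circuit = [CANDIDATES[int(np.argmax(softmax(a)))] for a in alpha]
print(circuit)  # ['RY', 'RX', 'RX']
```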
- 8.9 Task Graph Offloading via Deep Reinforcement Learning in Mobile Edge Computing
- Authors: Jiagang Liu, Yun Mi, Xinyu Zhang
- Reason: This paper tackles task graph offloading in mobile edge computing, designing a deep reinforcement learning algorithm that adapts to environmental changes in order to improve user experience. It addresses a significant problem in today's dynamic edge-computing environments.
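A toy sketch of the kind of per-task decision such an algorithm learns (the state encoding is hypothetical and a seeded random stub replaces a trained Q-network; this is not the paper's method):

```python
import random

ACTIONS = ("local", "offload")  # run the task on-device or on an edge server

def q_stub(state, action):
    """Deterministic stand-in for a trained Q-network that would score each
    choice from queue lengths, link bandwidth, remaining tasks, etc."""
    random.seed(hash((state, action)))
    return random.random()

def schedule_next_task(state, epsilon=0.1):
    """Epsilon-greedy offloading decision for the next ready task in the graph."""
    if random.random() < epsilon:  # occasional exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_stub(state, a))

print(schedule_next_task("queue=3,bandwidth=high"))
```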
- 8.8 Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms
- Authors: Keru Wu, Yuansi Chen, Wooseok Ha, Bin Yu
- Reason: This paper tackles the problem of domain adaptation (DA), which arises when the training data distribution differs from the evaluation distribution. By incorporating conditionally invariant components (CICs), the authors improve the target-domain performance of DA algorithms, paving the way for more effective learning models.
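For intuition, here is a minimal sketch of a conditional-invariance penalty (a simple class-conditional mean-matching stand-in on random data, not the paper's CIC estimator): features are conditionally invariant when their distribution given the label matches across domains, so the toy penalty measures the per-class gap in feature means.

```python
import numpy as np

def cic_penalty(feats_a, feats_b, labels_a, labels_b):
    """Toy conditional-invariance penalty: for each class shared by the
    two domains, penalize the squared gap between the class-conditional
    feature means."""
    penalty = 0.0
    for c in np.intersect1d(labels_a, labels_b):
        mu_a = feats_a[labels_a == c].mean(axis=0)
        mu_b = feats_b[labels_b == c].mean(axis=0)
        penalty += np.sum((mu_a - mu_b) ** 2)
    return penalty

rng = np.random.default_rng(0)
fa, fb = rng.normal(size=(100, 8)), rng.normal(size=(100, 8))
la, lb = rng.integers(0, 3, 100), rng.integers(0, 3, 100)
print(cic_penalty(fa, fb, la, lb))
```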
- 8.7 Guide Your Agent with Adaptive Multimodal Rewards
- Authors: Changyeon Kim, Younggyo Seo, Hao Liu, Lisa Lee, Jinwoo Shin, Honglak Lee, Kimin Lee
- Reason: In a field dominated by goal-oriented learning models, this approach of training an agent with multimodal rewards computed from its interaction with the environment could be influential. The authors report that their adaptive return-conditioned policy (ARP) achieves superior generalization compared to traditional text-conditioned policies.
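As a rough sketch of the multimodal-reward idea (assuming CLIP-style embeddings; the vectors below are random placeholders and this is not the authors' code), the reward at each step is the alignment between the current visual observation and the language instruction, so it adapts to whatever the agent actually encounters:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def multimodal_reward(obs_embedding, text_embedding):
    """CLIP-style reward: similarity between the embedded visual observation
    and the embedded instruction, recomputed at every environment step."""
    return cosine(obs_embedding, text_embedding)

# A return-conditioned policy would then take (observation, cumulative
# multimodal return) as input and act to match a target return.
rng = np.random.default_rng(0)
obs, text = rng.normal(size=512), rng.normal(size=512)
print(multimodal_reward(obs, text))
```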