1. 9.3 Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
2. 8.8 Reinforcement Learning for SAR View Angle Inversion with Differentiable SAR Renderer
3. 8.5 Data Assimilation in Chaotic Systems Using Deep Reinforcement Learning
4. 8.2 Towards Model-Free LQR Control over Rate-Limited Channels
5. 7.9 Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models