[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al, Princeton | Latent Space: The AI Engineer Podcast

In this episode of Latent Space, the host interviews Kevin Wang and his team from Princeton (Benjamin Eysenbach and Ishaan) about their project on scaling deep reinforcement learning (RL), which won a NeurIPS best paper award. Kevin discusses the motivation for exploring deeper networks in RL, drawing parallels to the success of large models in NLP and vision. The team explains how they combine self-supervised RL with architectural innovations such as residual connections and layer normalization to achieve significant performance gains. They also touch on the intersection of reinforcement learning and self-supervised learning, the potential impact on robotics, and the trade-offs between scaling depth and scaling width in neural networks. Finally, they discuss future research directions, including stitching in reinforcement learning, scaling up depth, width, and batch size, and vision-language-action models.

Outlines

00:02 Introduction to Deep Reinforcement Learning Research

04:29 Overcoming Scalability Challenges in Deep RL

09:16 The Scalable Objective of the RL Algorithm

13:10 Efficiency and Data Considerations for Deep RL

17:22 Analogies to Language Models and Future Directions

22:00 Future Research and Concluding Remarks