Hyung Won Chung, a research scientist at OpenAI, delivers a lecture on Transformers, focusing on the dominant driving forces behind AI research and how understanding those forces can help predict the field's future trajectory. He argues that exponentially cheaper compute and the scaling it enables are the primary drivers, referencing Rich Sutton's plot of computational power over time. Chung discusses the "Bitter Lesson," emphasizing general methods with weaker modeling assumptions that scale with more data and compute. He analyzes the structures of encoder-decoder and decoder-only Transformer architectures, highlighting the trade-off between adding structure for short-term gains and removing it for long-term scalability, and concludes by encouraging the audience to revisit the assumptions baked into their own problems in order to shape the future of AI effectively.
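To make the architectural contrast concrete, below is a minimal NumPy sketch (not from the lecture; all names, shapes, and the toy data are illustrative assumptions). It shows the structural difference the summary refers to: an encoder-decoder model keeps input and output as separate sequences and lets the decoder cross-attend to an encoded source, whereas a decoder-only model concatenates everything into one stream and keeps only a causal mask as the remaining structural assumption.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # block disallowed positions
    return softmax(scores) @ v

rng = np.random.default_rng(0)
d = 8
src = rng.normal(size=(5, d))  # hypothetical "input" tokens (e.g. a source sentence)
tgt = rng.normal(size=(3, d))  # hypothetical "output" tokens generated so far

# Encoder-decoder: the decoder cross-attends to a separately encoded source.
# This bakes in the assumption that inputs and outputs are distinct sequences.
cross_out = attention(tgt, src, src)

# Decoder-only: source and target share one sequence; a causal mask is the
# only structural constraint that remains.
seq = np.concatenate([src, tgt], axis=0)
causal = np.tril(np.ones((len(seq), len(seq)), dtype=bool))
self_out = attention(seq, seq, seq, mask=causal)

print(cross_out.shape)  # (3, 8): one output per target token
print(self_out.shape)   # (8, 8): one output per token in the combined stream
```

The sketch is intended only to illustrate the "more structure vs. less structure" trade-off discussed in the lecture, not to reproduce any particular model Chung presents.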