26 Feb 2026
52m

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka

Latent Space: The AI Engineer Podcast

Summary

The podcast explores the ethics and practicalities of "distillation attacks" on large language models (LLMs), where smaller models are trained on the outputs of larger, proprietary models. The discussion covers the challenges of detecting such attacks versus legitimate evaluation, noting that scale and pattern analysis are key detection methods. The participants debate whether companies should restrict model access via APIs to prevent distillation, with some arguing for product-exclusive models. The conversation shifts to the saturation and inherent flaws of coding benchmarks like SWE-Bench, including the discovery of unsolvable tasks and models memorizing solutions. They highlight the need for updated, private benchmarks and discuss the surprising capacity of LLMs to memorize data from a single pass, underscoring the understudied information theory of LLMs.

Outlines

Part 1: Introduction, Distillation Basics

00:00

Introduction to SAIL Live and Distillation in Machine Learning

Nathan Lambert introduces Swyx as the latest writer joining the SAIL coalition. Sebastian Raschka says he is glad to have Swyx on the show and mentions listening to Swyx's podcast episode about SWE-Bench. The discussion then turns to distillation, which Sebastian defines as taking a larger model, letting it generate outputs, and training a smaller model on those outputs. This is a common practice for creating smaller model variants that can run locally. The question then arises of what happens when a company trains its own model on synthetic data from another company's LLM.
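The definition above can be illustrated with a toy sketch. This is not how any lab actually distills a model: the "teacher" here is a hypothetical word-substitution function, the "student" is a simple count-based lookup, and every name in the code (`teacher_generate`, `Student`, `TEACHER_RULES`) is invented for illustration. It only shows the shape of the loop: collect teacher outputs, then fit a smaller model to them.

```python
from collections import Counter, defaultdict

# Hypothetical "large model": its rules are hidden from the student,
# which only ever sees prompts and the teacher's generated outputs.
TEACHER_RULES = {"cat": "feline", "dog": "canine", "big": "large"}


def teacher_generate(prompt: str) -> str:
    """Stand-in for querying a large model with a prompt."""
    return " ".join(TEACHER_RULES.get(word, word) for word in prompt.split())


def collect_synthetic_data(prompts):
    """Step 1: let the teacher generate an output for every prompt."""
    return [(prompt, teacher_generate(prompt)) for prompt in prompts]


class Student:
    """Toy 'small model': learns word-level mappings from teacher outputs."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, pairs):
        """Step 2: fit the student on (prompt, teacher output) pairs."""
        for prompt, completion in pairs:
            for src, tgt in zip(prompt.split(), completion.split()):
                self.counts[src][tgt] += 1

    def generate(self, prompt: str) -> str:
        out = []
        for word in prompt.split():
            if word in self.counts:
                # Emit the output the teacher produced most often for this word.
                out.append(self.counts[word].most_common(1)[0][0])
            else:
                out.append(word)  # unseen word: copy it through unchanged
        return " ".join(out)


pairs = collect_synthetic_data(["big cat", "dog runs", "big dog"])
student = Student()
student.train(pairs)

# The student imitates the teacher on a prompt it was never trained on.
print(student.generate("cat runs big"))
print(teacher_generate("cat runs big"))
```

The student never reads `TEACHER_RULES`. It recovers the teacher's behavior purely from generated outputs, which is why output access alone is enough for this style of distillation.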

05:04

Detecting Distillation Attacks and Privacy Concerns with LLM Usage

The conversation explores the possibility of distillation even at the frontier, such as distilling from Claude Opus to build Claude Sonnet. The discussion shifts to the terms of service of large labs, which state that outputs from their APIs cannot be used to train competitive AI models. The speakers discuss how to detect a distillation attack versus a regular evaluation, noting that it's difficult to distinguish between the two since the process is the same. The scale of data usage and patterns across similar accounts are key factors. Concerns are raised about privacy, as companies may be checking what users generate with LLMs.

Part 2: Industry Analysis, API Business Models

12:08

Analyzing Anthropic's Distillation Claims and the Timing of Data Collection

The discussion shifts to Anthropic blocking US companies, including OpenAI and xAI, from using its models, with xAI possibly accused of distilling. The speakers analyze Anthropic's blog post and the data it presents, questioning why DeepSeek used significantly less data than MiniMax. They suggest that Anthropic may be using the DeepSeek name strategically. The timing of data collection is crucial: MiniMax was found distilling during the training of MiniMax 2.5. While MiniMax was distilling, Anthropic released Opus 4.6, and MiniMax redirected nearly half of its traffic to the new model.

17:10

The Complexities of Distillation Data and API Business Models

The discussion explores the idea that the strongest model is not necessarily the best teacher, and that how well the teacher's token probabilities match those of the student's base model matters. Qwen dense models are often the best teachers for open-weight models. The speakers discuss the challenges of scaling pipelines to use larger models and how hard it is to make benchmark numbers go up. The conversation shifts to API business models, with Nathan suggesting that Anthropic gives off Apple vibes and might restrict models to products rather than APIs.
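One possible reading of the probability-matching point is that a teacher's outputs are easier to learn from when the student's base model already assigns them reasonably high probability. The sketch below scores candidate teachers that way, using the mean negative log-likelihood of each teacher's tokens under the base model. This interpretation, the function name `mean_nll`, the teacher labels, and all probability values are assumptions made up for illustration; none of them come from the episode.

```python
import math


def mean_nll(token_probs):
    """Mean negative log-likelihood of a sequence.

    token_probs holds the probability the student's base model assigned
    to each token of a teacher-written completion. Lower values mean the
    completion is closer to what the base model would already produce.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


# Illustrative, made-up per-token probabilities under the base model.
candidates = {
    "teacher_a": [0.60, 0.50, 0.70, 0.40],  # close to the base distribution
    "teacher_b": [0.10, 0.05, 0.20, 0.08],  # far from the base distribution
}

scores = {name: mean_nll(probs) for name, probs in candidates.items()}
best_teacher = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: mean NLL = {score:.3f}")
print("closest match:", best_teacher)
```

Under this reading, a stronger model whose outputs score poorly could still be the worse teacher, because the student has further to move to imitate it.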

25:32

API Business Models, Codex, and Future Discussion Topics

The speakers debate the viability of API business models, with Sebastian arguing that the API customer base is significant, especially for products like customer chatbots. He believes the API is a good business model if tokens are sold at a non-subsidized price. Swyx notes that the last three GPT-5 releases had Codex variants that shipped inside Codex rather than through the API, indicating a shift towards private models for products. The group then considers future discussion topics.

Part 3: SWE-Bench, Code Benchmarking

28:41

Defining SWE-Bench and the Challenges of Code Benchmarking

The discussion transitions to SWE-Bench, a coding benchmark used to compare the capabilities of LLMs. SWE-Bench comes from a paper out of Princeton and draws on thousands of real open-source issues and PRs. OpenAI adopted SWE-Bench but curated a subset of 500 high-quality tasks, released as SWE-Bench Verified. The speakers discuss the structure of SWE-Bench, where the task is to fix bugs in code. Swyx notes that OpenAI invested a lot of money and effort into making SWE-Bench Verified. The speakers discuss the saturation of SWE-Bench scores and the inherent noise in running these models.
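One source of noise is easy to quantify: with only 500 tasks, the sampling error on a pass rate is not negligible. The sketch below applies the standard binomial standard-error formula. The 80% pass rate is an illustrative assumption, not a figure from the episode, and this covers only sampling error, not run-to-run variation from model sampling or agent scaffolding.

```python
import math


def pass_rate_standard_error(pass_rate: float, num_tasks: int) -> float:
    """Binomial standard error of a pass rate measured on num_tasks tasks."""
    return math.sqrt(pass_rate * (1.0 - pass_rate) / num_tasks)


NUM_TASKS = 500        # size of the curated subset mentioned above
ASSUMED_PASS_RATE = 0.80  # illustrative assumption only

se = pass_rate_standard_error(ASSUMED_PASS_RATE, NUM_TASKS)
margin_95 = 1.96 * se  # approximate 95% margin under a normal approximation

print(f"standard error: {se * 100:.2f} points")
print(f"approx. 95% margin: +/- {margin_95 * 100:.2f} points")
```

Under these assumptions the margin is a few percentage points, so differences of a point or two between models near saturation are hard to separate from noise.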

37:12

Unintentional Cheating and the Information Theory of LLMs

The speakers discuss how models unintentionally cheat on benchmarks, such as including information from the future due to being trained on GitHub data. They highlight the ethical challenges of releasing full datasets in public. Sebastian finds it fascinating that models can memorize things with only one pass through the data. Swyx brings up the information theory of LLMs, questioning how models can memorize from one pass and how superposition works.

42:48

SWE-Bench Pro, Frontier Evals, and the Future of Benchmarking

The speakers discuss SWE-Bench Pro, which aims to fix the issues with SWE-Bench Verified. They note that SWE-Bench Pro has a limited budget compared to SWE-Bench Verified. The conversation shifts to frontier evals, which are even more expensive and will cost tens or hundreds of millions of dollars. Sebastian notes that coding is easier to evaluate than other domains. The speakers discuss the importance of human connection and expertise in the age of AI-generated content. Nathan concludes the podcast due to an upcoming meeting.
