Notion’s Token Town: 5 Rebuilds, 100+ Tools, MCP vs CLIs and the Software Factory Future — Simon Last & Sarah Sachs of Notion
Latent Space: The AI Engineer Podcast
Notion’s approach to building AI agents centers on a "software factory" model that prioritizes iterative prototyping, developer velocity, and deep integration with the company's existing data primitives. By treating agents as autonomous systems capable of self-verification and debugging, the team has moved away from rigid, few-shot prompting toward goal-oriented tool calling. This evolution relies on a robust evaluation framework where model behavior engineers—rather than just software developers—triage failures and refine agent performance. The platform’s strategy emphasizes "progressive disclosure" to manage tool complexity, ensuring that agents remain lightweight and permission-focused. Ultimately, Notion positions itself as the central system of record for enterprise work, where agents automate tedious bookkeeping and data capture, allowing human teams to focus on high-level collaboration and problem-solving rather than manual process management.
00:00 Evolution of Agentic Workflows and Tool-Calling Infrastructure
Notion’s journey into agentic capabilities began in 2022, when early efforts struggled with short context windows and limited model reasoning. The transition from fine-tuning custom tool-calling frameworks to leveraging advanced reasoning models like Claude 3.5 Sonnet marked a significant turning point. The current strategy focuses on building robust infrastructure that balances an "AGI-pilled" long-term vision with shipping immediate, high-utility features. Notion positions itself as an expert in collaboration, drawing an analogy to Datadog’s relationship with AWS: while the underlying cloud infrastructure is essential, the value lies in the specialized, user-centric layer built on top.
07:53 Scaling Engineering Culture and Distributed Agent Development
Notion’s engineering culture prioritizes "demos over memos" and maintains a high degree of flexibility, where management boundaries are loose to allow engineers to swarm on high-priority projects. The team structure is horizontal, with product engineering teams responsible for making their services work for both human and agent users. This approach treats agents as first-class citizens, anticipating a future where the majority of product traffic originates from automated agents rather than human interaction. Internal hackathons and a "Simon Vortex" project structure facilitate rapid prototyping and cross-team collaboration.
16:43 Advanced Evaluation Frameworks and Model Behavior Engineering
Maintaining high-quality agent performance requires a sophisticated evaluation system that goes beyond simple unit tests. Notion employs "Model Behavior Engineers" (MBEs) to bridge the gap between data science, product management, and prompt engineering. These engineers maintain "headroom evals"—benchmarks where the target pass rate is intentionally set low (around 30%) to track progress toward frontier capabilities. The goal is to create an end-to-end system where agents can download datasets, run evaluations, debug failures, and implement fixes with minimal human intervention, effectively moving up the abstraction ladder of software engineering.
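As a rough illustration of the headroom-eval idea, the sketch below scores a batch of results against a deliberately low target pass rate. Only the roughly 30% target comes from the episode summary. The `EvalResult` type, the status labels, and the 80% "saturated" cutoff are assumptions made for this example, not Notion's actual tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EvalResult:
    """One graded eval case (hypothetical shape)."""
    case_id: str
    passed: bool


def pass_rate(results: List[EvalResult]) -> float:
    """Fraction of cases that passed."""
    if not results:
        raise ValueError("cannot compute a pass rate over zero results")
    return sum(r.passed for r in results) / len(results)


def headroom_status(results: List[EvalResult],
                    target: float = 0.30,
                    saturated_at: float = 0.80) -> str:
    """Classify a headroom eval.

    target: the intentionally low pass rate mentioned in the summary.
    saturated_at: assumed cutoff above which the eval no longer
    measures frontier capability and should be replaced with harder cases.
    """
    rate = pass_rate(results)
    if rate >= saturated_at:
        return "saturated"
    if rate >= target:
        return "at_or_above_target"
    return "below_target"


def make_results(n_pass: int, n_total: int) -> List[EvalResult]:
    """Build a synthetic batch in which the first n_pass cases pass."""
    return [EvalResult(f"case-{i}", i < n_pass) for i in range(n_total)]
```

The point of keeping the target low is that a headroom eval is expected to fail most of the time. A suite that passes nearly everything says little about where the next model generation will help.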
27:46 Designing the Software Factory and Agentic Composability
The "software factory" concept aims to automate the development, debugging, and maintenance of codebases through agentic workflows. Key components include a specification layer using Markdown, a self-verification loop for testing, and a process for handling bugs via sub-agents. Composability is achieved by allowing agents to interact through shared data primitives (databases and pages) or by directly invoking other agents. This architecture enables complex tasks, such as triaging customer feedback or managing office operations, to be handled by a hierarchy of specialized agents, significantly reducing the need for manual human intervention.
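The specification, self-verification, and bug-fixing steps described above can be sketched as a small loop. This is a minimal sketch under stated assumptions: `implement`, `verify`, and `fix` are hypothetical callables standing in for a coding agent, a test runner, and a bug-fixing sub-agent, and the round limit is an arbitrary choice for the example.

```python
from typing import Callable, List, Tuple


def run_factory_loop(spec: str,
                     implement: Callable[[str], str],
                     verify: Callable[[str], List[str]],
                     fix: Callable[[str, List[str]], str],
                     max_rounds: int = 3) -> Tuple[str, int]:
    """Build a candidate from a Markdown spec, then verify and repair it.

    Returns the accepted candidate and the round in which it passed.
    Raises RuntimeError if verification still fails after max_rounds.
    """
    candidate = implement(spec)
    failures: List[str] = []
    for round_no in range(1, max_rounds + 1):
        failures = verify(candidate)      # self-verification step
        if not failures:
            return candidate, round_no
        if round_no < max_rounds:
            # Hand the failures to a (hypothetical) bug-fixing sub-agent.
            candidate = fix(candidate, failures)
    raise RuntimeError(f"still failing after {max_rounds} rounds: {failures}")


# Toy stand-ins so the loop can be exercised without any model calls.
def toy_implement(spec: str) -> str:
    return "draft"


def toy_verify(candidate: str) -> List[str]:
    return [] if "tests" in candidate else ["missing tests"]


def toy_fix(candidate: str, failures: List[str]) -> str:
    return candidate + " + tests"
```

With the toy stand-ins, `run_factory_loop("# Spec", toy_implement, toy_verify, toy_fix)` fails verification once, is repaired, and passes in the second round. The explicit round limit is what turns a persistent failure into an error a human can look at.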
36:47 Balancing CLI Tools, MCP, and Native Integrations
While Model Context Protocol (MCP) provides a standardized, permission-focused way to access tools, CLI-based environments offer superior bootstrapping capabilities, allowing agents to fix their own environment issues. Notion adopts a hybrid approach: using MCP for long-tail integrations while building native, high-performance tools for core services like Slack, Mail, and Calendar. This strategy ensures optimal latency and quality control. The evolution of these tools has moved away from complex, system-specific formats toward simpler, model-friendly standards like SQLite and Markdown, prioritizing what the model needs to function effectively over rigid internal data structures.
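One way to picture the hybrid approach is a router that prefers a native tool when one exists and falls back to an MCP-provided tool otherwise. The sketch below is illustrative only: `ToolRouter` is a hypothetical class, the tools are plain Python functions rather than real MCP clients, and "jira" is an invented example of a long-tail integration.

```python
from typing import Callable, Dict, Tuple

ToolFn = Callable[[str], str]


class ToolRouter:
    """Resolve a service name to a tool, preferring native implementations."""

    def __init__(self, native: Dict[str, ToolFn], mcp: Dict[str, ToolFn]):
        self.native = native   # hand-built tools for core services
        self.mcp = mcp         # long-tail tools reached through MCP

    def resolve(self, service: str) -> Tuple[str, ToolFn]:
        """Return (source, tool), where source is "native" or "mcp"."""
        if service in self.native:
            return "native", self.native[service]
        if service in self.mcp:
            return "mcp", self.mcp[service]
        raise KeyError(f"no tool registered for {service!r}")


# Stand-in tools: each just tags the query with where it was handled.
router = ToolRouter(
    native={"slack": lambda q: f"native slack search: {q}"},
    mcp={
        "slack": lambda q: f"mcp slack search: {q}",
        "jira": lambda q: f"mcp jira search: {q}",
    },
)
```

Here `router.resolve("slack")` picks the native tool even though an MCP version is registered, while `router.resolve("jira")` falls through to MCP. That mirrors the stated trade-off of latency and quality control for core services versus coverage for everything else.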
47:19 Optimizing Model Selection and Pricing for Agentic Tasks
The shift from few-shot prompting to goal-driven tool definitions has allowed for better distribution of tool ownership across teams. Notion’s pricing model for agents is abstracted into credits to account for varying costs across different serving tiers, GPUs, and model types. The company avoids forcing users into expensive, high-intelligence models for simple tasks, instead providing nudges to help users select the most cost-effective model for their specific needs. The ultimate goal is to fill the "triangle" of intelligence, price, and latency, ensuring that users have access to the right model for every task without unnecessary token waste.
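The intelligence, price, and latency "triangle" can be read as a constrained selection problem: pick the cheapest model that is smart enough and fast enough for the task. The sketch below assumes a hypothetical catalog. The model names, intelligence scores, credit costs, and latencies are invented for illustration and do not reflect Notion's actual pricing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelOption:
    name: str
    intelligence: int        # illustrative 1-10 capability score
    credits_per_task: float  # illustrative cost in abstract credits
    latency_s: float         # illustrative typical response time


def pick_model(options: List[ModelOption],
               min_intelligence: int,
               max_latency_s: float) -> Optional[ModelOption]:
    """Return the cheapest option meeting both constraints, or None."""
    eligible = [
        m for m in options
        if m.intelligence >= min_intelligence and m.latency_s <= max_latency_s
    ]
    return min(eligible, key=lambda m: m.credits_per_task, default=None)


# Hypothetical catalog covering three corners of the triangle.
CATALOG = [
    ModelOption("small", intelligence=3, credits_per_task=1.0, latency_s=0.5),
    ModelOption("medium", intelligence=6, credits_per_task=4.0, latency_s=2.0),
    ModelOption("large", intelligence=9, credits_per_task=15.0, latency_s=8.0),
]
```

A simple task such as `pick_model(CATALOG, min_intelligence=2, max_latency_s=5.0)` resolves to the cheapest option, which is the kind of nudge the summary describes. A `None` result signals a gap in the triangle that no available model fills.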
1:03:27 Data Capture and the Future of Meeting Intelligence
Meeting notes serve as a critical data capture primitive, transforming unstructured conversations into actionable signals for prioritization and performance reviews. By integrating agents into the meeting lifecycle—from pre-read generation to post-meeting task filing—Notion aims to remove the "bookkeeping" burden from human collaboration. Future developments focus on improving retrieval models specifically for agentic queries, which differ significantly from human search patterns. While the team explores partnerships with wearable technology for data capture, the core focus remains on being the best system of record where collaborative work and meeting intelligence reside.