The podcast explores Cursor's Cloud Agents, focusing on how they enhance software development workflows through automation and collaboration. A key innovation is the agent's ability to test its own code and provide a video demo of the changes, streamlining the review process. The discussion highlights three pillars: model testing, video demos, and full remote control access to a VM. Examples include automating error message improvements and bug fixes, showcasing the agent's capacity to use DevTools and manage file uploads. The conversation also covers the shift from individual to collaborative development, with Slack becoming a central hub for agent-driven tasks and team discussions. Ultimately, the speakers envision a future where AI agents handle more complex tasks, allowing developers to focus on higher-level design and strategic decisions.
Part 1: The New Paradigm of AI Development
00:00 Synergistic AI Output Through Diverse Models and Parallel Agent Swarms
Experiments show that synergistic output can be achieved by using models from different providers at the base level, which works better than a unified bottom model tier. The big unlock in the coming months will be widening the pipe by parallelizing more, whether through swarms of agents or parallel agents, to get much more done in the same amount of time.
00:53 Cursor's Cloud Agents: Giving Models Full Computer Access for Enhanced Development
Cloud Agents, Cursor's biggest launch, gives Cursor a computer. Cloud agents already ran on their own computers, but they were only sight-reading code. Giving the model the tools to onboard itself, plus full end-to-end computer use (pixels in, coordinates out) on a cloud computer with different apps in it, is a big unlock. The agent tests its changes, taking the time both to write the code and to test it end to end.
02:56 Pillars of Cursor's Cloud Agents: Testing, Video Demos, and Remote Access
The model tests its changes and comes back with a video of what it did. Reviewing a video is not a substitute for reviewing code, but it is a much easier entry point than glancing at a giant diff. Users also have full remote control access to the VM. Sometimes the agent builds Storybook-style galleries where you can see the component in action.
04:25 Benefits of Video Demos: Alignment and Shared Artifacts in Agent Communication
The videos have been super helpful, especially in cases where a common problem was under-specification in requests. Having the video up front just makes that alignment very clear. It's like you're talking about a shared artifact with the agent.
Part 2: Agent Capabilities and Technical Implementation
06:07 Cloud Agents for Backend Changes: Error Message Improvement and DevTools Usage
Cloud Agents can be used for front-end and back-end changes. One example is implementing a better error message for saving secrets. The agent opened DevTools, wrote JS to paste 5,000 characters, hit save, and got the new error message.
07:51 AGI-Pilled Approach: Giving Models Pixels and Removing Limitations for Intelligence
The approach is to give the model pixels and a brain in a box, removing limitations around context and capabilities such that the bottleneck should be the intelligence. Giving it its full VM and having it be onboarded with DevEx set up like a human would is a really big step change in capability.
08:51 Model Autonomy and Bug Reproduction: Enhancements in Cursor's Cloud Agents
Opus 4.5 and 4.6 and Codex 5.3 were additional step changes in the model's autonomy: its ability to go off, figure out the details, and come back when it is done. For bugs in particular, because the model has full access to its own VM, it can first reproduce the bug, record a video of the reproduction, fix the bug, and then record a video of the fix.
11:33 Bug Fixing Demo: Cloud Agent Workflow and the Slash Repro Command
The demo covers a bug on cursor.com/agents: if you attached images, removed them, and then submitted your prompt, the removed images would still be attached to the prompt. To verify the fix, the agent attaches images, removes some of them, hits send, and confirms that only one image is left in the attachment. There is also a /repro slash command, so you can simply write "fix this bug /repro".
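The attach, remove, submit sequence described above maps naturally onto a repro-first regression check. The following is a minimal Python sketch, not Cursor's actual code: `PromptComposer`, its methods, and the file names are all hypothetical stand-ins for a prompt box that accepts image attachments.

```python
from dataclasses import dataclass, field


@dataclass
class PromptComposer:
    """Hypothetical stand-in for a prompt box with image attachments."""

    attachments: list[str] = field(default_factory=list)

    def attach(self, name: str) -> None:
        self.attachments.append(name)

    def remove(self, name: str) -> None:
        self.attachments.remove(name)

    def submit(self, text: str) -> dict:
        # Build the payload from the current attachment list only, so that
        # removed images cannot leak into the submitted prompt.
        return {"text": text, "images": list(self.attachments)}


def repro_removed_images_are_not_submitted() -> dict:
    """Replay the steps from the bug report and return the submitted payload."""
    composer = PromptComposer()
    for name in ("a.png", "b.png", "c.png"):
        composer.attach(name)
    composer.remove("a.png")
    composer.remove("c.png")
    return composer.submit("describe this image")
```

Running the replay before the fix would be expected to show the removed images in the payload; running it after the fix should leave only the image that was never removed.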
13:35 Slash Commands: Enhancing Cloud Agent Functionality and Debugging
There are many slash commands, including fix bugbot, no test, and repro. Cloud Agent Diagnosis makes heavy use of the Datadog MCP. If there is a problem with a cloud agent, it spins up a group of sub-agents that use the Datadog MCP to explore the logs and find all of the problems that could have occurred.
15:26 Agent Transcripts and Datadog MCP: Debugging and Self-Healing Software
You can spin up an agent and give it access to another agent's transcript, either to debug something that happened or to continue the conversation, almost like forking it. The transcript includes all of the chain of thought. Datadog wants to own the self-healing software space.
Part 3: Evolution of the Developer Workflow
17:36 The Transition from Tab to Agents: A Rapidly Accelerating Shift in Coding
Cursor launched Cloud Agents in June last year, and things have evolved gradually since then. Agents are overtaking tab. A year ago the models were not good enough to do any of this, and the shift away from tab and autocomplete is accelerating. Work moves from agents handing back diffs while you stay in the weeds and give them 30-second to three-minute tasks, to giving them three-minute, 30-minute, or three-hour tasks.
19:44 Collaborative Development: Slack as the New IDE and Team Follow-Ups
There has been a shift from primarily individually driven development to a more collaborative style of development. Slack is almost like a development IDE. People are always @Cursor-ing, which kicks off a cloud agent. If Jonas kicks off @Cursor in a thread, others can follow up and add more context. Cursor can also tag other people who are not yet involved.
21:15 Human Collaboration and Production Bottlenecks: Scaling Compute for Software Pipelines
The work that is left for the humans to discuss in these threads is the nugget of what is actually interesting and relevant. It is not the boring detail of where an if statement goes; it is "do we want to ship this?" It is easy to get to "I have a PR for that," but still relatively hard to get from there to "I'm confident and ready to merge this."
23:19 AI Review and BugBot: Enhancing Code Quality and Confidence
There is a debate about whether AI is needed to review AI. The video often serves for alignment, and a code review process follows. BugBot has been great and is highly adopted internally. There is a code-level review, which looks at the actual code, and a feature-level review, which looks at the features.
25:09 Scaling DevEx: Democratization of Tools and Team Configuration
As cloud agents scale up parallelism and the amount of code generated, 10-person startups come to need the DevEx and pipelines that a 10,000-person company used to need. Cursor is working to make this great for teams, so that the 10th person who starts using Cursor on a team is immediately set up. Other people on the team can configure which MCPs and skills are available, much like plugins.
Part 4: Infrastructure and User Experience
27:34 Cloud Agent Adoption and VM Size Selection: Brain in a Box
For cloud agents, there has been a lot of adoption with smaller teams where the code bases are not quite as complex to set up. Users cannot yet choose the size of the VM, but that is planned. Cursor wants to be a brain in a box. The desktop should be something you can return to even after some days.
29:11 VM Persistence and Snapshotting: Balancing Statefulness and Scalability
The goal is to be able to log in to the VM with credentials without actually storing them in any secret store. There is a Dockerfile-based approach, but the main default is snapshotting: you run a set of install commands and then snapshot, more or less, the file system.
30:55 Unshipped Features: Native Browser and Files App in Cursor Web
There was a native browser that ran locally. It was basically an iframe that, via port forwarding, could load the URL and talk to localhost in the VM, but it was unshipped. The remote desktop had sufficiently low latency and was more general purpose.
32:26 Files App Removal: Encouraging Delegation and New Interaction Patterns
There used to be the ability to see and edit files, but it was removed. By restricting and limiting what you could do there, people would naturally leave more to the agent and fall into this new pattern of delegating, which was thought to be really valuable.
34:01 DevTools and Cloud Agents: Full Stack Cursor and Debugging
Cursor starts from the DevTools and works its way toward Cloud Agents. That raises the question of whether there is a future with a full-stack Cursor, something like cursorapps.com where you host your Cursor site, which would basically be a Vercel clone.
Part 5: Model Strategy and Agent Architecture
35:52 Model Selection and Defaults: Balancing Choice and Expertise
The model selector sits at the bottom left. There is a desire to give people a choice across models, and Cursor does a very good job of exposing which model is being used and how to switch. There is also a desire to do more with defaults, where options can be suggested to people.
37:11 Agent Lab vs. Model Lab: Routers and Parallel Agents
Cursor is an example of an agent lab, building a new playbook that differs from a model lab's. Every agent lab is going to have a router. Best-of-N is a subset of parallel agents in which all the agents run on the same prompt.
39:04 Multi-Model Choice and Synergistic Output: The Council
In the dropdown picker, multiple models can be selected. There was an interesting learning that is relevant across model providers: a system that would run a set of best-of-N attempts and then pass the results through a synthesizer layer of models. There could be some benefit from having multiple top-tier models involved in a model swarm.
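The best-of-N plus synthesizer idea can be sketched in a few lines of Python. This is an illustrative sketch under loud assumptions: the three "models" are hard-coded stub functions rather than real provider calls, and the synthesizer is a simple majority vote swapped in for what the episode describes as a synthesizer layer of models.

```python
from collections import Counter
from typing import Callable

# A "model" here is any function from prompt to answer (hypothetical stubs).
Model = Callable[[str], str]


def best_of_n(prompt: str, models: list[Model]) -> list[str]:
    """Run the same prompt through every model and collect the candidates."""
    return [model(prompt) for model in models]


def synthesize(candidates: list[str]) -> str:
    """Toy synthesizer: majority vote, ties go to the earliest candidate.

    A real synthesizer layer would likely be another model call that reads
    all candidates; majority voting is only a stand-in.
    """
    counts = Counter(candidates)
    return max(candidates, key=lambda candidate: counts[candidate])


# Stub models standing in for different providers.
def model_a(prompt: str) -> str:
    return "use a mutex"


def model_b(prompt: str) -> str:
    return "use a channel"


def model_c(prompt: str) -> str:
    return "use a mutex"


def council(prompt: str) -> str:
    """Fan the prompt out to all stub models, then merge the answers."""
    return synthesize(best_of_n(prompt, [model_a, model_b, model_c]))
```

Calling `council("How should these workers share state?")` returns the answer two of the three stubs agree on. The interesting design question raised in the episode is the choice of base models, which in this sketch is simply the list passed to `best_of_n`.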
41:40 Sub-Agents: Collaboration and Delegation
Sub-agents are another way to get agents with different prompts, different goals, different models, and different vintages to work together, collaborate, and delegate.
42:33 Sub-Agent Functionality: Context Management and Task Interface
Sub-agents are great for context management in long-running threads, or when you simply want to throw more compute at something. There is a generic task interface through which the main agent defines what goes into the sub-agent.
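A generic task interface of this kind might look like the Python sketch below. Everything in it is hypothetical (`Task`, `TaskResult`, `log_searcher`, and `main_agent` are illustrative names, not Cursor's API); the point is only that the main agent decides what context to hand down and keeps just the summary that comes back.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Task:
    """What the main agent chooses to pass down to a sub-agent."""

    goal: str
    context: str


@dataclass(frozen=True)
class TaskResult:
    """What comes back: a short summary, not the sub-agent's full working state."""

    summary: str


SubAgent = Callable[[Task], TaskResult]


def log_searcher(task: Task) -> TaskResult:
    """Stub sub-agent: scans the provided context for error lines."""
    errors = [line for line in task.context.splitlines() if "ERROR" in line]
    return TaskResult(summary=f"{len(errors)} error line(s) found for: {task.goal}")


def main_agent(logs: str, sub_agent: SubAgent) -> list[str]:
    """Delegate the log scan and keep only the summary in the main context."""
    main_context: list[str] = []
    result = sub_agent(Task(goal="find failures", context=logs))
    main_context.append(result.summary)
    return main_context
```

In this sketch the raw logs never enter `main_context`, which is the context-management benefit: the long-running thread grows by one summary line per delegated task rather than by the full material the sub-agent had to read.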
Part 6: Scaling Intelligence and Throughput
44:47 Long Running Agents and Throughput: Building a Browser with a Society of Workers
Some of the experiments have found their way into a feature that is available in cloud agents now: the long-running agent mode. The Ralph Wiggum loop was floating around at the time, but a similar approach was also found independently and experimented with. What built the browser is a society of workers, planners, and different agents collaborating.
47:34 Throughput and Inference: The Scale of Systems Producing Code
Throughput is a really big deal: once you see a system of a hundred concurrent agents outputting thousands of tokens a second, you can't go back. The amount of inference that will be needed per developer is mind-boggling.
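To get a feel for that scale, a back-of-the-envelope calculation helps. The Python sketch below uses the hundred concurrent agents mentioned in the episode; the per-agent rate of 30 tokens per second is purely an assumed figure for illustration, not a number from the discussion.

```python
def fleet_tokens_per_second(agents: int, tokens_per_agent_per_second: float) -> float:
    """Aggregate output rate of a fleet of concurrent agents."""
    return agents * tokens_per_agent_per_second


def tokens_per_hour(tokens_per_second: float) -> float:
    """Convert a per-second token rate into tokens per hour."""
    return tokens_per_second * 3600


# 100 agents comes from the episode; 30 tokens/s per agent is an assumption.
fleet_rate = fleet_tokens_per_second(agents=100, tokens_per_agent_per_second=30)
hourly_output = tokens_per_hour(fleet_rate)
```

Under those assumptions the fleet emits 3,000 tokens per second, or 10.8 million tokens per hour, which is consistent with the "thousands of tokens a second" figure and shows why per-developer inference demand adds up quickly.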
49:24 Token Consumption and Leverage: The Value of AI Tools
There are no worries about developers losing their jobs, at least in the near term; there is so much to be built. With highly parallel agents running for a long time in their own VMs, people will be spending thousands of dollars a month per human.
51:03 Hiring and Agentic Engineering: Fundamentals and Decision-Making
Being great at the latest thing in AI coding is not necessarily a prerequisite. The fundamentals remain important in the current age, as does the ability to double-click down into the details. Models today still have weaknesses: if you let them run for too long without cleaning up and refactoring, the code gets sloppy and bad abstractions accumulate.
53:04 Context Switching and Parallelization: A New Way of Working
The ability to hop back and forth between threads really quickly is enabled by these new interfaces and this parallelism. With different desktops to hop between, you are not stuck thinking, "Oh, I checked out this branch."
Part 7: Future Outlook and Self-Optimizing Systems
55:20 Coding Tools and Productivity: The Future of Human Interaction
The coding tools start coming into conflict with productivity tools such as Linear and Kanban boards. OpenCloud is extremely mind-expanding in terms of what can happen.
57:07 Industry Shifts and Predictions: The Future of Agents
Other industries are going to start going through what software development has started going through. Agents are going to keep getting better, and people are going to stop doing as much manual coding.
58:46 Voice Coding and Siri Voice: The Future of AI Interaction
Voice coding is always considered the hardest part, because you have to say technical things where spelling and capitalization matter, all by voice. People would use their Siri voice, talking in short, stilted sentences and enunciating very clearly.
1:00:07 Cloud Agents vs. Local Agents: The Crossover and Hard Things
It will take longer than people think, and longer than we think, for cloud agents working in their own boxes to surpass local agents. Getting those sandboxes to be really good is hard. As these agents run in the cloud and become more autonomous, lack of memory is an issue.
1:02:30 Dynamic File Context and Self-Auditability: Promising Memory Solutions
The dynamic file system stuff is probably very promising for memory. There is also this notion of needing to have the agent be a little bit more self-aware in terms of being able to identify gaps in its own functionality and decide how to fill them.
1:04:18 Self-Awareness and System Prompts: Optimizing Agents
When the model starts editing its own system prompt, what does that even mean? All of this self-awareness is not like the model itself having a notion of consciousness, but more like knowing what system it's operating in and the constraints of that system and potentially being able to have agency in optimizing itself to operate best in the system.