OpenAI’s GPT-5.2 Better Than Claude Opus 4.5 for Long Autonomous Tasks, Says Cursor

Cursor says it has found OpenAI’s GPT-5.2 models to be significantly more reliable than Anthropic’s Claude Opus 4.5 for long-running, autonomous coding tasks.

On the same day, Cursor also made the GPT-5.2 model available on its platform.

The finding emerged when the team set out to build a web browser from scratch using Cursor. CEO Michael Truell said on X that the browser’s rendering engine was written from the ground up in Rust, with support for HTML parsing, CSS cascade and layout, text shaping, painting, and a custom JavaScript virtual machine.

“It kind of works,” Truell wrote. “It still has issues and is, of course, very far from WebKit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly.”

Cursor has released the code on GitHub.

Watch a timelapse of GPT-5.2 building a browser!
Both websites start out barely working, and then after millions of lines of code, the browser actually works.
Pretty cool experiment with long-running agents. https://t.co/KT6HgHEivA pic.twitter.com/8EPbyih8cU

— Lee Robinson (@leerob) January 14, 2026

In a research blog post published this week, Cursor described the browser as part of a broader effort to test whether autonomous coding agents can scale to projects “that typically take human teams months to complete.”

Describing what it learned while building the browser, Cursor wrote, “We found that GPT-5.2 models are much better at extended autonomous work: following instructions, keeping focus, avoiding drift, and implementing things precisely and completely.”

By contrast, “Opus 4.5 tends to stop earlier and take shortcuts when convenient, yielding back control quickly,” Cursor said.

Other long-running experiments include a multi-week, in-place migration of Cursor’s own codebase from Solid to React, with 266,000 lines added and 193,000 removed; a Java Language Server Protocol project with 7,400 commits and 550,000 lines of code; a Windows 7 emulator exceeding 1.2 million lines; and an Excel-like system reaching 1.6 million lines.

In another case, Cursor said a long-running agent rewrote a video-rendering pipeline in Rust, making it “25× faster” while also adding smooth zooming, panning, and motion-blur effects.

The post OpenAI’s GPT-5.2 Better Than Claude Opus 4.5 for Long Autonomous Tasks, Says Cursor appeared first on Analytics India Magazine.