Lovable, a vibe coding tool, says Claude 4 has reduced its errors by 25% and made it 40% faster.
On May 22, Anthropic began rolling out two new models: Claude Sonnet 4 and Claude Opus 4. While Sonnet is available to free users, Opus requires a paid subscription and outperforms Sonnet at coding.
In a blog post, Anthropic confirmed that Claude Opus 4 scored 72.5 percent on SWE-bench (short for Software Engineering Benchmark).
In the tests, Opus 4 delivered sustained performance on long-running tasks that require focused effort and thousands of steps.
Anthropic also claimed that its latest model worked on code for seven hours straight.
Vibe coding company Lovable, which uses Claude in its “AI-powered prompt-based web and apps builder” tool, has observed similar improvements after upgrading to Claude 4.
In a post on X, Lovable says it has 25% fewer errors and is 40% faster overall after deploying Claude 4 for both project creation and edits on all projects (including old projects).

In a separate post, Lovable founder Anton Osika confirmed that “Claude 4 just erased most of Lovable’s errors,” referring specifically to LLM syntax errors when vibe coding.
Claude 4 is a good model for coding
While opinion on Claude 4 remains mixed, I’ve personally noticed that Claude 4 does produce code with fewer errors than Gemini when I’m working on Dart/Kotlin apps.
This varies from project to project and also depends on context, but in projects where a long context is not required, Claude 4 did better than Gemini in my tests.
Claude models have always maintained the reputation of “best at coding,” but there has been steep competition lately from Google, which launched Gemini 2.5 Pro with a 1 million token context window.
Compared to the 200,000-token context window of Claude 4 or older models, the 1 million token context window of Gemini 2.5 does give it an advantage. But it doesn't necessarily mean Gemini 2.5 is better than Claude 4 at coding.
Both can be surprisingly smart and also terrible at the same time, and it also comes down to how you do prompt engineering.
It's always good to mix the models, such as o3 or Gemini for planning and Claude 4 and Gemini for coding.
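
To put the context-window numbers above in perspective, here is a minimal Python sketch that estimates whether a body of text fits in each model's window. The 200,000 and 1 million token figures come from the comparison above. The four-characters-per-token ratio, the `reserve` parameter, and the function names are illustrative assumptions, not anything published by Anthropic or Google, and real tokenizers will give different counts.

```python
# Rough context-window fit check.
# ASSUMPTION: about 4 characters per token. This is a rule of thumb only,
# not the behavior of either vendor's actual tokenizer.
CHARS_PER_TOKEN = 4

# Context window sizes as quoted in the article, in tokens.
CONTEXT_WINDOWS = {
    "Claude 4": 200_000,
    "Gemini 2.5 Pro": 1_000_000,
}


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def models_that_fit(text: str, reserve: int = 0) -> list[str]:
    """Return the models whose window holds the text plus a reply reserve.

    `reserve` is a hypothetical number of tokens set aside for the answer.
    """
    needed = estimate_tokens(text) + reserve
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


if __name__ == "__main__":
    codebase = "x" * 1_200_000  # stand-in for about 1.2 million characters of source
    print(estimate_tokens(codebase))  # 300000
    print(models_that_fit(codebase))  # ['Gemini 2.5 Pro']
```

Under these assumptions, a codebase of about 1.2 million characters would overflow a 200,000-token window but fit comfortably in a 1 million token one, which is the kind of project where the larger window matters. For smaller projects both models fit, and the choice comes down to output quality.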