DeepSeek R1 Matches Gemini, Claude in Coding Benchmark Performance

The updated R1 model from DeepSeek achieves top-tier coding performance, tying Gemini-2.5 and Claude Opus 4 in benchmark speed and accuracy.
Chinese AI startup DeepSeek's updated R1 reasoning model tied for first place with Google's Gemini-2.5 and Anthropic's Claude Opus 4 in the WebDev Arena coding competition.
The three models posted closely clustered scores in the real-time benchmark test. DeepSeek's R1-0528 update, released in late May on Hugging Face, scored 1,408.84 points. Claude Opus 4 scored 1,405.51 points, while Gemini-2.5 reached 1,433.16 points on the WebDev Arena leaderboard.
The platform ranks large language models based on speed and accuracy when solving coding tasks. Human evaluators assess each model's output quality in the WebDev Arena tests. They award points based on correctness and efficiency of the generated code solutions.
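To put the reported scores in perspective, the short Python sketch below computes the point gaps between the three models. It also estimates what those gaps would mean head-to-head, but only under an assumption the article does not confirm: that the points behave like Elo-style ratings on a 400-point logistic scale. The function names are illustrative and do not come from WebDev Arena.

```python
# Scores exactly as reported for the WebDev Arena leaderboard.
scores = {
    "Gemini-2.5": 1433.16,
    "DeepSeek R1-0528": 1408.84,
    "Claude Opus 4": 1405.51,
}


def point_gap(a: str, b: str) -> float:
    """Return how many points model `a` leads model `b` by."""
    return round(scores[a] - scores[b], 2)


def assumed_elo_win_rate(a: str, b: str) -> float:
    """Estimate how often `a` would be preferred over `b`.

    ASSUMPTION: scores are Elo-style ratings on a 400-point logistic
    scale. The article does not state the scoring formula.
    """
    diff = scores[a] - scores[b]
    return 1.0 / (1.0 + 10 ** (-diff / 400))


# Order the models from highest to lowest reported score.
ranking = sorted(scores, key=scores.get, reverse=True)

if __name__ == "__main__":
    for leader, trailer in zip(ranking, ranking[1:]):
        gap = point_gap(leader, trailer)
        rate = assumed_elo_win_rate(leader, trailer)
        print(f"{leader} leads {trailer} by {gap} points "
              f"(assumed preference rate: {rate:.1%})")
```

On these figures, Gemini-2.5 leads R1-0528 by 24.32 points and R1-0528 leads Claude Opus 4 by 3.33 points. Under the Elo-style assumption, gaps this small translate to preference rates only slightly above 50%, which is consistent with the leaderboard treating the three models as tied.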
Despite being labeled a “minor upgrade,” the R1-0528 version delivered significant enhancements. According to DeepSeek, the update brought “enhanced reasoning performance” and reduced hallucinations (instances of incorrect or misleading information) by approximately 45-50%.
DeepSeek first introduced its R1 reasoning model in January 2025, positioning it as a cost-effective alternative to resource-intensive offerings from Google and Anthropic. R1 has maintained top-tier benchmark results since launch, despite training costs significantly lower than those of larger competitors.
The company uses an open-source approach, allowing developers to download, modify, and contribute to R1’s code on Hugging Face, which drove rapid adoption among developers. The strategy aligns with a broader trend in China, where tech giants like Baidu are increasingly endorsing open development practices over closed, proprietary models.
DeepSeek is developing a next-generation R2 model but has not announced a release date.
The content on The Coinomist is for informational purposes only and should not be interpreted as financial advice. While we strive to provide accurate and up-to-date information, we do not guarantee the accuracy, completeness, or reliability of any content. Nor do we accept liability for any errors or omissions in the information provided or for any financial losses incurred as a result of relying on this information. Actions based on this content are at your own risk. Always do your own research and consult a professional. See our Terms, Privacy Policy, and Disclaimers for more details.