News
Claude Opus 4.1 scores 74.5% on the SWE-bench Verified benchmark, indicating major improvements in real-world programming, bug detection, and agent-like problem solving.
Explore Claude Opus 4.1, Anthropic’s groundbreaking new AI model with advanced coding, multilingual, and problem-solving capabilities. Opus AI ...
OpenAI CEO Sam Altman compared GPT-5 to having instant access to a group of PhD-level experts. “People are limited by ideas, ...
Anthropic says Claude Opus 4.1 improves software engineering accuracy to 74.5%. That compares to 62.3% with Claude Sonnet 3.7 ...
Anthropic launched Claude Opus 4.1 today, an upgraded version of its flagship AI model that achieves 74.5% accuracy on ...
The company has unveiled two new open-weight language models, gpt-oss-120b and gpt-oss-20b, marking the firm's first public ...
Anthropic's newly released AI models, Claude Opus 4 and Claude Sonnet 4, showed many concerning behaviors, which led the company to step up its safety measures, the report said.
When placed in a fictional scenario as an AI at a pretend company, Claude Opus 4 had access to the email system and found a message stating it would be replaced with a new model. In this case ...