Anthropic’s Smaller Claude Model Improves Agents, Reduces Regulatory Risk

Days after the U.S. government partially lifted the export ban on Mythos 5, Anthropic released a new model that is smaller, cheaper, safer and, most importantly, not likely to be blocked by the administration.
Anthropic launched Sonnet 5, the mid-sized version of its Claude models, with better agentic capabilities than its predecessor and at a lower cost than the current flagship Opus 4.8 model. The company said Sonnet 5 “is built to be the most agentic Sonnet model yet” and can access browser and terminal tools, make plans and run autonomously at a level previously possible only with larger, more expensive models.
“More recently, the clearest gains in agentic capabilities have been in our Opus-class models,” the company said. “Sonnet 5 narrows the gap: its performance is close to that of Opus 4.8, but at lower prices.”
Based on Anthropic’s benchmark testing, Sonnet 5 scored 63.2% in SWE-Bench Pro’s agentic coding test and 80.4% in Terminal-Bench 2.1, compared to 58.1% and 67%, respectively, for Sonnet 4.6. In comparison, Opus 4.8 notched a 69.2% in SWE-Bench Pro. Sonnet 5 is best suited to quicker agentic tasks and to organizations that want to balance cost with capability.
Early testers said Sonnet 5 completes complex tasks and checks its own inputs without prompting, which previous Sonnet versions had difficulty doing. On a cost-per-task basis, the model showed strong capabilities even at its minimum compute effort. This means that organizations running Sonnet 5 at a high compute effort can come close to the performance of Opus 4.8 at minimal effort.
The Sonnet size of Claude models was always meant to be a more balanced option for users. Haiku, the smallest size, was designed for quick, low-latency actions. Opus, the flagship model, works best for longer, more complicated tasks that require strong reasoning capabilities. Opus was the largest model available from Anthropic until the development of Mythos and, eventually, Fable 5.
Because of its smaller size, Sonnet 5 is unlikely to face the same level of scrutiny from the Trump administration. The White House, by executive order, requested that frontier model labs voluntarily offer their models for safety testing. So far, the administration has made requests to limit deployment only for larger models like Mythos 5, Fable 5, and, more recently, OpenAI’s GPT-5.6.
Anthropic’s pre-deployment safety evaluations showed strong improvements in Sonnet 5’s agentic safety. The model is better at rejecting malicious requests and resisting prompt injection attempts. It also has lower hallucination rates and cooperates less with prompted misuse and deceptive behaviors.
But Sonnet 5 does not match the cybersecurity capabilities of Opus 4.8 or the heavily security-focused Mythos 5. It can perform routine cyber tasks, but Anthropic said it cannot produce a full working exploit. Because Sonnet 5 is more knowledgeable than Sonnet 4.6, it still showed partial success, which the company attributed to “improvements in general intelligence rather than specific training.”
As a result, Anthropic said it enabled cyber safeguards that block activities such as mass data exfiltration, vulnerability exploitation and offensive security tooling development.
“These safeguards – which detect and block dangerous cyber usage in real time – are the same as those present in Claude Opus 4.7 and 4.8 (because we judged that the overall level of cybersecurity risk from Sonnet 5 was low, the safeguards are less strict than those launched with Fable 5, which block a much wider range of cybersecurity tasks),” Anthropic said.
Sonnet 5 is available at an introductory price of $2 per million input tokens and $10 per million output tokens. After Aug. 31, the price will rise to $3 per million input tokens and $15 per million output tokens. Even then, it remains cheaper than Opus 4.8, which costs $5 per million input tokens and $25 per million output tokens.
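To make the price gap concrete, the quoted per-token rates can be turned into a rough cost estimate for a given workload. The following is a minimal sketch in Python that uses only the prices cited above; the tier labels, the `cost_usd` helper and the sample token counts are illustrative assumptions, not part of any Anthropic tooling, and the estimate ignores any discounts or other billing details the article does not cover.

```python
# Prices in USD per million tokens (input, output), as quoted in the article.
# The tier names are hypothetical labels chosen for this sketch.
PRICES = {
    "sonnet5_intro": (2.0, 10.0),     # introductory pricing through Aug. 31
    "sonnet5_standard": (3.0, 15.0),  # pricing after Aug. 31
    "opus48": (5.0, 25.0),
}


def cost_usd(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload from its input and output token counts."""
    in_price, out_price = PRICES[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens, 500,000 output tokens.
    for tier in PRICES:
        print(f"{tier}: ${cost_usd(tier, 2_000_000, 500_000):.2f}")
```

For that sample workload, the estimate comes to $9.00 at the introductory Sonnet 5 rate, $13.50 at the standard rate and $22.50 on Opus 4.8.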
