AI Daily
Models • April 25, 2026

Claude Opus 4.7 Makes a Quiet Leap on Vision and Code

By AI Daily Editorial • April 25, 2026

Anthropic released Claude Opus 4.7 on April 16 with relatively little fanfare, which is unusual given how much it actually changed. The model's headline improvements are in coding and vision, two areas where the gap between Opus 4.7 and its predecessor is not marginal but dramatic. On Anthropic's internal visual-acuity benchmark, Opus 4.7 scores 98.5 percent. Opus 4.6 scored 54.5 percent. That is not iteration; it is a different capability class.

The vision improvement is anchored in a technical change: Opus 4.7 accepts images up to 2,576 pixels on the long edge, more than three times the resolution of previous Claude models. In practice this means the model can work with high-resolution diagrams, detailed schematics, and dense visual layouts without losing information to compression. For teams doing document analysis, computer vision pipelines, or pixel-accurate reference work, that matters considerably.
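For teams preparing images for upload, the practical question is when a source image needs downscaling. Here is a minimal sketch that fits an image's dimensions under the 2,576-pixel long-edge limit the article cites; the helper name, the rounding choice, and the example dimensions are illustrative assumptions, not part of Anthropic's tooling.

```python
# Compute the largest size that fits a long-edge limit while preserving
# aspect ratio. MAX_LONG_EDGE reflects the limit stated in the article.
MAX_LONG_EDGE = 2576

def fit_to_long_edge(width: int, height: int, limit: int = MAX_LONG_EDGE) -> tuple[int, int]:
    """Return (width, height) scaled down so the longer side is <= limit."""
    long_edge = max(width, height)
    if long_edge <= limit:
        return width, height  # already within the limit; no resampling needed
    scale = limit / long_edge
    return round(width * scale), round(height * scale)

# A 4K schematic (3840x2160) gets scaled to the long-edge cap:
print(fit_to_long_edge(3840, 2160))  # -> (2576, 1449)
```

The point of resizing client-side is control: you choose the resampling filter and verify nothing legible is lost, rather than relying on whatever compression happens server-side.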

On code, the improvement is more nuanced but arguably more consequential. Opus 4.7 improves its task-resolution rate by 13 percent on an internal 93-task benchmark, and resolves three times as many production tasks as Opus 4.6 on SWE-bench, the industry's standard measure for real software engineering work. The model also makes 21 percent fewer errors on enterprise document reasoning tasks and reaches state-of-the-art on Anthropic's Finance Agent evaluation, scoring 0.813 against 4.6's 0.767. The common thread across these gains is the model's ability to handle complex, long-running tasks more consistently, verifying its own outputs before reporting rather than racing to a response.

Pricing stays flat at $5 per million input tokens and $25 per million output, which gives developers a meaningful upgrade at the same cost. There is a catch in the tokenizer: Anthropic updated it alongside the model, so the same inputs may consume 1.0 to 1.35 times as many tokens, depending on content type. For high-volume workloads, that adds up and is worth factoring into cost projections before migration.
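The tokenizer change turns a flat-price upgrade into a range, so a quick back-of-envelope check is worth running. This sketch uses the published rates ($5/M input, $25/M output) and applies the stated 1.0-1.35x multiplier to input tokens, where the article quotes it; the workload numbers are made-up placeholders.

```python
# Dollars per token at the article's published rates.
INPUT_RATE = 5.00 / 1_000_000
OUTPUT_RATE = 25.00 / 1_000_000

def monthly_cost(input_tokens: int, output_tokens: int, tokenizer_multiplier: float = 1.0) -> float:
    """Estimated monthly spend; the multiplier models the new tokenizer's
    inflation of input token counts for the same content."""
    return (input_tokens * tokenizer_multiplier * INPUT_RATE
            + output_tokens * OUTPUT_RATE)

# Hypothetical workload: 2B input / 400M output tokens per month.
base = monthly_cost(2_000_000_000, 400_000_000, 1.0)    # -> 20000.0
worst = monthly_cost(2_000_000_000, 400_000_000, 1.35)  # -> 23500.0
print(base, worst)
```

On this (invented) workload the tokenizer alone swings the bill by $3,500 a month, which is the kind of delta that should appear in a migration plan rather than a surprise invoice.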

A new "xhigh" effort level sits between the existing "high" and "max" options, giving developers finer control over how hard the model works on a given task. That kind of granularity matters for agentic workflows where you want the model to push hard on specific steps without running the whole session at maximum cost.
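The appeal of per-step effort control is easiest to see in a routing sketch. Everything below is a hypothetical illustration: the request shape, the `pick_effort` heuristic, and the model identifier are assumptions, not Anthropic's actual API; only the effort level names ("high", "xhigh", "max") come from the article.

```python
# Hypothetical per-step effort routing for an agentic loop: spend extra
# compute only on the steps that need it, keeping the rest at "high".
EFFORT_LEVELS = ("high", "xhigh", "max")  # names from the article

def pick_effort(step: str) -> str:
    """Crude routing heuristic (an assumption): escalate only hard steps."""
    hard_steps = {"plan_refactor", "debug_failure"}
    return "xhigh" if step in hard_steps else "high"

def build_request(step: str, prompt: str) -> dict:
    """Assumed request shape for illustration; not the real API schema."""
    return {
        "model": "claude-opus-4-7",   # assumed model id
        "effort": pick_effort(step),  # per-step control, not session-wide "max"
        "messages": [{"role": "user", "content": prompt}],
    }

print(build_request("debug_failure", "Why does this test fail?")["effort"])  # -> xhigh
```

The design point is the same one the article makes: without an intermediate level, the only way to push hard on one step is to run the whole session at maximum cost.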

There is context worth noting: Anthropic's own documentation describes Opus 4.7 as "less broadly capable than Claude Mythos Preview," suggesting the company has a more powerful model in limited preview. Opus 4.7 sits between its predecessor and a frontier that is not yet publicly available. On the Terminal-Bench 2.0 agentic coding test, it scores 69.4 percent against GPT-5.5's 82.7 percent, a gap that is real. But on Humanity's Last Exam, which tests abstract reasoning without tools, Opus 4.7 scores 46.9 percent, ahead of GPT-5.5's 43.1 percent. The two models are strong in different places, and depending on what you are building, the distinction matters.