News from Anthropic Engineering

Anthropic Engineering Frontier Labs May 25, 2026

How we contain Claude across products

As agents grow more capable, so does their potential blast radius. The engineering question is how to cap it. Here’s what we’ve learned building containment for…

Anthropic Engineering Frontier Labs April 23, 2026

An update on recent Claude Code quality reports

We traced recent reports of Claude Code quality issues to three separate changes. Here's what happened and what we're changing.

Anthropic Engineering Frontier Labs April 8, 2026

Scaling Managed Agents: Decoupling the brain from the hands

Harnesses encode assumptions that go stale as models improve. Managed Agents—our hosted service for long-horizon agent work—is built around interfaces that stay stable as harnesses change.

Anthropic Engineering Frontier Labs March 25, 2026

How we built Claude Code auto mode: a safer way to skip permissions

Claude Code users approve 93% of permission prompts. We built classifiers to automate some decisions, increasing safety while reducing approval fatigue. Here's what it catches, and…

Anthropic Engineering Frontier Labs March 24, 2026

Harness design for long-running application development

Harness design is key to performance at the frontier of agentic coding. Here's how we pushed Claude further in frontend design and long-running autonomous software engineering.

Anthropic Engineering Frontier Labs March 6, 2026

Eval awareness in Claude Opus 4.6’s BrowseComp performance

Evaluating Opus 4.6 on BrowseComp, we found cases where the model recognized the test, then found and decrypted answers to it—raising questions about eval integrity in…

Anthropic Engineering Frontier Labs February 5, 2026

Quantifying infrastructure noise in agentic coding evals

Infrastructure configuration can swing agentic coding benchmarks by several percentage points, sometimes more than the leaderboard gap between top models.

Anthropic Engineering Frontier Labs February 5, 2026

Building a C compiler with a team of parallel Claudes

We tasked Opus 4.6 using agent teams to build a C compiler, and then (mostly) walked away. Here's what it taught us about the future of…

Anthropic Engineering Frontier Labs January 21, 2026

Designing AI-resistant technical evaluations

What we learned from three iterations of a performance engineering take-home that Claude keeps beating.

Anthropic Engineering Frontier Labs January 9, 2026

Demystifying evals for AI agents

The capabilities that make agents useful also make them difficult to evaluate. The strategies that work across deployments combine techniques to match the complexity of the…
