METR
Many SWE-bench-Passing PRs Would Not Be Merged into Main
Summary: We find that roughly half of test-passing SWE-bench Verified PRs written by mid-2024 to mid/late-2025 agents would not be merged into main by repo maintainers, even after adjusting for noise in maintainer merge decisions. Since the agents are not given a ch