r/LocalLLaMA

Senior SWE Bench: a new benchmark focussed on realistically underspecified feature tasks

submitted by /u/jordo45