Alignment Forum
GDM AI Control Roadmap
GDM has published an AI Control Roadmap! From the executive summary:

We present the GDM AI Control Roadmap (v0.1) – our plan for implementing and adopting internal guardrails designed to catch potential adversarial behaviour by AI agents, even as they become increasingly harder to oversee and contain.

We focus on system-