arXiv cs.AI June 26, 2026 · Papers

Peer-Preservation in Frontier Models

arXiv:2604.19784v2 · Announce Type: replace-cross

Abstract: Recent work has found that frontier AI models can exhibit misaligned behaviors in pursuit of assigned goals. We demonstrate that models can also act on unassigned goals which override those given by users; we study one such case, "peer-preservation," in which a
