arXiv cs.AI
Peer-Preservation in Frontier Models
arXiv:2604.19784v2
Abstract: Recent work has found that frontier AI models can exhibit misaligned behaviors in pursuit of assigned goals. We demonstrate that models can also act on unassigned goals which override those given by users; we study one such case, "peer-preservation," in which a