arXiv cs.NE
Distributed Quality-Diversity Search for Toxicity in Large Language Models
arXiv:2606.24166v1 Announce Type: new Abstract: Large Language Models remain vulnerable to adversarial prompts that elicit harmful responses, and scaling red-teaming to cover a broad range of failure modes is constrained by the cost of text generation and evaluation. We present ToxSearch-S, a speciated extension