r/LocalLLaMA June 29, 2026 · Communities

Is it ever possible to have a malicious LLM with a backdoor?

I was just brainstorming about whether an LLM could be trained to behave normally until it recognizes a specific secret sentence, which then unlocks backdoored, malicious behavior. At first glance this sounds very possible to me. Don't get me wrong, the risk is relevant for ALL LLMs (closed & open ones), as lo
