EleutherAI December 12, 2024 · Open Source

SAEs trained on the same data don’t learn the same features

In this post, we show that when two TopK SAEs are trained on the same data, with the same batch order but different random initializations, many latents in the first SAE have no close counterpart in the second, and vice versa. Indeed, when training, only about 53% of the features are shared. Furt
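As a rough illustration of what "a close counterpart" could mean, the sketch below counts a latent as shared when it and a latent in the other SAE are each other's nearest neighbor by cosine similarity of their decoder directions, and that similarity clears a threshold. This is only one possible criterion under stated assumptions: the function names, the mutual-nearest-neighbor rule, and the 0.7 threshold are hypothetical choices for illustration, not necessarily the matching procedure used in the post.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def shared_fraction(dec_a, dec_b, threshold=0.7):
    """Fraction of latents in dec_a that have a close counterpart in dec_b.

    dec_a, dec_b: lists of decoder direction vectors, one per latent.
    threshold: assumed similarity cutoff (hypothetical value).
    """
    # Pairwise similarities: sims[i][j] compares latent i of A with latent j of B.
    sims = [[cosine(u, v) for v in dec_b] for u in dec_a]

    # Nearest neighbor in B for every latent in A, and the reverse.
    best_b_for_a = [max(range(len(dec_b)), key=lambda j: row[j]) for row in sims]
    best_a_for_b = [
        max(range(len(dec_a)), key=lambda i: sims[i][j]) for j in range(len(dec_b))
    ]

    # A latent counts as shared only if the match is mutual and similar enough.
    shared = sum(
        1
        for i, j in enumerate(best_b_for_a)
        if best_a_for_b[j] == i and sims[i][j] >= threshold
    )
    return shared / len(dec_a)


# Toy decoders: two latents line up closely, the third has no clear partner.
dec_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
dec_b = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.6, 0.5, 0.55]]
print(shared_fraction(dec_a, dec_b))
```

On real SAEs with many thousands of latents, the pairwise similarity matrix would normally be computed with a batched matrix product rather than nested Python lists; the pure-Python version here just keeps the criterion easy to read.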
