LessWrong AI June 25, 2026 · Communities

Exploring Generalization in NLAs

Recently, I was reading Anthropic's paper on NLAs[1], and as someone who works on steering, I found it interesting and thought-provoking. In this post I would like to go through my reproduction and some of the experiments I ran on them.

Training and Architecture

I'm going to touch only a little on architecture here becau
