LessWrong AI
Some subtypes of taskishness / corrigibility
"Corrigibility" is somewhat of an overloaded term in alignment - it points in the direction of a cluster of desirable properties, but different people have different ideas of what this entails.I think of "corrigibility", as it is used, to cover a few different ideas. I will name some of these and sort them roughly in o