saev
saev is a package for training sparse autoencoders (SAEs) on vision transformers (ViTs) in PyTorch. It also includes an interactive webapp for looking through a trained SAE's features.
API reference docs are listed below, and the source code is available on GitHub.
My logbook is a set of notes that might also be useful.
Package Docs
- saev
Package to train SAEs for vision models.
- probing
Package for probing for individual features in trained SAEs.
- faithfulness
Package for measuring SAE feature faithfulness through feature manipulation.