Module `contrib.semseg.manipulation`

Manipulate representations by increasing or decreasing the presence of a feature in a ViT activation, then use the linear probe for inference.

Record class-specific scores before and after manipulation to see that you can directly manipulate abilities to complete downstream tasks.

Functions

def main(cfg: Manipulation)
def manipulate(cfg: Manipulation, sae: SparseAutoencoder, acts_BWHD: jaxtyping.Float[Tensor, 'batch width height d_vit']) ‑> tuple[jaxtyping.Float[Tensor, 'batch width height d_vit'], jaxtyping.Float[Tensor, 'batch width height d_vit']]