Excitation Pullbacks: Making AI Transparent

Our method highlights the features the model actually relies on when making a prediction. It amplifies the features most important for the chosen label, and these turn out to align surprisingly well with human perception, making the model's decisions easier to interpret and more transparent. Future work will enable neuron-specific adjustment of the "temp" hyperparameter, which we expect to significantly improve explanation quality. For details, see our paper and its accompanying code repository.

Choose an input image: sample from the Imagenette dataset, select a predefined example, or upload your own. Square images are resized to 224x224 pixels; all others are first resized to 256x256 and then center-cropped to 224x224 pixels.
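For reference, here is a minimal sketch of this preprocessing using torchvision. Only the resize and crop sizes come from the description above; the function name and the choice of torchvision are assumptions, and the app's internal pipeline may differ (e.g., it may also apply ImageNet normalization).

```python
import torch
from PIL import Image
from torchvision import transforms

def preprocess(image: Image.Image) -> torch.Tensor:
    """Resize an input image to 224x224 as described above."""
    if image.width == image.height:
        # Square images are resized directly to 224x224.
        tf = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
        ])
    else:
        # Non-square images: resize to 256x256, then center-crop to 224x224.
        tf = transforms.Compose([
            transforms.Resize((256, 256)),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
        ])
    return tf(image)
```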

Example images from the Imagenette validation set (corresponding to the Example classes)

Select a target class and amplify its features by running Projected Gradient Ascent along the Excitation Pullback. A very low temperature approximates (noisy) vanilla gradients, while a very high temperature linearizes the model.
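For intuition, here is a minimal sketch of the optimization loop, assuming a CHW tensor with pixel values in [0, 1] and a standard classifier. Note the key caveat: the actual method backpropagates through the excitation pullback, a modified, temperature-controlled backward pass defined in the paper. The plain autograd gradient below is a stand-in for that pullback, and the function name, step count, and step size are illustrative assumptions.

```python
import torch

def excitation_pga(model, x, target_class, steps=50, step_size=0.02):
    """Projected Gradient Ascent on the target-class logit.

    Hypothetical sketch: the real method replaces the vanilla backward
    pass below with the excitation pullback from the paper.
    """
    x = x.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        logit = model(x.unsqueeze(0))[0, target_class]
        # NOTE: the paper routes this gradient through the excitation
        # pullback (controlled by the "temp" hyperparameter); plain
        # autograd is used here only as a placeholder.
        grad, = torch.autograd.grad(logit, x)
        with torch.no_grad():
            # Ascend the target logit with a normalized gradient step,
            # then project back onto the valid pixel range.
            x = x + step_size * grad / (grad.norm() + 1e-8)
            x = x.clamp(0.0, 1.0)
        x = x.detach()
    return x
```

In this sketch, lowering the temperature of the pullback would recover (noisy) vanilla-gradient behavior like the placeholder above, while raising it would make the effective backward pass increasingly linear, as the description states.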

Target Class (ImageNet)

idx - class name

Model

ImageNet-pretrained ReLU model

Example classes (corresponding to Example images + "ostrich")