Excitation Pullbacks - looking behind the curtain of gradient noise

We present a potent explanation method for (pretrained) Deep Neural Networks, the Excitation Pullback, which is a simple modification of the vanilla gradient. For details, check out our paper and its corresponding code repository. Note that there is still a lot of room for improvement, as we use just a single architectural hyperparameter (temp) shared across all hidden neurons (ideally, it would be adjusted for each neuron independently).
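
As a rough illustration of what "modifying the vanilla gradient" might look like in code, the sketch below keeps the ReLU forward pass but replaces the hard 0/1 gate in the backward pass with a soft, temperature-controlled gate. The module name `SoftGateReLU` and the sigmoid gating rule are assumptions made for illustration (consistent with the temperature limits described further down); consult the paper for the exact definition of the excitation pullback.

```python
import torch
import torch.nn as nn

class SoftGateReLU(nn.Module):
    """Illustrative sketch (assumption, not the paper's exact formulation):
    standard ReLU forward pass, but the backward pass gates incoming
    gradients with sigmoid(pre-activation / temp) instead of the hard
    (pre-activation > 0) mask. A single `temp` is shared by all neurons,
    mirroring the single-hyperparameter setup described above."""

    class _SoftGateFn(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, temp):
            ctx.save_for_backward(x)
            ctx.temp = temp
            return torch.relu(x)

        @staticmethod
        def backward(ctx, grad_out):
            (x,) = ctx.saved_tensors
            gate = torch.sigmoid(x / ctx.temp)  # soft gate replaces (x > 0)
            return grad_out * gate, None

    def __init__(self, temp: float = 1.0):
        super().__init__()
        self.temp = temp

    def forward(self, x):
        return self._SoftGateFn.apply(x, self.temp)
```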

Select an input image - either a sample from the Imagenette dataset, a predefined example, or your own upload. Square images are resized to 224x224 pixels; other images are first resized to 256x256 pixels and then center-cropped to 224x224 pixels.
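
For reference, the preprocessing described above corresponds to standard torchvision transforms. The sketch below is a minimal reconstruction under that assumption, not necessarily the demo's exact pipeline.

```python
import torch
from PIL import Image
from torchvision import transforms

def preprocess(img: Image.Image) -> torch.Tensor:
    """Resize as described above: square images go straight to 224x224,
    non-square images are resized to 256x256 and center-cropped to 224x224."""
    if img.width == img.height:
        tf = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
        ])
    else:
        tf = transforms.Compose([
            transforms.Resize((256, 256)),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
        ])
    return tf(img.convert("RGB"))
```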

Example images from Imagenette val (corresponding to Example classes)

Select a target class and generate a (counterfactual) explanation - a perturbation toward that class along the excitation pullback (via Projected Gradient Ascent). Very low temperature approximates vanilla gradients, while very high temperature linearizes the model.
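
Conceptually, the generation step is projected gradient ascent on the target-class logit, with the vanilla gradient swapped for the excitation pullback. The sketch below is a hedged outline: `pullback_fn` is a hypothetical stand-in for the direction computed by the method (e.g., the gradient taken through a soft-gated model as sketched above), and the step size, number of steps, and L-infinity projection radius are illustrative defaults, not the demo's settings.

```python
import torch

def vanilla_gradient(model, x, target_class):
    """Low-temperature limit: plain gradient of the target logit."""
    x = x.detach().requires_grad_(True)
    logit = model(x.unsqueeze(0))[0, target_class]
    grad, = torch.autograd.grad(logit, x)
    return grad

def pga_explanation(model, x, target_class, pullback_fn=vanilla_gradient,
                    steps=20, step_size=0.01, eps=0.1):
    """Projected Gradient Ascent toward `target_class`.

    `pullback_fn(model, x, target_class)` should return the ascent direction
    for a single image tensor `x` of shape (3, 224, 224); for excitation
    pullbacks this would be the gradient of the target logit taken through
    the temperature-gated backward pass."""
    x0 = x.detach().clone()
    x_adv = x0.clone()
    for _ in range(steps):
        direction = pullback_fn(model, x_adv, target_class)
        with torch.no_grad():
            # Ascent step along the normalized direction; the exact step
            # rule is an illustrative choice, not the demo's.
            x_adv = x_adv + step_size * direction / (direction.norm() + 1e-12)
            # Project back into an L-infinity ball around the original image
            # and clamp to the valid pixel range.
            x_adv = x0 + (x_adv - x0).clamp(-eps, eps)
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv
```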

Target Class (ImageNet)

idx - class name

Model

ImageNet-pretrained ReLU model

Example classes (corresponding to Example images + "ostrich")