WS 2017/18

Introduction to Deep Learning

 

Assignment 8: Introspection & Advanced Visualizations

In this final assignment, you should try out some more involved visualization approaches. We are asking you to “go out there” and either implement some method(s) yourself, or try out existing tools. Keep the recent reading list in mind. Below, you will find some proposals. Feel free to do something completely different though if you like! Make sure you share cool/interesting findings before the next session so that we can talk about them.

Note: In what follows, we will sometimes refer to “values of interest” in a network. This could mean any of the following (we are assuming the standard MNIST classification task as a default, but you can use whatever task you like):

- the activation of a single hidden unit (or a single feature map in a convolutional layer),
- the average (or total) activation of a whole layer,
- the pre-softmax score (logit) for one of the classes,
- the post-softmax probability for one of the classes.

In general, you should try several of these options to get an idea of what the different network parts are doing.
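
In practice, a value of interest is just a tensor derived from your model. A minimal TF 1.x sketch (the model here is a stand-in for your own trained network, and the unit/class indices are arbitrary):

```python
import tensorflow as tf

# Minimal stand-in model; substitute your own trained network.
inputs = tf.placeholder(tf.float32, [None, 784])
hidden = tf.layers.dense(inputs, 128, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 10)

# Some possible "values of interest" (indices chosen arbitrarily):
unit_value = hidden[:, 3]                     # activation of a single unit
layer_value = tf.reduce_mean(hidden, axis=1)  # mean activation of a layer
class_logit = logits[:, 7]                    # pre-softmax score for class 7
class_prob = tf.nn.softmax(logits)[:, 7]      # post-softmax probability
```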

Maximally Activating Dataset Examples

This is probably the simplest thing you could do that allows some insight into hidden layers: Train a network and then run some examples (e.g. the test dataset) through it, recording the value of interest for each example. Then, look at the examples producing the highest such values (e.g. all above a certain threshold, or the top k). Try to find invariances between these examples to form hypotheses as to what the respective unit/layer/etc. preferentially responds to.
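
A sketch of this procedure, continuing the snippet above (`inputs` and `unit_value` are defined there; the test data is a random stand-in):

```python
import numpy as np
import tensorflow as tf

# Stand-in for the real test set (N x 784).
test_images = np.random.rand(100, 784).astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # or restore trained weights
    values = sess.run(unit_value, feed_dict={inputs: test_images})

k = 9                                    # how many top examples to inspect
top_idx = np.argsort(values)[-k:][::-1]  # indices of the k highest values
top_examples = test_images[top_idx]      # look for invariances among these
```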

Finding Maximally Activating Inputs via Gradient Ascent

This is comparable to the “Deepvis” reading. The basic setup is fairly simple to implement in TensorFlow: build your (already trained) network as usual, but make the input a variable rather than a placeholder, freeze all network weights, and then run gradient ascent on the value of interest with respect to the input.

Note: When using activations such as tanh, maximization doesn’t necessarily make sense (maximizing the absolute value might be more suitable). However, we are assuming ReLUs as a “default” here.

Assuming you correctly defined the network not to be trainable, the only variable left for the optimizer to change is the input. Visualize whatever comes out of the optimization process. Note that the result may differ depending on the random initialization (different local optima). The results might not look as you expected them to. Here are some tips to make them a little nicer (some of these correspond to the regularizers from the Deepvis reading); a sketch incorporating them follows below:

- penalize the L2 norm of the input so that pixel values do not blow up,
- clip the input to the valid pixel range (e.g. [0, 1]) after each step,
- occasionally blur the input slightly to suppress high-frequency noise.
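
A minimal TF 1.x sketch of the whole procedure (the stand-in model, layer sizes, class index, and hyperparameters are all placeholder choices):

```python
import tensorflow as tf

# The input is the only variable the optimizer is allowed to change.
input_var = tf.Variable(tf.random_uniform([1, 784], 0.0, 1.0))

# Stand-in for your trained network; in practice, restore the trained
# weights and make sure they are frozen (trainable=False).
hidden = tf.layers.dense(input_var, 128, activation=tf.nn.relu,
                         trainable=False)
logits = tf.layers.dense(hidden, 10, trainable=False)

value_of_interest = logits[0, 7]            # e.g. the logit for class 7
l2_penalty = 1e-3 * tf.reduce_sum(tf.square(input_var))
objective = value_of_interest - l2_penalty  # maximize value, keep norm small

# Gradient *ascent*: minimize the negative objective, w.r.t. the input only.
train_op = tf.train.AdamOptimizer(0.05).minimize(-objective,
                                                 var_list=[input_var])
clip_op = tf.assign(input_var, tf.clip_by_value(input_var, 0.0, 1.0))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(500):
        sess.run(train_op)
        sess.run(clip_op)  # keep the input in the valid pixel range
    image = sess.run(input_var).reshape(28, 28)  # visualize this
```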

LSTMVis

In case you are sick of MNIST by now, feel free to try LSTMVis, e.g. with your character RNNs. If you decide to do this, you will need to read the docs for yourself (and definitely share your observations!). Note that the installation instructions only mention Python 2.7. The tool is generally framework-agnostic (you need to export your data into .hdf5 files), but there seems to be a state reader for TensorFlow (linked towards the bottom of the GitHub readme).
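
Exporting the data boils down to writing HDF5 files, e.g. with h5py. A rough sketch with random stand-in arrays (the file and dataset names must match what you declare in your LSTMVis config, so check the docs rather than trusting the names used here):

```python
import h5py
import numpy as np

# Stand-ins: replace with the hidden states recorded from your RNN
# (time steps x state size) and the corresponding input token ids.
states = np.random.randn(1000, 128).astype(np.float32)
word_ids = np.random.randint(0, 50, size=1000)

with h5py.File("states.hdf5", "w") as f:
    f.create_dataset("states1", data=states)   # name set in the config
with h5py.File("train.hdf5", "w") as f:
    f.create_dataset("words", data=word_ids)
```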