WS 2017/18

Introduction to Deep Learning


Assignment 1: First Steps

In this assignment, you will implement and train your first deep model on a well-known image classification task. You will also familiarize yourself with the Tensorflow library we will be using for this course.

Setting Up

Install Python (3.x is recommended) if you haven’t done so already, and install Tensorflow. This should be as simple as running

pip install tensorflow

in your console. This will install the CPU version; for now, there is no need to bother with the GPU version. Next (this step is optional), download the raw MNIST files from Yann LeCun’s website. MNIST is a collection of handwritten digits and a popular (albeit by now trivialized) benchmark for image classification models. Download our conversion script, unpack the data, put the script in the same folder, and run it as

python conversions.py -c -n

This will create both CSV tables and numpy arrays of the data (you don’t need the CSVs, but the arrays are created from them). If you want to, you can also append the flag -p to create folders with the actual pictures and get an impression of the data (this will take a bit longer).
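You can then load the arrays into Python with numpy, roughly like this (the file names below are an assumption for illustration; check which files conversions.py actually created on your machine):

import numpy as np

# Hypothetical file names -- use whatever conversions.py actually wrote.
train_images = np.load("mnist_train_images.npy")
train_labels = np.load("mnist_train_labels.npy")
print(train_images.shape, train_labels.shape)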

Tensorflow Basics

Get started with Tensorflow. There are many tutorials on diverse topics on the website, as well as API documentation. Note that some of the tutorials contain outdated information, and the API docs can be lackluster in places. For now, the following should be sufficient:

Play around with the example code snippets. Change them around and see if you can predict what’s going to happen. Make sure you understand what you’re doing.
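For instance, here is a minimal sketch of the core mechanic: a computation graph is defined first, then executed in a session with concrete values fed in.

import tensorflow as tf

# Define the graph: two inputs supplied at run time, one product node.
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
c = a * b

# Run the graph in a session, feeding concrete values for a and b.
with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: 3.0, b: 4.0}))  # prints 12.0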

Building A Deep Model

If you followed the tutorial linked above, you have already built a linear classification model (softmax regression). Next, turn this into a deep model by adding a hidden layer between inputs and outputs. Voilà! You have created a Multilayer Perceptron.
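As a rough sketch in the low-level interface (the hidden layer size and learning rate below are assumptions, not prescribed values), the resulting model could look like this:

import tensorflow as tf

n_inputs, n_hidden, n_classes = 784, 256, 10

x = tf.placeholder(tf.float32, [None, n_inputs])
y = tf.placeholder(tf.int64, [None])

# Hidden layer: affine transformation followed by a nonlinearity.
w1 = tf.Variable(tf.truncated_normal([n_inputs, n_hidden], stddev=0.1))
b1 = tf.Variable(tf.zeros([n_hidden]))
hidden = tf.nn.relu(tf.matmul(x, w1) + b1)

# Output layer: one logit per digit class.
w2 = tf.Variable(tf.truncated_normal([n_hidden, n_classes], stddev=0.1))
b2 = tf.Variable(tf.zeros([n_classes]))
logits = tf.matmul(hidden, w2) + b2

# Cross-entropy loss, gradient descent, and an accuracy metric.
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
correct = tf.equal(tf.argmax(logits, 1), y)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

Training then consists of repeatedly running train_step in a session while feeding batches of images and labels into x and y.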

Next, you should explore this model: experiment with different hidden layer sizes, activation functions, or weight initializations (a few example variations are sketched below). See if you can make any observations on how changing these parameters affects the model’s performance. Also, reflect on the Tensorflow interface: if you followed the tutorials as asked, you have been using a very low-level approach to defining models as well as training and evaluating them. Which of these parts do you think should be wrapped in higher-level interfaces? Do you feel like you are forced to provide redundant information when defining your model?
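Each of these experiments amounts to a one- or two-line change in the sketch above; for example (the initializer shown is just one common choice):

import tensorflow as tf

n_inputs, n_hidden = 784, 512  # try a wider (or narrower) hidden layer

x = tf.placeholder(tf.float32, [None, n_inputs])

# Xavier-style initialization instead of a fixed-stddev truncated normal:
w1 = tf.get_variable("w1", [n_inputs, n_hidden],
                     initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.zeros([n_hidden]))

# tanh instead of ReLU as the hidden activation:
hidden = tf.nn.tanh(tf.matmul(x, w1) + b1)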

Bonus

Feel free to explore Tensorflow and MNIST some more. For example, there are numerous higher-level APIs that you can try out, such as tf.layers or tf.contrib.learn. Also, the popular Keras library is now integrated into Tensorflow as tf.contrib.keras. These APIs make tasks such as defining, training, and evaluating models a lot easier. Using the low-level interface is important to make sure you understand what is going on in your program, but for serious tasks you will want to use one of the higher-level APIs.
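To give an impression, with tf.layers the model definition from above shrinks to a few lines (a sketch; the layer sizes are again assumptions):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
# tf.layers.dense creates and initializes the weights and biases for you.
hidden = tf.layers.dense(x, 256, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 10)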

There are also numerous ways to explore your model some more. For one, you could add more hidden layers and see how this affects the model. You could also try your hand at some basic visualization and model inspection: for example, visualize some of the images your model classifies incorrectly. Can you find out why your model has trouble with these?
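One way to do this, sketched with matplotlib (the argument names and shapes are assumptions about how you stored your test data and the model’s predicted classes):

import numpy as np
import matplotlib.pyplot as plt

def show_misclassified(images, labels, preds, n=9):
    # images: (N, 784) floats; labels, preds: (N,) ints (assumed layout).
    wrong = np.flatnonzero(preds != labels)
    for plot_idx, img_idx in enumerate(wrong[:n], start=1):
        plt.subplot(3, 3, plot_idx)
        plt.imshow(images[img_idx].reshape(28, 28), cmap="gray")
        plt.title("true %d, predicted %d" % (labels[img_idx], preds[img_idx]))
        plt.axis("off")
    plt.show()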

Finally, think about the semantics of your model(s). Can you describe what a specific activation value (in the output as well as in the hidden layer) “means”? Can you do the same for the weights that were learned during training? Start with the logistic regression model (which has no hidden layers) and then proceed to your MLP. Take note of how much more difficult it becomes to reason about your network as it gets deeper. Visualization is extremely useful here!
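For the logistic regression model, a natural starting point is to reshape each class’s weight vector into an image (a sketch; w is assumed to be the learned (784, 10) weight matrix, obtained e.g. via sess.run on the weight variable):

import numpy as np
import matplotlib.pyplot as plt

def show_class_weights(w):
    # w: (784, 10) weight matrix of the softmax regression model (assumed).
    m = np.abs(w).max()  # symmetric color scale so that white means zero
    for digit in range(10):
        plt.subplot(2, 5, digit + 1)
        plt.imshow(w[:, digit].reshape(28, 28),
                   cmap="seismic", vmin=-m, vmax=m)
        plt.title(str(digit))
        plt.axis("off")
    plt.show()

Red regions then mark pixels whose intensity increases the class score, blue regions pixels that decrease it.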