Introduction to Deep Learning
CNNs and the Dataset API

In this assignment, you will create a better model for the MNIST dataset using convolutional neural networks. You will also get to know TensorFlow's (current) main way of feeding data to the training process, which will be useful for more complex datasets.
You should have seen that modifying layer sizes, changing activation functions, etc. is simple: you can generally change parts of the model without affecting the rest of the program. In fact, you can change the full pipeline from input to model output without having to change anything else.
Replace your MLP with a CNN. You can check this tutorial for an example. You can ignore dropout for now. Note: Depending on your machine, training a CNN may take much longer than the MLPs we've seen so far. Also, processing the full test set in one go for evaluation might be too much for your RAM. In that case, you could break up the test set into smaller chunks and average the results.
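A minimal sketch of such chunked evaluation (assuming placeholders x and y, an accuracy tensor and a running session, plus numpy test arrays; all of these names are just stand-ins for whatever your code uses):

```python
import numpy as np

def chunked_accuracy(sess, accuracy, x, y, test_images, test_labels,
                     chunk_size=1000):
    """Evaluate accuracy over the test set in chunks and average.

    Averaging chunk accuracies is only exact if all chunks have the
    same size (true for MNIST's 10000 test images with chunk_size
    1000); otherwise, weight each chunk by its length.
    """
    accuracies = []
    for start in range(0, len(test_images), chunk_size):
        accuracies.append(sess.run(accuracy, feed_dict={
            x: test_images[start:start + chunk_size],
            y: test_labels[start:start + chunk_size]}))
    return np.mean(accuracies)
```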
For building the network, check out the tf.layers API. It offers a decent middle ground between low-level control and convenience (defining tf.Variables by hand gets old quickly). Also, use AdamOptimizer instead of the basic GradientDescentOptimizer. This will usually lead to much faster learning without the need for manual tuning of the learning rate or other parameters. We will discuss advanced optimization strategies later in the class, but the basic idea behind Adam is that it automatically chooses/adapts a per-parameter learning rate as well as incorporating momentum. Using Adam, your CNN should beat your MLP after only a few hundred steps of training.
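To illustrate (this is just a sketch with example layer sizes, not a prescribed architecture), a small CNN built with tf.layers and trained with Adam might look like this:

```python
import tensorflow as tf

# Inputs: 28x28 grayscale images and integer class labels.
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
labels = tf.placeholder(tf.int64, [None])

# Two conv/pool blocks, then a small classifier head.
conv1 = tf.layers.conv2d(x, filters=32, kernel_size=5, padding="same",
                         activation=tf.nn.relu, name="conv1")
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding="same",
                         activation=tf.nn.relu, name="conv2")
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)

flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
hidden = tf.layers.dense(flat, 512, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 10)

loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
# Adam's default learning rate (0.001) usually works out of the box.
train_step = tf.train.AdamOptimizer().minimize(loss)
```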
Having set up your basic CNN, you should include some visualization. In particular, one thing that is often used to diagnose CNN performance is visualizing the filters, i.e. the weights of the convolutional layers. The only filters that are straightforward to interpret are the ones in the first layer, since they operate directly on the input. The filter matrix should have a shape of filter_width x filter_height x 1 x n_filters. Visualize the n_filters filters as individual images using tf.summary.image; this way, you can even see how the filters develop over training. Comment on what these filters seem to be recognizing (this can be difficult with small filter sizes such as 5 x 5). Experiment with different filter sizes as well (maybe up to 28 x 28?). See if there are any redundant filters (i.e. multiple filters recognizing the same patterns) and whether you can achieve a similar performance using fewer filters. In principle, such redundancy checking can be done for higher layers as well, but note that there each filter has as many channels as there are filters in the layer below (you would need to visualize these separately).

Note: Accessing the filters when using the layers API is not trivial because they are created "under the hood". Check out tf.get_collection for a way to get them (and feel free to share any other ways you can find ;)).
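For example, assuming the first convolutional layer was created with name="conv1" (as in the sketch above), one way to fetch its kernel and log it as images could be:

```python
# tf.layers.conv2d(..., name="conv1") creates a variable "conv1/kernel"
# under the hood; grab it from the trainable variables collection.
kernel = [v for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
          if v.name == "conv1/kernel:0"][0]

# The kernel has shape [height, width, 1, n_filters], while
# tf.summary.image expects [batch, height, width, channels], so move
# the filter dimension to the front to get one image per filter.
filters = tf.transpose(kernel, [3, 0, 1, 2])
tf.summary.image("conv1_filters", filters, max_outputs=32)
```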
It should go without saying that loading numpy arrays and taking slices of these as batches isn't a great way of providing data to the training algorithm. For example, what if we are working with a dataset that doesn't fit into memory? The currently recommended way of handling datasets is via the tf.contrib.data module. Now is a good time to take some first steps with this module. Read the Programmer's Guide section on it. You can ignore the parts on high-level APIs as well as anything regarding TFRecords and tf.Example (we will get to these later). Then try to achieve the following tasks:
- Build a dataset of the MNIST images you extracted with conversions.py (see Assignment 1).
- Add the corresponding labels: run conversions.py with the -p flag and use these text files containing labels. You can assume that the labels appear in the same order as the pictures (assuming you didn't rename them).
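What this could look like in code is sketched below, assuming a folder of PNG files and a text file with one label per line (both paths are made up here); depending on your TensorFlow version, the Dataset class lives in tf.data or tf.contrib.data:

```python
from glob import glob
import tensorflow as tf

# Hypothetical locations of the images and labels from conversions.py.
image_files = sorted(glob("mnist_png/*.png"))
with open("labels.txt") as f:
    labels = [int(line) for line in f]

def parse(filename, label):
    # Decode a PNG into a float image in [0, 1].
    image = tf.image.decode_png(tf.read_file(filename), channels=1)
    image.set_shape([28, 28, 1])
    return tf.cast(image, tf.float32) / 255.0, label

dataset = tf.data.Dataset.from_tensor_slices((image_files, labels))
dataset = dataset.map(parse).shuffle(10000).batch(32)

iterator = dataset.make_one_shot_iterator()
images, batch_labels = iterator.get_next()  # feed these into your model
```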
Iterate over the dataset and plot some of the incoming data to verify that it works. If you want to, train a model with this new way of inputting data instead.

Note that the TensorFlow guide often uses the three operations shuffle, batch and repeat. Think about how the results differ when you change the order of these operations (there are six orderings in total). You can experiment with a simple Dataset.range dataset.
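For example, a sketch comparing just two of the six orderings (try the rest yourself):

```python
import tensorflow as tf

# shuffle -> batch -> repeat: elements are shuffled individually, and
# batches never mix elements from different passes over the data.
ds_a = tf.data.Dataset.range(8).shuffle(8).batch(4).repeat(2)
# batch -> shuffle -> repeat: only whole batches get shuffled, so the
# elements inside each batch always stay together.
ds_b = tf.data.Dataset.range(8).batch(4).shuffle(2).repeat(2)

for name, ds in [("shuffle-batch-repeat", ds_a),
                 ("batch-shuffle-repeat", ds_b)]:
    next_elem = ds.make_one_shot_iterator().get_next()
    print(name)
    with tf.Session() as sess:
        try:
            while True:
                print(sess.run(next_elem))
        except tf.errors.OutOfRangeError:
            pass
```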
Note how no order implements the following scenario:
Can you find a way to implement this scenario? Hint: You will want to look at iterator types other than the simple one-shot iterator.
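One building block worth knowing here is the initializable iterator, which you can re-initialize whenever you like and thereby gain explicit control over epoch boundaries. A minimal sketch of the mechanics (what exactly you do at each epoch boundary depends on the scenario):

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(8).shuffle(8).batch(4)  # note: no repeat
iterator = dataset.make_initializable_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    for epoch in range(3):
        sess.run(iterator.initializer)  # start a fresh (re-shuffled) pass
        try:
            while True:
                print("epoch", epoch, sess.run(next_batch))
        except tf.errors.OutOfRangeError:
            pass  # the epoch has ended -- e.g. run an evaluation here
```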