Introduction to Deep Learning
tf.Estimator & Assorted Programming Puzzles

You have hopefully seen that working with neural networks becomes more comfortable as you move from low-level to higher-level interfaces. However, if you have been following the tutorials so far, you have still been defining things such as training loops and evaluation procedures yourself. This is rather annoying – compare this to libraries such as scikit-learn, where such things can usually be done in a single line of code.
Luckily, Tensorflow also comes with similar high-level interfaces. In this assignment, we will be having a look at Tensorflow’s own Estimator class. Unfortunately, you will need to do quite a bit of extra reading again to get an overview. You can have a look at some or all of the following docs:
- The tf.data overview.
- The Estimator API.

Note that the above tutorials mention “feature columns” quite a lot; feel free to ignore these beyond what is needed to follow along, since we won’t be needing them anytime soon.
Aside from the above, there is a short article on Estimators in the Programmer’s Guide. Finally, this tutorial walks you through building a CNN for the MNIST task, allowing you to place the Estimator API in context with the methods you used in previous assignments.
Make sure you have a functioning CNN using this interface, supporting all of training, evaluation and prediction on new inputs. You should have a grasp on these core components of building models with tf.Estimator:
- Input functions, typically built with tf.data.
- A model function covering the train, predict and evaluate modes, usually based on tf.layers functions.

Once again, if you haven’t done so, use the Fashion-MNIST dataset instead for more of a challenge. With the more convenient Estimator interface, experimenting with different models/hyperparameters is hopefully more comfortable. Try to achieve the best results you can!
Some Estimator fun facts:
- The tf.summary.merge_all() op is automatically set up. No need to include this in your model functions when using Tensorboard.
- A tf.summary.scalar is automatically set up for the loss in training mode.
- To see periodic training output (e.g. the loss) on the console, call tf.logging.set_verbosity(tf.logging.INFO).

Following all those tutorials can be boring, so we will now be focusing on getting to know the Tensorflow core a little more. In the long run, knowing what tools you have available will allow you to write better/shorter/faster programs when you go beyond the straightforward models we’ve been looking at.
Below you will find several (mostly disconnected) small programming tasks to be solved using Tensorflow functions. Most of these can be solved in just a few lines of code, but you will need to find the right tools first. The API docs will be indispensable here. Since these can be a bit overwhelming at first, hints are included for each task.
1. Given a tensor of shape (?, n), extract the k (k <= n) highest values for each row into a tensor of shape (?, k). Hint: There might be a function to get the “top k” values of a tensor.
2. Building on the previous task, create a tensor of shape (?, n) where all but the top k values for each row are zero. Try doing this with a 1D tensor of shape (n,) (i.e. one row) first. Getting it right for a 2D tensor is more tricky; consider this a bonus. Hint: You should look for a way to “scatter” a tensor of values into a different tensor.
3. Given a scalar a and an input tensor of length T, create a new length-T tensor where new[0] = input[0] and new[t] = a * new[t-1] + (1-a) * input[t] otherwise. Do not use tf.train.ExponentialMovingAverage. Hint: You might want to have a look at higher-order functions to simulate a loop over the input. Alternatively, with the full input already being available, you might be able to find a way to do this without looping. Do not use Python loops!
4. Given tensors x, y, z, all of the same (arbitrary) shape, create a new tensor that takes values from y where x is even and from z where x is odd. Hint: An op from Sequence Comparison and Indexing could help.
5. Given two tensors of length n, create a tensor of shape (n, n) where element i,j is the ith element of the first tensor plus the jth element of the second tensor. No loops! Hint: Tensorflow supports broadcasting much like numpy.
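To sanity-check your Tensorflow solutions, it can help to know exactly what output is expected. Below is a plain-NumPy sketch of the intended semantics for a few of the tasks. The function names are made up for illustration, and these are deliberately *not* the Tensorflow answers – note in particular that the EMA version uses a Python loop as a reference, which the task itself forbids:

```python
import numpy as np

# Top-k task: the k highest values of each row of a (?, n) array.
def top_k_rows(x, k):
    # Sort each row in descending order and keep the first k columns.
    return -np.sort(-x, axis=1)[:, :k]

# Moving-average task: new[0] = input[0],
# new[t] = a * new[t-1] + (1-a) * input[t].
# Reference only -- the TF solution must avoid Python loops.
def ema(a, inp):
    out = np.empty(len(inp), dtype=float)
    out[0] = inp[0]
    for t in range(1, len(inp)):
        out[t] = a * out[t - 1] + (1 - a) * inp[t]
    return out

# Even/odd selection task: take from y where x is even, from z where odd.
def even_odd_select(x, y, z):
    return np.where(x % 2 == 0, y, z)

# Broadcasting task: result[i, j] = u[i] + v[j], no loops.
def outer_sum(u, v):
    return u[:, None] + v[None, :]

print(top_k_rows(np.array([[3, 1, 2], [0, 5, 4]]), 2))  # [[3 2] [5 4]]
print(outer_sum(np.array([1, 2]), np.array([10, 20])))  # [[11 21] [12 22]]
```

Once your Tensorflow versions produce the same outputs on small hand-made inputs like these, you can be fairly confident they are correct.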