Introduction to Deep Learning
tf.Estimator & Assorted Programming Puzzles

You have hopefully seen that working with neural networks becomes more comfortable as you move from low-level to higher-level interfaces. However, if you have been following the tutorials so far, you have still been defining things such as training loops and evaluation procedures yourself. This is rather annoying – compare this to libraries such as scikit-learn, where such things can usually be done in a single line of code.
Luckily, Tensorflow also comes with similar high-level interfaces. In this assignment, we will be having a look at Tensorflow’s own Estimator class. Unfortunately, you will need to do quite a bit of extra reading again to get an overview. You can have a look at some or all of the following docs:
- The tf.data overview.
- The Estimator API.

Note that the above tutorials mention “feature columns” quite a lot; feel free to ignore these beyond what is needed to follow along, since we won’t be needing them anytime soon.
Aside from the above, there is a short article on Estimators in the Programmer’s Guide. Finally, this tutorial walks you through building a CNN for the MNIST task, allowing you to place the Estimator API in context with the methods you used in previous assignments.
Make sure you have a functioning CNN using this interface, supporting all of training, evaluation and prediction on new inputs. You should have a grasp on these core components of building models with tf.Estimator:
- Input functions, typically built with tf.data.
- A model function covering the train, predict and evaluate modes, usually based on tf.layers functions.

Once again, if you haven’t done so, use the Fashion-MNIST dataset instead for more of a challenge. With the more convenient Estimator interface, experimenting with different models/hyperparameters is hopefully more comfortable. Try to achieve the best results you can!
Some Estimator fun facts:
- The tf.summary.merge_all() op is automatically set up. No need to include this in your model functions when using Tensorboard.
- A tf.summary.scalar is automatically set up for the loss in training mode.
- To see periodic training output (e.g. the loss) on the console, call tf.logging.set_verbosity(tf.logging.INFO).

Following all those tutorials can be boring, so we will now be focusing on getting to know the Tensorflow core a little more. In the long run, knowing what tools you have available will allow you to write better/shorter/faster programs when you go beyond the straightforward models we’ve been looking at.
Below you will find several (mostly disconnected) small programming tasks to be solved using Tensorflow functions. Most of these can be solved in just a few lines of code, but you will need to find the right tools first. The API docs will be indispensable here. Since these can be a bit overwhelming at first, hints are included for each task.
1. Given a tensor of shape (?, n), extract the k (k <= n) highest values for each row into a tensor of shape (?, k). Hint: There might be a function to get the “top k” values of a tensor.
2. Building on the previous task, create a tensor of shape (?, n) where all but the top k values for each row are zero. Try doing this with a 1D tensor of shape (n,) (i.e. one row) first. Getting it right for a 2D tensor is more tricky; consider this a bonus. Hint: You should look for a way to “scatter” a tensor of values into a different tensor.
3. Given a scalar a and an input tensor of length T, create a new length-T tensor where new[0] = input[0] and new[t] = a * new[t-1] + (1-a) * input[t] otherwise. Do not use tf.train.ExponentialMovingAverage. Hint: You might want to have a look at higher-order functions to simulate a loop over the input. Alternatively, with the full input already being available, you might be able to find a way to do this without looping. Do not use Python loops!
4. Given tensors x, y, z, all of the same (arbitrary) shape, create a new tensor that takes values from y where x is even and from z where x is odd. Hint: An op from Sequence Comparison and Indexing could help.
5. Given two tensors of length n, create a tensor of shape (n, n) where element i,j is the ith element of the first tensor plus the jth element of the second tensor. No loops! Hint: Tensorflow supports broadcasting much like numpy.
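To sanity-check your Tensorflow solutions, it can help to know exactly what output is expected. Below is a plain-NumPy sketch of the intended semantics for a few of the tasks. The function names are made up for illustration, and these are deliberately *not* the Tensorflow answers – note in particular that the EMA version uses a Python loop as a reference, which the task itself forbids:

```python
import numpy as np

# Top-k task: the k highest values of each row of a (?, n) array.
def top_k_rows(x, k):
    # Sort each row in descending order and keep the first k columns.
    return -np.sort(-x, axis=1)[:, :k]

# Moving-average task: new[0] = input[0],
# new[t] = a * new[t-1] + (1-a) * input[t].
# Reference only -- the TF solution must avoid Python loops.
def ema(a, inp):
    out = np.empty(len(inp), dtype=float)
    out[0] = inp[0]
    for t in range(1, len(inp)):
        out[t] = a * out[t - 1] + (1 - a) * inp[t]
    return out

# Even/odd selection task: take from y where x is even, from z where odd.
def even_odd_select(x, y, z):
    return np.where(x % 2 == 0, y, z)

# Broadcasting task: result[i, j] = u[i] + v[j], no loops.
def outer_sum(u, v):
    return u[:, None] + v[None, :]

print(top_k_rows(np.array([[3, 1, 2], [0, 5, 4]]), 2))  # [[3 2] [5 4]]
print(outer_sum(np.array([1, 2]), np.array([10, 20])))  # [[11 21] [12 22]]
```

Once your Tensorflow versions produce the same outputs on small hand-made inputs like these, you can be fairly confident they are correct.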