First steps with TensorFlow

Part 1 – Basics

TensorFlow is everywhere these days. It appears to be becoming the library of choice for deep learning applications and, thanks to recent advances in hardware (TPU performance), might gain even more momentum in the near future.

The main reason to use TensorFlow is to build deep learning systems, and for an experienced developer it is tempting to dive right into advanced topics like CNNs and RNNs. TensorFlow supports the most common deep learning architectures out of the box, and many additional resources are available online. Playing with these things can be extremely fun; see, for example, the famous article by Andrej Karpathy on character RNNs (a TensorFlow implementation of character RNNs is here).

Things can get frustrating, however, if you want to move beyond prefabricated examples and try your own modifications and new ideas. Example code is cluttered with command-line parsing, TensorBoard instrumentation, and so on, and the odds are that module and function names have changed since the example was written. The TensorFlow documentation on, say, RNNs, on the other hand, assumes that you already have a deep understanding of the library and its common usage patterns.

At that point, for me at least, it was time to go back to basics and understand all the moving parts of TensorFlow first.

Hello World

Let’s get our hands dirty and run a simple computation in TensorFlow, i.e. calculating the sum of two floats. We initialize TensorFlow by importing the module:

import tensorflow as tf
#

Performing a computation in TensorFlow is slightly different from doing the same computation in plain Python. One first defines the structure (“graph”) of the computation, then starts a TensorFlow environment (“session”) for the graph, and finally executes the graph in the context of the session. We begin by defining the input parameters and their associated data types.

param_x = tf.placeholder(dtype=tf.float32)
param_y = tf.placeholder(dtype=tf.float32)
#

We define the addition operation on param_x and param_y using the TensorFlow built-in function tf.add().

op_x_plus_y = tf.add(param_x, param_y)
#

Now the computation is defined and we create a TensorFlow session

sess = tf.Session()
#

and use it to evaluate op_x_plus_y, passing to it the values 20 for param_x and 1.1 for param_y:

result = sess.run(op_x_plus_y, feed_dict={param_x: 20, param_y: 1.1})
#

As you can see, the evaluation is triggered by the method sess.run(), and the input values are passed as a Python dictionary via the feed_dict argument. We print the result

result
#
21.1
#

and see that TensorFlow got the calculation right: \( x = 20 \), \( y = 1.1 \), \( x + y = 21.1 \)
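
To make the define-then-run pattern concrete, here is a minimal sketch in plain Python that mimics it without TensorFlow. The names ToyPlaceholder, ToyAdd and toy_run are hypothetical and invented for illustration only; they are not TensorFlow APIs, and the sketch ignores data types, devices and everything else a real session handles.

```python
# Toy model of deferred execution; none of these names exist in TensorFlow.

class ToyPlaceholder:
    """A named input slot whose value is supplied only at run time."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, feed_dict):
        # Look up the value that was fed for this placeholder.
        return feed_dict[self]


class ToyAdd:
    """A node that records its two inputs but computes nothing yet."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self, feed_dict):
        # Recursively evaluate both inputs, then add the results.
        return self.left.evaluate(feed_dict) + self.right.evaluate(feed_dict)


def toy_run(node, feed_dict):
    """Stand-in for sess.run(): evaluate a node with the given inputs."""
    return node.evaluate(feed_dict)


# Step 1: define the structure of the computation (no arithmetic happens here).
toy_x = ToyPlaceholder("x")
toy_y = ToyPlaceholder("y")
toy_sum = ToyAdd(toy_x, toy_y)

# Step 2: execute it, feeding concrete values for the placeholders.
toy_result = toy_run(toy_sum, {toy_x: 20, toy_y: 1.1})
print(toy_result)
```

The same toy_sum node can be evaluated again with different inputs, which is the point of separating the definition of a computation from its execution.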

Finally, in order to make this a proper “Hello, World!” example, we create a slightly more sophisticated variant.

magic_numbers = tf.placeholder(dtype=tf.int32, shape=[None])
offset = tf.placeholder(dtype=tf.int32)
magic_numbers_plus_offset = magic_numbers + offset
#
# Keep the result in its own variable so the tensor can be evaluated again later.
shifted = sess.run(magic_numbers_plus_offset, feed_dict={offset: 10, magic_numbers: [62, 91, 98, 98, 101, 34, 22, 77, 101, 104, 98, 90, 23]})
[chr(i) for i in shifted]
#
['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd', '!']
#

This example contains two novelties:

  • The parameter shape=[None] in the first call to tf.placeholder() indicates that the input is a one-dimensional array of unknown size.
  • The add operation is created with the overloaded operator + rather than with tf.add().

We skipped the creation of a session because the session from the previous example was still active.
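
The arithmetic behind the output can be double-checked in plain Python, without TensorFlow. The sketch below adds the scalar offset to every element of the list, mirroring what happens when a scalar is added to a one-dimensional input; the variable names magic, shift, decoded and message are illustrative only.

```python
# Plain-Python check of the "Hello, World!" decoding (no TensorFlow involved).
magic = [62, 91, 98, 98, 101, 34, 22, 77, 101, 104, 98, 90, 23]
shift = 10

# Add the scalar offset to every element, then map each code point to a character.
decoded = [chr(n + shift) for n in magic]
message = "".join(decoded)
print(message)
```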

Basic TensorFlow mechanics

The hello world example glossed over the building blocks of the TensorFlow runtime environment: graphs, sessions, and devices.

Graphs

Any computation in TensorFlow needs to be defined in the context of a graph. In the examples above we did not notice the presence of a graph because the tf.placeholder() and tf.add() statements were implicitly using the default graph. Accordingly, the session created by tf.Session() was associated with the default graph. A graph in the TensorFlow sense defines the set of computations that can be performed by a TensorFlow program. The term is misleading insofar as a TensorFlow graph may actually be a collection of disjoint graphs that can be executed independently. The default graph can be retrieved with tf.get_default_graph():

tf.get_default_graph()
#
<tensorflow.python.framework.ops.Graph at 0x23286632748>
#

We can list all the operations (the nodes) of the graph:

tf.get_default_graph().get_operations()
#
[<tf.Operation 'Placeholder' type=Placeholder>,
 <tf.Operation 'Placeholder_1' type=Placeholder>,
 <tf.Operation 'Add' type=Add>,
 <tf.Operation 'Placeholder_2' type=Placeholder>,
 <tf.Operation 'Placeholder_3' type=Placeholder>,
 <tf.Operation 'add' type=Add>]
#

As we can see, we have inadvertently cluttered the default graph with the operations from the two previous examples. We therefore clean up the default graph

tf.reset_default_graph()
tf.get_default_graph().get_operations()
#
[]
#

and redo the first example above in a separate graph

g1 = tf.Graph()
#
with g1.as_default():
    param_x = tf.placeholder(dtype=tf.float32, name='x')