As we’ve seen in the TensorFlow introduction, having access to the computation graph is a powerful feature. We can define any operation we’d like and TensorFlow (or Theano) will compute the gradients and perform the optimisation for us. That’s great!
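To make that concrete, here is a minimal sketch using TensorFlow's tf.gradients (the toy function y = x² + 3x is my own illustration, not part of the tutorial):

import tensorflow as tf

# A toy graph: y = x^2 + 3x. TensorFlow derives dy/dx = 2x + 3 symbolically.
x = tf.placeholder(tf.float32)
y = tf.square(x) + 3 * x
grad = tf.gradients(y, x)  # symbolic gradient, no manual calculus required

sess = tf.Session()
print(sess.run(grad, feed_dict={x: 2.0}))  # prints [7.0]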
However, if you always define the same kinds of operations, you’ll eventually find this approach a bit tedious. This is where we need a higher level of abstraction that allows us to define our neural net in terms of layers and not in terms of operations.
For this purpose I’d like to introduce Keras. Keras runs on top of Theano or TensorFlow and allows us to quickly and easily define a neural net. To demonstrate, I am going to compare the basic MNIST tutorial provided by TensorFlow with an implementation of the same simple network in Keras.
The MNIST dataset is a collection of images of handwritten digits. The dataset is available on Yann LeCun’s website. In the code below it is downloaded automatically using TensorFlow utility code.
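If you want to peek at what the loader returns, here is a short sketch (the shapes are what I recall the tutorial split producing — 55,000 training and 10,000 test images — so treat them as an assumption):

from tensorflow.examples.tutorials.mnist import input_data

# Downloads the MNIST archives into MNIST_data/ on the first run
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

print(mnist.train.images.shape)  # expected: (55000, 784), flattened 28x28 images
print(mnist.train.labels.shape)  # expected: (55000, 10), one-hot encoded digits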
Following the TensorFlow tutorial we are going to define a neural network with a single layer in order to classify the digit images. First let’s see what the code looks like using TensorFlow (this code is taken directly from the TensorFlow tutorial).
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Neural network definition
x = tf.placeholder(tf.float32, [None, 784])  # Input layer of flattened 28x28 images
W = tf.Variable(tf.zeros([784, 10]))         # Weights: turn 784 pixels into 10 class scores
b = tf.Variable(tf.zeros([10]))              # Bias
u = tf.matmul(x, W) + b
y = tf.nn.softmax(u)

# Labels
y_ = tf.placeholder(tf.float32, [None, 10])  # known outcomes

# Loss
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# Training operation
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Evaluation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
So how many lines do we have here? 17 lines (excluding imports and dataset loading).
Now let’s see how to implement the same network with Keras. But first things first: let’s install Keras. It can be installed with the regular Python package manager:
sudo pip install keras
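To check that the installation worked, you can print the installed version (a quick sanity check; importing Keras should also print which backend it is using):

import keras
print(keras.__version__)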
Good! Now we can implement our network with Keras:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD

# Network definition: a single dense layer followed by a softmax
model = Sequential()
model.add(Dense(input_dim=784, output_dim=10))
model.add(Activation("softmax"))

# Plain SGD with learning rate 0.5, matching the TensorFlow example
sgd = SGD(lr=0.5, momentum=0.0, decay=0.0, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

# Train for one epoch with batches of 100 images
model.fit(mnist.train.images, mnist.train.labels, nb_epoch=1, batch_size=100)

# Evaluation
loss_and_metrics = model.evaluate(mnist.test.images, mnist.test.labels)
print(loss_and_metrics[1])  # Accuracy
Here everything fits in fewer than 10 lines. Our network definition proper is only 3 lines of code, and you can even choose to run it on Theano or TensorFlow just by changing a setting in the Keras configuration file.
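For reference, that switch lives in Keras’s JSON configuration file, which on my machine sits at ~/.keras/keras.json (the exact set of fields varies between Keras versions, so treat this as a sketch):

{
    "backend": "theano",
    "floatx": "float32",
    "epsilon": 1e-07
}

Changing "backend" to "tensorflow" (or back) is all it takes; the model code above stays exactly the same. You should also be able to override it for a single run with the KERAS_BACKEND environment variable.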