Train A One Layer Feed Forward Neural Network in TensorFlow With ReLU Activation, Softmax Cross Entropy with Logits, and the Gradient Descent Optimizer
To train our model, we need to tell it what the correct answers are, and we do that by feeding in the true labels alongside the inputs.
# Code up to now:
#
# create-simple-feedforward-network.py
#
# to run
# python create-simple-feedforward-network.py
#
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())
b = tf.get_variable("bias", shape=[10],
                    initializer=tf.constant_initializer(0.1))
y = tf.nn.relu(tf.matmul(x, W) + b)
We're going to need a second placeholder, this time for the labels.
y_ = tf.placeholder(tf.float32, [None, 10])
The placeholder is how we will be able to input data into our TensorFlow graph.
However, you'll notice here that instead of having a 784-dimensional vector, we have a 10-dimensional vector.
y_ represents the correct values that we are trying to get our neural network to learn. Each one is a 10-dimensional vector holding the true probability for each of the ten classes, namely the digits 0 through 9.
Because each image shows exactly one digit, every probability is either 1 or 0.
Exactly one class has probability 1 and the rest have 0, which is the one-hot encoding we requested with one_hot=True.
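To make the label format concrete, here is a minimal sketch of one-hot encoding in plain Python. The helper name one_hot is hypothetical and is not part of the lesson's code; the MNIST loader produces these vectors for us when one_hot=True is set.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else.

    Hypothetical helper for illustration only; the lesson relies on
    read_data_sets(..., one_hot=True) to build these vectors.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


# The digit 3 becomes a vector whose only non-zero entry is at index 3.
digit_three = one_hot(3)
print(digit_three)
```

Each row of batch_ys fed into y_ has this shape: ten entries, exactly one of which is 1.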
Once we have defined our predictions and then the true labels, we're going to use cross entropy to compare them and to produce a numerical value of how close our answer is to the correct answer.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
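Because we're feeding in the raw output of the ReLU rather than probabilities, we call tf.nn.softmax_cross_entropy_with_logits, which applies the softmax internally before computing the cross entropy. It returns one loss value per example in the batch.
Next, we have to define how our model trains.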
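As a rough illustration of what a softmax cross entropy on logits computes for a single example, here is a sketch in plain Python. It is not TensorFlow's implementation; the function name softmax_cross_entropy is an assumption made for this example, and the max-subtraction trick is a common way to keep the exponentials from overflowing.

```python
import math


def softmax_cross_entropy(logits, labels):
    """Cross entropy between softmax(logits) and a label distribution.

    Illustrative only: computes -sum(labels * log_softmax(logits)) for
    one example, using the log-sum-exp trick for numerical stability.
    """
    largest = max(logits)
    log_sum_exp = largest + math.log(sum(math.exp(z - largest) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(t * lp for t, lp in zip(labels, log_probs))


# Ten equal logits mean the model is undecided between ten classes,
# so the loss equals log(10) regardless of which class is correct.
uniform_logits = [0.0] * 10
true_label = [0.0] * 10
true_label[3] = 1.0
print(softmax_cross_entropy(uniform_logits, true_label))
```

The loss shrinks toward 0 as the logit for the correct class grows relative to the others, which is what training tries to achieve.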
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)
We use the TensorFlow GradientDescentOptimizer to do that.
There are a number of different optimizers which I would recommend that you read more about.
So what this is saying is: using the TensorFlow GradientDescentOptimizer with a learning rate of 0.001, minimize the tensor that we've defined as our cross entropy.
Our cross entropy here is defined as the cross entropy between the logits y and the labels y_, that is, between the outputs of our model and the true values.
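To show what a gradient descent step does, here is a toy sketch on a one-parameter loss. The quadratic loss and the names loss, grad, and gradient_descent are assumptions chosen for illustration; in the lesson, TensorFlow derives the gradients of the cross entropy with respect to W and b automatically.

```python
def loss(w):
    """Toy loss with its minimum at w = 3 (illustrative, not the lesson's loss)."""
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(w, learning_rate=0.001, steps=50):
    """Apply w <- w - learning_rate * grad(w) repeatedly and record the loss."""
    history = [loss(w)]
    for _ in range(steps):
        w = w - learning_rate * grad(w)
        history.append(loss(w))
    return w, history


# Same learning rate and step count as the lesson's training loop.
w_final, losses = gradient_descent(0.0)
print(f"w after 50 steps: {w_final:.4f}")
print(f"loss went from {losses[0]:.4f} to {losses[-1]:.4f}")
```

With a learning rate as small as 0.001, each step moves the parameter only slightly, so 50 steps reduce the loss but do not reach the minimum.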
Once we have that, we're going to create a TensorFlow session.
sess = tf.InteractiveSession()
Then we're going to initialize all of our variables.
tf.global_variables_initializer().run()
Once we've set up our model, we're going to need to train it.
We're going to train it for 50 steps which we'll handle just using a standard Python for loop.
for step in range(50):
There will be two steps that happen for each iteration of our loop.
First of all, we have to get data from the mnist object that the input_data.read_data_sets() function returned.
We do that by calling the train.next_batch method on that object.
for step in range(50):
    batch_xs, batch_ys = mnist.train.next_batch(100)
What this says is: from the training set, pull a new batch of 100 samples, returned as a pair of images (batch_xs) and their labels (batch_ys).
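The idea of pulling successive mini-batches can be sketched in plain Python as follows. This is not how mnist.train.next_batch is implemented; the BatchSampler class and its reshuffle-when-exhausted behavior are assumptions made purely for illustration.

```python
import random


class BatchSampler:
    """Hypothetical mini-batch sampler, for illustration only.

    Walks through a shuffled ordering of the data and reshuffles
    once there are not enough unseen examples left for a full batch.
    """

    def __init__(self, examples, labels, seed=0):
        if len(examples) != len(labels):
            raise ValueError("examples and labels must have the same length")
        self.examples = examples
        self.labels = labels
        self._rng = random.Random(seed)
        self._order = list(range(len(examples)))
        self._rng.shuffle(self._order)
        self._pos = 0

    def next_batch(self, batch_size):
        if batch_size > len(self._order):
            raise ValueError("batch_size exceeds dataset size")
        # Start a new pass over the data when the current one runs out.
        if self._pos + batch_size > len(self._order):
            self._rng.shuffle(self._order)
            self._pos = 0
        idx = self._order[self._pos:self._pos + batch_size]
        self._pos += batch_size
        # Index examples and labels with the same positions so pairs stay aligned.
        return [self.examples[i] for i in idx], [self.labels[i] for i in idx]


# Toy data: each "label" is just twice its "example".
demo = BatchSampler(list(range(10)), [2 * i for i in range(10)])
demo_xs, demo_ys = demo.next_batch(4)
print(demo_xs, demo_ys)
```

The important property is that each call returns inputs and labels that line up one to one, which is what lets us feed them to x and y_ together.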
And then we run that through our model.
for step in range(50):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
We use a feed_dict there, which maps each placeholder to the data it should hold for this step.
That is the main way in which we input data into a TensorFlow graph.
Just to show that this works, we're going to run the script, and it should return without any errors.
# Command Line
~ > python create-simple-feedforward-network.py
However, that’s pretty boring.
So just to make sure that it's doing something, we'll tell it to print out what step it's on.
for step in range(50):
    print(f"training step: {step}")
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
You'll see at first that it’s quite slow as it’s downloading the data.
# Command Line
~ > python create-simple-feedforward-network.py
But then, it’s quite fast and you can see that it trains successfully.
Full Source Code For Lesson
# create-simple-feedforward-network.py
#
# to run
# python create-simple-feedforward-network.py
#
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())
b = tf.get_variable("bias", shape=[10],
                    initializer=tf.constant_initializer(0.1))
y = tf.nn.relu(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for step in range(50):
    print(f"training step: {step}")
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})