Introduction to TensorFlow, Part 4

If you are new to TensorFlow, we recommend starting this series from Part 1.

This is Part 4 of the TensorFlow series. The goal of every article is to help you understand one of the most popular deep learning libraries out there. This series is aimed entirely at beginners who are either starting to learn deep learning and want to build their own neural nets, or want to build state-of-the-art neural networks in TensorFlow. The series specifically covers the basics of TensorFlow for Python; there is another version, TensorFlow.js, for JavaScript developers, but that will be covered in a future series. In this part we will build a CNN for the MNIST dataset using TensorFlow.

After reading this article you will

  1. be able to build a CNN using Keras with TensorFlow as the backend
  2. learn about the Keras API for building CNNs on top of TensorFlow

Before moving further, readers are encouraged to brush up on their CNN basics by clicking here.

A quick introduction to Keras

Keras is a high-level neural network API that can be used with either Theano or TensorFlow as its backend. By "backend" I am referring to the library whose methods Keras calls to execute the actual computation (see the figure for an illustration).

 

Keras was first built as a high-level API for Theano only, but after TensorFlow became open source, Keras started supporting it too, and now most developers use TensorFlow as their Keras backend. You must be wondering why you would use a layer on top of TensorFlow. The answer is simple:

  1. Raw TensorFlow code tends to grow large as the network size grows.
  2. TensorFlow lets you control every minute detail of a neural net, which is often not required.
  3. With Keras, the size of the code is significantly reduced.
Importing libraries and the dataset

The MNIST dataset is already available in the Keras framework; all you have to do is import it.


import keras
from keras.datasets import mnist                              # MNIST ships with Keras
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten
from keras import backend as K

For simplicity, we will not be using dropout.

Defining model parameters

This is the same as we did for the MLP, where we defined the number of neurons in the hidden layer, the number of output classes, and so on. The same procedure has to be followed for the CNN.


size_of_batch = 256        # samples per gradient update
output_layer = 10          # one output class per digit (0-9)
epochs = 12                # full passes over the training data
height, width = 28, 28     # MNIST image dimensions

At this point I am assuming that you are already familiar with the terms batch size and epoch; if not, click here.
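As a quick sanity check on these terms, here is a small sketch (assuming the standard MNIST training set size of 60,000 samples) of how batch size and epochs determine the number of gradient updates:

```python
import math

training_samples = 60000   # size of the MNIST training set
size_of_batch = 256
epochs = 12

# One weight update happens per batch; the last batch may be smaller,
# so we round up.
updates_per_epoch = math.ceil(training_samples / size_of_batch)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch)   # 235
print(total_updates)       # 2820
```

So with these parameters the network's weights are adjusted 2,820 times in total.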

Loading data

MNIST can be loaded with load_data, which returns two tuples: the training data and the test data.


(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
# Add an explicit single-channel dimension: (samples, height, width, 1)
X_train = X_train.reshape(X_train.shape[0], height, width, 1)
X_test = X_test.reshape(X_test.shape[0], height, width, 1)
size_of_input = (height, width, 1)
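The reshape matters because Conv2D expects a channel dimension, which grayscale MNIST images lack. A minimal sketch with a NumPy array standing in for the real data:

```python
import numpy as np

# Stand-in for mnist.load_data(): 5 grayscale 28x28 images
fake_images = np.zeros((5, 28, 28))

height, width = 28, 28
# Append a single-channel axis so the shape is (samples, height, width, channels)
reshaped = fake_images.reshape(fake_images.shape[0], height, width, 1)

print(fake_images.shape)   # (5, 28, 28)
print(reshaped.shape)      # (5, 28, 28, 1)
```

A color image would have 3 channels in that last position; MNIST is grayscale, so it gets 1.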

Converting classes into binary vectors

The output labels, i.e. Y_train and Y_test, are not yet in the form of binary one-hot encoded vectors, so we convert them.


Y_train = keras.utils.to_categorical(Y_train, output_layer)
Y_test = keras.utils.to_categorical(Y_test, output_layer)
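To see what this conversion does, here is a small NumPy sketch; one_hot is a hypothetical stand-in written for illustration, not the actual Keras implementation of to_categorical:

```python
import numpy as np

def one_hot(labels, num_classes):
    # Hypothetical stand-in for keras.utils.to_categorical:
    # each label becomes a vector with a single 1 at its class index.
    vectors = np.zeros((len(labels), num_classes))
    vectors[np.arange(len(labels)), labels] = 1
    return vectors

labels = np.array([3, 0, 9])       # three digit labels
encoded = one_hot(labels, 10)

print(encoded.shape)   # (3, 10)
print(encoded[0])      # digit 3 -> [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```

This format is what the softmax output layer and the categorical cross-entropy loss expect.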

Defining Model

This is the essential part of this article. The model-building procedure is quite simple, but before the explanation let's first take a look at the code itself.


model = Sequential()  # layers are stacked one after another; don't confuse this with sequence models
model.add(Conv2D(10, kernel_size=(3, 3), activation='relu', input_shape=size_of_input))  # convolution layer with 10 3x3 kernels
model.add(Conv2D(32, (3, 3), activation='relu'))  # convolution layer with 32 3x3 kernels
model.add(MaxPooling2D(pool_size=(2, 2)))  # max-pooling layer with a 2x2 window
model.add(Flatten())  # flatten the max-pool output so it can feed a dense, fully connected layer
model.add(Dense(20, activation='relu'))  # fully connected layer, just like a hidden layer in an MLP
model.add(Dense(output_layer, activation='softmax'))  # output layer

For an explanation, read the comments.
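It helps to trace the tensor shapes through these layers by hand. Assuming the Keras defaults (no padding, stride 1), each 3x3 convolution shrinks the image by 2 pixels per side and the 2x2 max-pool halves it; the sketch below also counts the trainable parameters per layer:

```python
# Tracing the tensor shapes through the model above (no padding, stride 1)
def conv_out(size, kernel):
    # A 'valid' convolution shrinks each side by (kernel - 1)
    return size - kernel + 1

side = 28
side = conv_out(side, 3)        # after Conv2D(10, 3x3) -> 26
side = conv_out(side, 3)        # after Conv2D(32, 3x3) -> 24
side = side // 2                # after MaxPooling2D(2x2) -> 12
flattened = side * side * 32    # Flatten -> 4608 features

# Trainable parameters: filters * (kernel_area * input_channels) + biases
conv1 = 10 * (3 * 3 * 1) + 10   # 100
conv2 = 32 * (3 * 3 * 10) + 32  # 2912
dense = flattened * 20 + 20     # 92180
out   = 20 * 10 + 10            # 210

print(flattened)                     # 4608
print(conv1 + conv2 + dense + out)   # 95402
```

You can verify these numbers against the output of model.summary(); notice that the dense layer holds the vast majority of the weights.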

Compiling the model

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(),
              metrics=['accuracy'])

The syntax is self-explanatory: we compile the model with categorical cross-entropy (i.e. log loss) as the loss function, stochastic gradient descent as the optimizer, and accuracy as the metric.
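To make the loss concrete, here is a NumPy sketch of categorical cross-entropy (written for illustration, not the actual Keras implementation): it is the average negative log-probability the model assigns to the true class, so confident correct predictions cost less than unsure ones.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred):
    # Mean negative log-likelihood of the true class.
    eps = 1e-7                          # avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

y_true = np.array([[0, 1, 0]])              # true class is index 1
confident = np.array([[0.05, 0.9, 0.05]])   # softmax output, mostly correct
unsure = np.array([[0.3, 0.4, 0.3]])        # correct, but barely

# The confident prediction incurs the smaller loss
print(categorical_crossentropy(y_true, confident) <
      categorical_crossentropy(y_true, unsure))   # True
```

SGD then nudges the weights in the direction that reduces this quantity, one batch at a time.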

Fitting the model over data

This is the last step: we fit the model on the MNIST data.


model.fit(X_train, Y_train, batch_size=size_of_batch, epochs=epochs, verbose=0, validation_data=(X_test, Y_test))

If you want progress output during training, set verbose=1.

Calculating the performance of the model on test data

score = model.evaluate(X_test, Y_test, verbose=0)  # returns [loss, accuracy]
print('Test accuracy:', score[1])
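For clarity on what that accuracy number means: a prediction counts as correct when the class with the highest softmax probability matches the one-hot label. A minimal NumPy sketch with made-up predictions:

```python
import numpy as np

# Toy one-hot labels and softmax outputs for 3 samples, 2 classes;
# the last prediction picks the wrong class.
y_true = np.array([[1, 0], [0, 1], [0, 1]])
y_pred = np.array([[0.8, 0.2], [0.4, 0.6], [0.7, 0.3]])

# Accuracy = fraction of samples where argmax(prediction) == argmax(label)
accuracy = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))
print(accuracy)   # 2 of 3 correct -> 0.666...
```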


Congrats, you have successfully built a convolutional neural network using the Keras API with TensorFlow as the backend. The convolutional neural network is a very important architecture, and knowing how to build one gives you a great advantage over other machine learning engineers. This was the last part of this series; we will be covering many more articles on machine learning in the future, so stay connected.
