Some motivation for why you should study deep learning: neural networks are not a recent invention. In fact, the Perceptron, one of the simplest artificial neuron models and one highly inspired by the biological neuron, was invented in 1957 by a psychologist, Frank Rosenblatt. Later came the idea of combining many such models (neurons) to create a network of neurons; this too was inspired by the design of biological neural networks, and the result was named the artificial neural network. In the 1980s, Geoffrey Hinton and his team successfully developed an algorithm named backpropagation, based on the chain rule of differentiation, to train these neural networks. People started believing that in roughly ten years we would have computers that could think like humans, but it all went wrong, and after the 1990s no one looked much into neural networks until 2012, when Geoffrey Hinton and his team won the ImageNet competition by a huge margin using a specific deep neural network architecture called a convolutional neural network (CNN, or ConvNet). After this, everyone started taking neural networks and deep learning (a subarea of machine learning) seriously.
Why didn't neural networks perform well previously, and why are they performing well now?
- There was not enough data to feed into these architectures: neural networks require large amounts of training data, often gigabytes or even terabytes, depending on the type of problem.
- There was very little computing power to actually run these architectures: a neural network is essentially a series of matrix multiplications, which can be really computationally expensive when the matrices are large, and we all know what the hardware was like at that time.
Why do neural networks work better than traditional machine learning algorithms?
This is an important question that must be answered, because if we don't know why something is better, we cannot harness all of its power.
Let's attack this question with a series of examples.
The above data points lie roughly along y = x. If we apply a linear regression model to approximate this data, we get a line that best fits it.
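As a quick sketch of this step, here is a hypothetical dataset lying roughly along y = x (the exact data from the figure is not available, so the noise level and range are assumptions) fitted with a least-squares line:

```python
import numpy as np

# Hypothetical data that lies roughly along y = x (small noise added).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = x + rng.normal(0, 0.1, size=x.shape)

# Fit a degree-1 polynomial (a straight line) by least squares.
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)  # slope close to 1, intercept close to 0
```

Because the underlying relationship really is a straight line, the fitted slope comes out near 1 and the intercept near 0.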
There are some key observations to note from the above:
- The function that best approximates this data is linear, i.e. a straight line.
- The data is not random.
- The best-fit function above is not a composite function.
Observations 1 and 3 are the most important and require further discussion.
Now let's consider another example.
The above graph is of sin(x^2) + log(2x). If we apply a linear regression model to it, the best-fit line will look something like this:
This is the best fit we can get using a linear regression model, but it gives predictions with high error. The question is: why?
- The above function is not a linear function.
- The above function is a combination of two functions, sin(x^2) and log(2x).
- Linear regression can only map linear functions unless we explicitly code polynomial features.
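We can see this failure numerically. Below is a small sketch (the x-range is an assumption, chosen so that log(2x) is defined) that fits a straight line to sin(x^2) + log(2x) and measures its mean squared error:

```python
import numpy as np

# Target function from the text: sin(x^2) + log(2x).
def f(x):
    return np.sin(x**2) + np.log(2 * x)

x = np.linspace(0.5, 5, 200)  # x > 0 so that log(2x) is defined
y = f(x)

# Best-fit straight line by least squares.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# The error stays large: a line cannot follow the sin(x^2) oscillations.
mse = np.mean((y - y_hat) ** 2)
print(mse)
```

However well we choose the slope and intercept, the oscillating sin(x^2) term leaves a large residual error, which is exactly the "high error" described above.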
Now let's build an architecture for the above function.
In the above architecture, one node calculates x^2, the next calculates sin(x^2), and so on until the complete function is computed. We can clearly see that this architecture resembles a neural network architecture. So can we say that neural networks are universal function approximators? Why "approximators"? Because in real life we don't know the exact function that represents the data; all we do is train our network to find the best function that approximates it.
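The node-by-node idea above can be sketched directly in code: each "node" is a simple function, and the full target function is just their composition (the node names here are hypothetical labels for illustration):

```python
import math

# Each node applies one simple operation.
square = lambda x: x ** 2       # node 1: x -> x^2
sine = math.sin                 # node 2: u -> sin(u)
double = lambda x: 2 * x        # node 3: x -> 2x
log = math.log                  # node 4: v -> log(v)

def network(x):
    # Composing the nodes gives sin(x^2) + log(2x),
    # just like wiring nodes together in the architecture.
    return sine(square(x)) + log(double(x))

print(network(1.0))  # equals sin(1) + log(2)
```

This is the essence of the universal-approximation intuition: a network of simple units, composed together, can represent a complicated function that no single linear model can.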
Let's formalize our discussion. Consider this sample neural network architecture:
It has an input layer, one hidden layer, and an output layer, so the output of this neural network can be expressed as the function O(H(I(X))),
which is a composite function, where
- I: input layer
- H: hidden layer
- O: output layer
- X's: inputs to the input layer
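To make the composite-function view concrete, here is a minimal forward pass for such an architecture. The layer sizes, random weights, and tanh activation are assumptions for illustration, not details from the original figure:

```python
import numpy as np

# Hypothetical weights: 3 inputs -> 4 hidden units -> 1 output.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 4))  # input layer -> hidden layer
W2 = rng.normal(size=(4, 1))  # hidden layer -> output layer

def hidden(x):
    # H: hidden layer, a linear map followed by a non-linearity
    return np.tanh(x @ W1)

def output(h):
    # O: output layer
    return h @ W2

def network(x):
    # The network's output is the composite function O(H(x)).
    return output(hidden(x))

x = np.array([1.0, 2.0, 3.0])  # the X's fed into the input layer
print(network(x).shape)  # (1,)
```

Training the network means adjusting W1 and W2 so that this composite function approximates the data, which is what backpropagation (covered next) does.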
We will cover backpropagation in the next article, so stay tuned.