A Deep Look @ Deep Learning — Part 01
Deep learning is a booming topic these days. Everyone talks about it, and everyone tries to apply it to various problems. But what is it? To understand deep learning, we first need to look back at artificial intelligence.
Artificial intelligence can be introduced as an attempt to artificially build something like the natural intelligence of the human brain. The truth is that the human brain is so powerful that no one has yet fully understood the concepts behind it, let alone implemented them. The dream of making a machine that can do all the thinking a human can is still far away. This is called Artificial General Intelligence. We need not take precautions against robots coming to take over the world in the near future, because the technology is not that mature.
However, there are smart machines that perform specific tasks better than a human can. This intelligence is called Artificial Narrow Intelligence. There are various ways to achieve this type of intelligence.
Let’s take a famous example: the game of Tic-tac-toe. Yes, a human can play it well, but it is chosen here for its simplicity.
We can write a program that plays this game. Since we know all the positions and the permutation space is small, every move the program should play can be explicitly configured.
Another way is to introduce an algorithm such as Minimax: a set of instructions for choosing the next move based on the current position. There are still other ways, such as looking at real moves played by humans and training the program to play like that, or letting the program play without any prior experience and learn eventually from its successes and failures.
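As a rough illustration (not part of the original game program; the function names and board encoding here are our own assumptions), a minimal Minimax player for Tic-tac-toe might look like this:

```python
# A minimal Minimax sketch for Tic-tac-toe.
# The board is a tuple of 9 cells, each 'X', 'O', or None.

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return 'X' or 'O' if someone has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (best score, best move) for `player`.
    'X' maximizes the score, 'O' minimizes it."""
    w = winner(board)
    if w == 'X':
        return 1, None
    if w == 'O':
        return -1, None
    moves = [i for i, cell in enumerate(board) if cell is None]
    if not moves:
        return 0, None  # board full: a draw
    results = []
    for m in moves:
        nxt = board[:m] + (player,) + board[m + 1:]
        score, _ = minimax(nxt, 'O' if player == 'X' else 'X')
        results.append((score, m))
    return max(results) if player == 'X' else min(results)
```

Running `minimax` from an empty board yields a score of 0, reflecting the well-known fact that perfect Tic-tac-toe play always ends in a draw.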
Such programs become intelligent because their performance at the task improves with the experience they gain. This is where the field of Machine Learning comes into the picture. According to the definition quoted on Wikipedia,
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”
These kinds of programs use machine learning algorithms such as decision trees, support vector machines, etc., to perform the task. They are not explicitly instructed; instead, they have a model that can be trained on real data. This is how the machine is given experience. For example, the Tic-tac-toe program can be replaced with a machine learning model trained on real human moves.
These programs can also be modeled on how the human brain would do it. The special model architecture inspired by the human brain is called an artificial neural network, which is a branch of machine learning.
Deep learning is also connected to artificial neural networks. Hence, we will first try to understand artificial neural networks before talking about deep learning.
Artificial neural networks
We have evidence that a few scientists from various fields discussed the possibility of building a human brain in the 1940s and 1950s. People might have dreamed of making a machine that mimics the human brain even before that. However, the area was introduced as an academic field in 1956.
Research in neurology has found that the human brain is essentially an electrical network of neurons. Each neuron either generates a pulse when its input voltage exceeds a threshold or does not fire at all. The input voltage is supplied by the output pulses of the other neurons connected to it. This is called the all-or-none law.
Inspired by these results, scientists of that era could build machines to imitate the human brain; these are what we call artificial neural networks today. Now we have software that simulates neural networks instead of the analog circuits built in the old days.
Assembling the building blocks
The building block of these networks is called the neuron. We’ll have a look at a single neuron, shown in the figure below.
The neuron is nothing but an implementation of the all-or-none law. It sums n weighted inputs; if the sum is greater than a threshold value, an output is generated, which is called the activation of the neuron. Otherwise, no output is produced.
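This weighted-sum-and-threshold behavior can be sketched in a few lines of code (a hypothetical illustration; the function name and the weight values are our own, not from the article):

```python
def neuron(inputs, weights, threshold):
    """A threshold neuron: fires (outputs 1) only when the
    weighted sum of its inputs exceeds the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0

# An AND gate built from a single neuron: it fires only when both inputs are 1,
# because only then does the weighted sum (0.6 + 0.6 = 1.2) exceed 1.0.
print(neuron([1, 1], [0.6, 0.6], threshold=1.0))  # 1
print(neuron([1, 0], [0.6, 0.6], threshold=1.0))  # 0
```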
A neural network can be formed by connecting neurons in layers. The output terminals of all the neurons in one layer are connected as inputs of each neuron in the next layer as shown in the following figure.
The initial layer, or layer 0, is called the input layer. The final layer is called the output layer. The middle layers are called hidden layers, and there may be any number of them.
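Connecting such neurons layer by layer gives a forward pass like the following sketch (the layer shapes, weights, and shared threshold are illustrative assumptions, not taken from the article):

```python
def forward(layer_inputs, layers, threshold=0.5):
    """Propagate inputs through a list of layers.
    Each layer is a list of neurons; each neuron is a list of weights,
    one weight per output of the previous layer."""
    outputs = layer_inputs
    for layer in layers:
        outputs = [
            1 if sum(x * w for x, w in zip(outputs, neuron_weights)) > threshold else 0
            for neuron_weights in layer
        ]
    return outputs

# 2 inputs -> hidden layer of 2 neurons -> output layer of 1 neuron
layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # hidden layer: each neuron passes one input through
    [[0.4, 0.4]],              # output neuron: fires only if both hidden neurons fire
]
print(forward([1, 1], layers))  # [1]
print(forward([1, 0], layers))  # [0]
```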
The basic idea behind neural networks
A single neuron calculates its output based on the outputs of the neurons in the previous layer. This means the activation of a single neuron contributes to the activation of the next neurons along the path.
By adjusting the weights inside the neurons, their activations can be controlled. In this way, a vast number of behaviors can be produced by combining many neurons with differently adjusted weights. In effect, the whole network computes an output based on the input values. Hence the whole network is a function that takes an input and produces an output, and this function can be changed as desired by adjusting the weights of the neurons.
When we want to use a neural network for a particular task, we first need to adjust the network. This is called training the network. We will discuss these concepts in detail later.
Time for deep learning
Deep learning is nothing but training large neural networks to perform various tasks. The specialty is that these networks contain more than one hidden layer.
As we discussed, the training process is simply adjusting the weights as required. Calculus is used to perform this task, which requires computational power. When the number of layers increases, the number of weights also increases, requiring even more computational power.
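As a toy illustration of the calculus involved (our own sketch, not the article's method), gradient descent repeatedly nudges a weight in the direction that reduces the error:

```python
# Fit a single weight w so that w * x approximates y, using gradient descent.
# Loss: mean squared error over the data; its derivative with respect to w is
# 2 * mean(x * (w * x - y)).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x, so the ideal w is 2
w = 0.0
learning_rate = 0.05
for _ in range(200):
    grad = 2 * sum(x * (w * x - y) for x, y in data) / len(data)
    w -= learning_rate * grad  # step downhill on the loss surface
print(round(w, 3))  # converges close to 2.0
```

A real network repeats this same idea for every weight in every layer, which is why the computational cost grows so quickly with depth.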
This is why the earlier neural networks couldn’t have more than one hidden layer. Today, however, computational power is not a big concern: we have powerful CPUs, GPUs, and even TPUs in the cloud. Hence, deep neural networks can be built to achieve results beyond what traditional machine learning approaches could ever do, because more advanced functions can be represented when more layers are involved. But we should remember that the number of examples required to train such a network is huge.
Now we know what deep learning is all about. It is time to explore the topic more deeply. The mathematics behind the scenes should be explored to gain a better understanding.
In future stories, we will look at the linear algebra used to model deep neural networks, the steps taken to create networks, the programming languages and frameworks used, and best practices. We will also build a Tic-tac-toe-playing deep neural network to get some hands-on experience along the journey.
Please use the comment section to discuss the topic, ask questions, and give feedback. Don’t forget to give a clap if you enjoyed the article. Happy reading!