TensorFlow for Deep Learning
DNNs are today's buzzword in the AI community, and many recent data science and Kaggle competitions have been won by entrants using them. While the concept of DNNs goes back to Rosenblatt's perceptron (1962), and they were made trainable in practice by the backpropagation algorithm published in 1986 by Rumelhart, Hinton, and Williams, it is only recently that DNNs have become the favourite of AI/ML enthusiasts and engineers the world over.
The main reason for this is the availability of modern computing power in the form of GPUs, together with tools like TensorFlow that make it easy to access GPUs and construct complex neural networks in just a few lines of code.
As a machine learning enthusiast, you must already be familiar with the concepts of neural networks and deep learning, but for the sake of completeness, we will introduce the basics here and explore what features of TensorFlow make it a popular choice for deep learning.
Neural networks are a biologically inspired model of computation and learning. Like a biological neuron, an artificial neuron receives weighted inputs from other cells (neurons or the environment); a processing element transforms this weighted sum into an output, which can be binary (fire or not fire) or continuous (a probability or a prediction). Artificial Neural Networks (ANNs) are networks of such neurons, which can be randomly distributed or arranged in a layered structure. These neurons learn by adjusting the weights and biases associated with them.
The following figure illustrates the similarity between a biological neural network and an artificial neural network:
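To make the idea of a weighted sum followed by an activation concrete, here is a minimal sketch of a single artificial neuron, assuming the TensorFlow 1.x graph API (consistent with the tf.contrib.learn API mentioned later in this section); the input shape, initializer, and sample values are illustrative only:

```python
import tensorflow as tf

# A single neuron: output = activation(x . W + b)
x = tf.placeholder(tf.float32, shape=[1, 3], name='inputs')  # three inputs
W = tf.Variable(tf.random_normal([3, 1]), name='weights')    # one weight per input
b = tf.Variable(tf.zeros([1]), name='bias')

# The sigmoid squashes the weighted sum into (0, 1),
# a continuous analogue of "fire or not fire"
output = tf.sigmoid(tf.matmul(x, W) + b)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(output, feed_dict={x: [[0.5, -1.2, 3.0]]}))
```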
Deep learning, as defined by Hinton et al. (https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf), consists of computational models composed of multiple processing layers (hidden layers). Increasing the number of layers increases the learning time, and the large datasets that are the norm for present-day Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) increase it further. Thus, to implement DNNs in practice, we require high computational power. GPUs from NVIDIA® made this feasible, TensorFlow from Google made it possible to implement complex DNN architectures without delving into the complex mathematical details, and the availability of large datasets provided the necessary fuel. TensorFlow is the most popular library for deep learning for the following reasons:
- TensorFlow is a powerful library for performing large-scale numerical computations such as matrix multiplication and automatic differentiation. These two computations are necessary to implement and train DNNs (see the first sketch after this list).
- TensorFlow uses C/C++ at the backend, which makes it computationally fast.
- TensorFlow has a high-level machine learning API (tf.contrib.learn) that makes it easier to configure, train, and evaluate a large number of machine learning models.
- One can use Keras, a high-level deep learning library, on top of TensorFlow. Keras is very user-friendly and allows easy and fast prototyping. It supports various DNNs like RNNs, CNNs, and even a combination of the two (see the second sketch after this list).
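The first point can be illustrated with a short sketch, again assuming the TensorFlow 1.x graph API; tf.matmul performs the matrix multiplication at the heart of every fully connected layer, and tf.gradients performs the automatic differentiation that training relies on:

```python
import tensorflow as tf

# Matrix multiplication: the core operation of a fully connected layer
A = tf.constant([[1.0, 2.0], [3.0, 4.0]])
B = tf.constant([[5.0, 6.0], [7.0, 8.0]])
product = tf.matmul(A, B)

# Automatic differentiation: gradient of y = v^2 with respect to v
v = tf.Variable(3.0)
y = tf.square(v)
grad = tf.gradients(y, v)  # dy/dv = 2v

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(product))  # [[19. 22.] [43. 50.]]
    print(sess.run(grad))     # [6.0]
```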
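To illustrate the last point, here is a minimal sketch of a Keras model on top of TensorFlow; the layer sizes, the 784-dimensional input (MNIST-like), and the optimizer choice are illustrative assumptions, not prescriptions:

```python
from keras.models import Sequential
from keras.layers import Dense

# A small feed-forward network for 10-class classification
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='sgd',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.summary()
# Training is then a single call, given NumPy arrays X_train and y_train:
# model.fit(X_train, y_train, epochs=5, batch_size=32)
```

Defining, compiling, and training such a model would take considerably more code in raw TensorFlow; this conciseness is exactly what makes Keras so well suited to fast prototyping.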