If you have looked into machine learning in Python, you have probably come across the term “TensorFlow”, which naturally leads to the question, “What is TensorFlow?”. At a very high level, it is a library that simplifies mathematical computations. It has many applications, though, and in this article I will try to explain what TensorFlow is and how it is used in machine learning. I will also cover some of the concepts and jargon associated with it.
What is TensorFlow?
TensorFlow is, first of all, an open source mathematical library. Google created it as a standard platform on which developers, students, scientists and researchers can easily share their findings and what they have built.
But calling it just a mathematical library is an understatement. It is an extremely powerful library that can run on both CPUs and GPUs, and it is used to design, train and run deep neural networks. Before we proceed, you can watch this intro video by Google.
TensorFlow works by creating a graph. A graph has a set of nodes and edges that run between the nodes. The image below shows an example graph in TensorFlow. Let’s not worry about what it represents for now. What you need to know is that the ovals and squares are nodes and the arrows are edges. As you can see from the graph, you can group nodes to form various levels of mathematical computation.
In every TensorFlow graph, each node represents some mathematical function. That is, any given node in a graph takes in some data in a specific format and gives an output. The edges in the graph represent the data itself, which is passed (or flows) to a particular node. This is the reason TensorFlow is also called a “Flow Graph Library”.
The edges of TensorFlow graphs are always multidimensional data arrays, which are called tensors. Every node in TensorFlow accepts one or more tensors, applies its mathematical function to them, and passes the output (which is also a tensor) to the next node. Put very simply, a tensor is a multidimensional array of numbers.
NumPy is another library that Python developers use for mathematical computations. If you know NumPy, you can think of a tensor as being much like a NumPy array: both are multidimensional arrays of numbers.
Categorical and Continuous Data
There are two major types of data that you might have to work with. Consider the following example of some employee data.
From the table above, we can see that columns like Gender and Nationality take their values from a fixed, finite set. The values for Gender can only be “Male”, “Female” or “Others”. The values for Nationality can be any one of the countries in the world, like “India”, “USA”, “Canada”, etc. Such types of data are called categorical data or discrete data. For categorical data, the values cannot be outside the fixed set.
However, columns like Age and Salary do not have a fixed set of values. Age can take values like 34.5 as well (even if such values are rarely used). The same goes for Salary, which can be any number between 1 and infinity (hypothetically, I know!). We call this type of data continuous data. Both types of data are needed when performing machine learning tasks.
Let us say that you would like to predict the salary of a person given their age and nationality. As you can see, you will need both types of data (categorical and continuous) to make the prediction. TensorFlow provides different methods for handling each type within the same task.
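To make the difference concrete, here is a minimal sketch in plain Python of how the two kinds of columns are usually turned into numbers before a model sees them. It does not use TensorFlow's own feature-handling API. The records, the `NATIONALITIES` list and the helper names `one_hot` and `to_features` are all made up for illustration. The continuous value is passed through as a number, while the categorical value is one-hot encoded against its fixed set.

```python
# Hypothetical employee records; the values are invented for illustration.
employees = [
    {"age": 34.5, "nationality": "India"},
    {"age": 28.0, "nationality": "USA"},
    {"age": 41.0, "nationality": "Canada"},
]

# Categorical column: the fixed, finite set of allowed values (assumed here).
NATIONALITIES = ["Canada", "India", "USA"]


def one_hot(value, vocabulary):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    if value not in vocabulary:
        # Categorical values cannot be outside the fixed set.
        raise ValueError(f"{value!r} is outside the fixed set {vocabulary}")
    return [1.0 if value == item else 0.0 for item in vocabulary]


def to_features(record):
    """Continuous values stay as numbers; categorical ones are one-hot encoded."""
    return [float(record["age"])] + one_hot(record["nationality"], NATIONALITIES)


features = [to_features(record) for record in employees]
for row in features:
    print(row)
```

Each row ends up as a plain list of numbers, which is the kind of input a tensor can hold. A value outside the fixed set, such as a nationality that is not in the list, raises an error instead of being silently encoded.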
Every tensor has a number of dimensions, and this number is referred to as the rank of the tensor. So, when we say that a tensor is a multidimensional array, it might look like the examples below.
3 # this is a rank 0 tensor. Just a number. It is a scalar with shape []
[1., 2., 3.] # this one is a rank 1 tensor; this is a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]] # a rank 2 tensor; a matrix with shape [2, 3]
[[[1., 2., 3.]], [[7., 8., 9.]]] # a rank 3 tensor with shape [2, 1, 3]
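Since a tensor behaves much like a NumPy array, the ranks and shapes listed above can be checked with NumPy. This is only a sketch, assuming NumPy is installed. It uses NumPy's `ndim` and `shape` attributes as stand-ins for a tensor's rank and shape rather than calling TensorFlow itself, and the variable names are illustrative.

```python
import numpy as np

# The four example values from the text, from rank 0 up to rank 3.
examples = [
    3,
    [1., 2., 3.],
    [[1., 2., 3.], [4., 5., 6.]],
    [[[1., 2., 3.]], [[7., 8., 9.]]],
]

ranks_and_shapes = []
for value in examples:
    array = np.array(value)
    # ndim is the number of dimensions (the rank); shape is the size along each one.
    ranks_and_shapes.append((array.ndim, array.shape))
    print("rank", array.ndim, "shape", array.shape)
```

Note that NumPy reports shapes as tuples, so the scalar has the empty shape `()` and the vector has shape `(3,)`.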
TensorFlow graphs are also called computational graphs. A computational graph is nothing but a set of connected nodes, each of which performs a specific operation.
In terms of this graph, TensorFlow programs consist of two sections.
- Graph creation – This involves specifying what each node should do.
- Graph execution – This involves running the graph in a session.
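The split between the two sections can be sketched with a toy graph in plain Python. This is not TensorFlow's API: `Node`, `constant` and `run` are hypothetical names invented for this illustration, and the `run` function only plays the role that a session plays in TensorFlow. The point is that building the nodes computes nothing, and values flow along the edges only when the graph is executed.

```python
class Node:
    """One node of a toy graph: an operation plus the nodes that feed it."""

    def __init__(self, operation, *inputs):
        self.operation = operation
        self.inputs = inputs


def constant(value):
    """A node with no inputs that always outputs `value`."""
    return Node(lambda: value)


def run(node):
    """Execute the graph: evaluate the inputs first, then apply the operation."""
    input_values = [run(parent) for parent in node.inputs]
    return node.operation(*input_values)


# 1. Graph creation: nothing is computed here, we only describe the nodes.
a = constant([1.0, 2.0, 3.0])
b = constant([4.0, 5.0, 6.0])
total = Node(lambda x, y: [p + q for p, q in zip(x, y)], a, b)
doubled = Node(lambda x: [2 * p for p in x], total)

# 2. Graph execution: data flows along the edges from node to node.
result = run(doubled)
print(result)
```

Here `a` and `b` feed `total`, which in turn feeds `doubled`, so asking for `doubled` pulls the data through the whole graph. Asking for `total` instead would execute only the part of the graph it depends on.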
TensorFlow and scikit-learn
The scikit-learn library is a very popular machine learning library for Python. It implements a large set of machine learning algorithms, and most Python developers have at least heard of it, if not worked with it. Some of you might be wondering whether TensorFlow is similar to scikit-learn. The answer is “it is similar, but only in some ways”. TensorFlow can do almost all the things that scikit-learn does. However, there are a lot of things that TensorFlow can do but scikit-learn cannot.
TensorFlow is used extensively for deep learning, and that is the main reason it has gained popularity.
How to get started with TensorFlow
- Install TensorFlow with Python support. If you have an NVIDIA GPU, make sure you install TensorFlow with GPU support as well.
- You can try out TensorFlow’s Basic Usage guide.
I hope that I have answered your question “What is TensorFlow?” well enough that you are now more confident about its concepts. If you find the resources mentioned above difficult, I recommend subscribing to my blog by providing your email below. You will receive simplified TensorFlow tutorials for performing machine learning tasks straight to your mailbox.