Hey Everyone in my last post I showed how we can write a simple neural network program in python from scratch, just to have a better understanding of how they actually work under the hood. If you haven’t checked it yet please click here to check that out first. Okay so in this post I prepared a complete basics yet well-informed tensorflow tutorial, where we will see how we can use Tensorflow to write the network we did in our previous post. So let’s get started.

*I**nfo: Join This newly created Slack Community for everyone to discuss about ML Computer Vision and Artificial Intelligence *

## Setting up Tensorflow

Tensorflow website has a complete tutorial guide to install tensorflow. which is available for linux, macOSX and Windows

check this link for that **www.tensorflow.org/install/**

In all my previous post I tried to write my posts in a way that it can be done in any OS like Linux Mac or Windows. but unfortunately you might face some issue with windows so I would suggest either * dual-boot to ubuntu *or

*,*

**use Virtual box to create virtual machine and load ubuntu in it****Update: I have created A video to install Tensorflow. and this will work on all the platforms **

## Hello Codacus

To test that we have successfully installed tensorflow lets run this code to see if its working perfectly

```
import tensorflow as tf
hello = tf.constant('Hello, Codacus!')
sess = tf.Session()
print(sess.run(hello))
```

If there is no error, you should get * “Hello, Codacus”* in the output

## TensorFlow Basic Concepts

There are the main concept in tensorflow that we need to understand.

- placeholders
- variables
- operations
- session

these are the three things we will be dealing with when working with tensorflow. lets discuss about these one by one individually

### Placeholders

Placeholders are the terminals/data point through which we will be feeding data into the network we will build. Its like gate point for our input and output data.

lets create some placeholders

```
inputs=tf.placeholder('float',[None,2],name='Input')
targets=tf.placeholder('float',name='Target')
```

So we created two placeholders the data type that will feed to the placeholder is * float *, 2nd parameter is the shape of input. In tensorflow data flows through the network in form of matrix.

so lets say we have 3 input neurons.

then our input data should be a list of 3 values for example [1,2,3]

Now lets say we have 4 sets of inputs for example

```
[1,2,3]
[2,4,3]
[2,8,6]
[5,4,3]
```

and we want to feed all the three inputs in a batch the shape on the input will be **[4,3]**

Now lets say we don’t know how many input set we are going to feed at the same time. it can be 1 it can be 100 sets at a time. So we specify that with * None*.

if we set the input shape of the placeholder as [None,3] then it will be able to take any sets of data in a single shot. that is what we did in out earlier code.

In the second case I didn’t select any shape, in this case it will accept any shape. but there is a chance of getting error in runtime if the data the network is expecting has a different shape than the data we provided in the placeholder.

The third parameter I used is * name * it is not so important for the network to work. but it will be useful to give a name to every node of the network. This will help up to identify the nodes when we will be checking the network diagram in

**Tensor board**### Variables

Variables in tensorflow are different from regular variables, unlike placeholders * we will never set any specific values in these, nor use it for storing any data*. We will be using it to make connections between layers. and

*during training. So these are variable for tensorflow to play with. These are often called trainable parameters.*

**tensorflow itself will use these variables to tune the network**Lets create some Variables

```
weight1=tf.Variable(tf.random_normal(shape=[2,3],stddev=0.02),name="Weight1")
biases1=tf.Variable(tf.random_normal(shape=[3],stddev=0.02),name="Biases1")
```

So we have two variables here. the first parameter in here is the values we want it initialize it with. in this case we initialized the first one as 2×3 matrix with random values with variation of +/- 0.02. for the shape the* first one is the number of input connection *that the layer will receive. and the

*that the layer will produce for the next layer.*

**2nd one is the number of output connection**here is an example

For the bias I only selected 3 as shape because for the first bias neurons there is no input and it has 3 output connections.

### Operations

Operations are the part where data get transferred from one layer to the next layer. using the weight variables we create which is basically the connection weights between neurons of two consecutive layers

In our previous post we calculated the output of each neuron by summing the weighed output of all the previous layer neuron. we were doing that for each neuron in the layer using a for loop. But there is a better way of doing that. for example we have 2 input neuron and 3 hidden neuron. So there will be total 2×3 number of connections in between. we are presenting them in a 2d matrix

If we do matrix multiplication we will get the below results

so these are the same calculation we did in our previous post for each output using a loop to calculate the sum. Now to add the bias we can directly sum the bias with the results

this is the final form of the output with bias.. okay let’s do this in tensorflow. To do this we need to do operations that is matrix multiplications

```
hLayer=tf.matmul(inputs,weight1)
hLayer=hLayer+biases1
```

So here is two matrix operations first one is matrix multiplications of inputs and the weights1 matrix which will produce the output without bias. and in the next line we did matrix addition which basically performed an element by element additions.

Now this is what we call operations in tensorflow. It’s actually flowing the data from one layer.

one last thing is still pending that is applying the activation functions in the layer outputs. so lets apply that

`hLayer=tf.sigmoid(hLayer, name='hActivation')`

Okay so we completed till hidden layer lets create the output layer also and finish the network

```
weight2=tf.Variable(tf.random_normal(shape=[3,1],stddev=0.02),name="Weight2")
biases2=tf.Variable(tf.random_normal(shape=[1],stddev=0.02),name="Biases2")
```

As we can see the output layer has only 1 neuron and the previous hidden layer had 3 neurons. So the weight matrix has 3 input and 1 output thus the shape is [3,1]. and the bias has only one neuron to connect to thus bias shape in [1]

lets create the output layer

```
output=tf.matmul(hLayer,weight2)
output=output+biases2
output=tf.sigmoid(output, name='outActivation')
```

with this our network is complete.

##### Optimization

We still have’t done anything to calculate the error or any optimization methods to train the network

```
#cost=tf.nn.softmax_cross_entropy_with_logits(logits=output,labels=targets)
# update tf.nn.softmax_cross_entropy_with_logits is for classification problems only
# we will be using tf.squared_difference()
cost=tf.squared_difference(targets, output)
cost=tf.reduce_mean(cost)
optimizer=tf.train.AdamOptimizer().minimize(cost)
```

The method * softmax_cross_entropy_with_logits *will take the output of the neuron and target placeholder where we will put out expected outputs. then it will calculate the error for each and every neuron in the layer.

* Update:* I changes the cost function to

*which is similar to the same squared difference, and best suitable for this problem which is not a classification problem,*

**tf.squared_difference()***takes two input fitsr is the target value and second is the predicted value.*

**tf.squared_difference**next we took that error in a variable * cost* and put operated

*function on it and stored it in*

**reduce_mean***variable again. the*

**cost***is taking that array of errors and calculating the average of them and returning a single value for the error.*

**reduce_mean**finally we are creating the optimizer to called * AdamOptimizer *to minimize the error

*It’s basically do the same back propagation using the cost as starting error and propagate that error backwards while fixing the weights a little bit every time. Almost similar that we did in our own neural network in the last post.*

**cost .**### Session

Till now what we did is, we described tensorflow what is the structure of the network and how data will flow. but the network is still not running and no operation has been done yet. So to run the network we need to start a * tensorflow session. * any actual calculation will happen inside a session what ever tensorflow operation we perform will actually happen after we start the session and run it.

So before starting the session lets create the data set first. I am going to make it work as another binary gate XOR gate

of course we can do much complicated stuff but for the sake of simplicity we are going to make it work like an XOR gate. So lets generate the input outputs for this

```
#generating inputs
import numpy as np
inp=[[0,0],[0,1],[1,0],[1,1]]
out=[[0],[1],[1],[0]]
inp=np.array(inp)
out=np.array(out)
```

So before that we need our inputs. and we need it in numpy array format so we converted it to numpy array.

Now lets start the session

```
epochs=4000 # number of time we want to repeat
with tf.Session() as sess:
tf.global_variables_initializer().run()
for i in range(epochs):
error,_ =sess.run([cost,optimizer],feed_dict={inputs: inp,targets:out})
print(i,error)
```

So here we started a session and named it * sess . *After we setup the session the first thing we need to do is we need to initialize all the tensorflow variables that we created using

**tf.global_variables_initializer().run() .**Now we started a for loop to repeat our training. now inside the loop we are using * sess.run() *. The first parameter in this is a list of operations that we want to do. In this case I want to know the error for each training and I also want to Optimize the network. So I put cost and the optimizer variable in the list.

Te second parameter is the * feed_dict *. In this argument we will tell tensorflow from where it will take the data for all its

*. So we have 2*

**placeholders***in our network*

**placeholders***and*

**inputs***we have to tell tensorflow from where it should take its data. We put that information in the*

**targets***parameter.*

**feed_dict**for * inputs *we set

*. and for*

**inp***we set*

**targets***Therefore we set*

**out .**

**feed_dict ={inputs:inp , targets:out}**and now the output of * sess.run()* is the output of the individual operation that we put inside the list. that is the

*and the*

**cost***. the*

**optimizer***returns the average*

**cost***and the*

**error***doesn’t return any value so we set it as*

**optimizer***to ignore it .*

**_**and finally we are printing those values.

## Lets Test the model

We have trained the model now lets test its performance

```
with tf.Session() as sess:
tf.global_variables_initializer().run()
for i in range(epochs):
error,_ =sess.run([cost,optimizer],feed_dict={inputs: inp,targets:out})
print(i,error)
while True:
a = input("type 1st input :")
b = input("type 2nd input :")
inp=[[a,b]]
inp=np.array(inp)
prediction=sess.run([output],feed_dict={inputs: inp})
print(prediction)
```

So here we go. after training we can test the network by running the output node and supplying data to the inputs placeholder and finally printing it

## Saving The Session for later

One thing to remember is all the variable that were tuned in the session is only remain in that state till the session is active. one we close the session the state of the network is lost. To avoid this we need the save the session. and reload it every time we want to use it again.

to save the session we can use tf.train.Saver() to create a saver object

```
saver = tf.train.Saver()
with tf.Session() as sess:
tf.global_variables_initializer().run()
for i in range(epochs):
error,_ =sess.run([cost,optimizer],feed_dict={inputs: inp,targets:out})
print(i,error)
saver.save(sess, "model.ckpt") # saving the session with file name "model.ckpt"
```

At the end of the for loop after the training we are saving the session with a file name “model.ckpt”. so now we can load it for later use.

## Restoring/Loading the session

Once we have a saved session we can now restore it easily by calling the restore method of Saver class object

```
saver = tf.train.Saver()
with tf.Session() as sess:
saver.restore(sess, "model.ckpt")
while True:
a = input("type 1st input :")
b = input("type 2nd input :")
inp=[[a,b]]
inp=np.array(inp)
prediction=sess.run([output],feed_dict={inputs: inp})
print(prediction)
```

So This is how we can restore a previously trained model or later use. Thanks for reading I hope this post we help you to understand the basics of tensorflow.

In the next post we will discuss how to use Tensorboard to visualize the network we design in a graphical representation and see the data flowing in real-time

Patryk says

Thats weird, but after compiling, program gave me that output:

(0, 0.2500003)

(1, 0.25)

(2, 0.25000018)

(3, 0.25000015)

(4, 0.25000003)

(5, 0.25)

(6, 0.25000006)

(7, 0.25000012)

(8, 0.25000006)

(9, 0.25)

type 1st input :0

type 2nd input :0

[array([[ 0.49992973]], dtype=float32)]

type 1st input :1

type 2nd input :1

[array([[ 0.49995828]], dtype=float32)]

Anirban Kar says

in this tutorial I selected the epochs=10.. which is very less. try something more than 1000.then the error will go down more

Patryk says

Great, thank for reply 🙂

Anirban Kar says

initially till 1k epochs there is no significant changes

this is the image in tensorboard

https://uploads.disquscdn.com/images/4c1faf8917dc93059c6b5973a6f4fc5c24ed02837dc3240b2d5d4fada93ef06a.png

Nitin Bansal says

Hey Anirban! Thank You for great write up.

I had a small doubt in one part. I was looking to implement a part where I change the value of a variable, after certain number of epochs. In my scenario, I am trying to implement a l2 regularizer, whose weight decay constant changes with epochs(supposing it doubles after each passing 10 epochs). How do I actually go about implementing this. An example code would be great!

Thanks again for your effort.

Nitin

Anirban Kar says

#use this to create the regularizer

regularizers = tf.nn.l2_loss(weight1) + tf.nn.l2_loss(weight2) # apply on all the weight matrix

beta = 0.01

cost = tf.reduce_mean(cost + beta * regularizers)

#now you apply the cost to the optimizer it will automatically reduce the weights when training

Yacoub Tah says

Thank you so much, really great post!