TensorFlow Tutorial 4: Creating a Neural Net

CodeProject

2.00/5 (1 vote)

Oct 16, 2018

CPOL

3 min read

7133

By Dante Sblendorio

What is a neural net, and how can you create one? Keep reading to find out. Below, I explain the basics of setting up a neural net using TensorFlow.

A neural net consists of three key components: the input layer, the hidden layer(s), and the output layer. Each layer has an arbitrary number of nodes (or neurons). In the example in the previous section, the input layer is const1 and const2. The matrix addition can be thought of as the hidden layer, and the output layer is the output. In the case of our wine data, the input data is the 13 chemical features and the output layer is the Class. The hidden layer can be thought of as a sophisticated ensemble of mathematical functions that behave as filters, extracting the relevant features for determining the correct Class. The structure of the neural net is inspired by biology, specifically the neural connections in the human brain. For the sake of this article, I won’t go into depth about the mathematical structure of the hidden layer(s). It is sufficient to think of it as a mathematical “black box” that extracts hidden meaning from the data. (However, if you want to learn more, this is a thorough online textbook on neural networks and deep learning.)

We split the dataset into a training and test set. This allows us to “train” the mathematical operators within the hidden layer to converge on ideal values that correctly predict the correct Class based on the 13 features for each observation. We then inject the test set into the neural net and evaluate the accuracy to determine how well the net has been trained. Splitting the data in this way provides a way to avoid overfitting or underfitting the data, thereby giving a true estimation of the accuracy of the net. In order to prepare the data for TensorFlow, we perform some slight manipulation:

from sklearn.model_selection import train_test_split

#this prepares our Class variable for the NN
def convertClass(val):
   if val == 1:
       return [1, 0, 0]
   elif val == 2:
       return [0, 1, 0]
   else:
       return [0, 0, 1]
   
Y = wine_df.loc[:,'Class'].values
Y = np.array([convertClass(i) for i in Y])
X = wine_df.loc[:,'Alcohol':'Proline'].values

#we split the dataset into a test and training set
train_x, test_x, train_y, test_y = train_test_split(X,Y , test_size=0.3, random_state=0)
train_x = train_x.transpose()
train_y = train_y.transpose()
test_x = test_x.transpose()
test_y = test_y.transpose()

Now that the data is prepped, we define several functions that form the foundation of the neural net. First, we define a function that establishes the initial parameters of our net. Here we also define how many nodes are in each of the hidden layers (I chose to have two hidden layers). Since we have three possible values for Class, we have three nodes in the output layer. Next we define a forward propagation function. All this does is send the 13 features through the net, and subject them to the mathematical operations within the hidden layer.

We also need to define a cost function. This is a critical function that allows us to “train” the network. It is a single value that describes how well the net does at predicting the correct Class. If the cost value is high, the mathematical operates adjust, and the data is fed through the net again. The data is fed through the net until the cost value converges.

def init_parameters(num_input_features):

   num_hidden_layer =  32
   num_hidden_layer_2 = 16
   num_output_layer_1 = 3
   
   tf.set_random_seed(1)
   W1 = tf.get_variable("W1", [num_hidden_layer, num_input_features], initializer = tf.contrib.layers.xavier_initializer(seed=1))
   b1 = tf.get_variable("b1", [num_hidden_layer, 1], initializer = tf.zeros_initializer())
   W2 = tf.get_variable("W2", [num_hidden_layer_2, num_hidden_layer], initializer = tf.contrib.layers.xavier_initializer(seed=1))
   b2 = tf.get_variable("b2", [num_hidden_layer_2, 1], initializer = tf.zeros_initializer())
   W3 = tf.get_variable("W3", [num_output_layer_1, num_hidden_layer_2], initializer = tf.contrib.layers.xavier_initializer(seed=1))
   b3 = tf.get_variable("b3", [num_output_layer_1, 1], initializer = tf.zeros_initializer())
   
   parameters = {"W1": W1,
                 "b1": b1,
                 "W2": W2,
                 "b2": b2,
                 "W3": W3,
                 "b3": b3}
   
   return parameters
			
def for_prop(X, parameters):
			
   W1 = parameters['W1']
   b1 = parameters['b1']
   W2 = parameters['W2']
   b2 = parameters['b2']
   W3 = parameters['W3']
   b3 = parameters['b3']
   
   # propagates values through NN using Rectified Linear Unit as activation function          
   Z1 = tf.add(tf.matmul(W1, X), b1)                     
   A1 = tf.nn.relu(Z1)                                    
   Z2 = tf.add(tf.matmul(W2, A1), b2)                     
   A2 = tf.nn.relu(Z2)                                   
   Z3 = tf.add(tf.matmul(W3, A2), b3)                    
   return Z3

def c(Z3, Y):
   logits = tf.transpose(Z3)
   labels = tf.transpose(Y)
   cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels))
   return cost

We also need to define a function that produces a random subset of observations within the training set. I mentioned previously that data is fed into the neural net until the cost value converges; each iteration a random subsample is picked out of the total training set and injected into the net. In this function we create batches of subsamples:

def rand_batches(X, Y, batch_size = 128, seed = 0):
   m = X.shape[1]
   batches = []
   np.random.seed(seed)
   
   # shuffle
   permutation = list(np.random.permutation(m))
   shuffled_X = X[:, permutation]
   shuffled_Y = Y[:, permutation].reshape((Y.shape[0],m))

   # partition the shuffled data
   num_batches = math.floor(m/batch_size)
   for k in range(0, num_batches):
       batch_X = shuffled_X[:, k * batch_size : k * batch_size + batch_size]
       batch_Y = shuffled_Y[:, k * batch_size : k * batch_size + batch_size]
       batch = (batch_X, batch_Y)
       batches.append(batch)
   
   # handle end case
   if m % batch_size != 0:
       batch_X = shuffled_X[:, num_batches * batch_size : m]
       batch_Y = shuffled_Y[:, num_batches * batch_size : m]
       batch = (batch_X, batch_Y)
       batches.append(batch)
   
   return batches

To generate your entry code for challenge 4, paste the following code in your Jupyter notebook:

member_number = 12345678

one = [member_number, int(member_number/5), int(member_number/100)]

two = [0.02, 0.05, 0.08]

a = tf.placeholder(tf.float32, shape=(3))

b = tf.placeholder(tf.float32, shape=(3))

result = tf.tensordot(a, b, 1)

with tf.Session() as sess:

   print(int(result.eval(feed_dict={a: one, b: two})))

And replace 12345678 with your CodeProject member number before you run the code. The number that is printed will be your entry code for this challenge. Please click here to enter the contest entry code.