Posted 13 Nov 2017

Train Iris Data with MinibatchSource using CNTK and C#

CPOL · 4 min read
With this blog post, we are going to implement a full C# program to train on the Iris data.

So far (post 1, post 2, post 3) we have seen what CNTK is, how to use it with Python, and how to create a simple C# .NET application and call basic CNTK methods. In this blog post, we are going to implement a full C# program to train on the Iris data.

The first step in using CNTK is getting the data and feeding it to the trainer. In the previous post, we prepared the Iris data in CNTK format, which is suitable for use with a MinibatchSource. In order to use the MinibatchSource, we need to create two streams:

  • one for the features, and
  • one for the label

The features and label variables must be created from those streams as well, so that when the data is accessed through the variables, the trainer knows that it is coming from the file.

Data Preparation

As mentioned above, we are going to use CNTK MinibatchSource to load the Iris data. The two files are prepared for this demo:

C#
var dataFolder = "Data";
var dataPath = Path.Combine(dataFolder, "trainIris_cntk.txt");
var trainPath = Path.Combine(dataFolder, "testIris_cntk.txt");

var featureStreamName = "features";
var labelsStreamName = "label";

One file path contains the Iris data for training, and the second contains the data for testing, which will be used in a future post. These two paths will be passed as arguments when creating the MinibatchSource for training and validation, respectively.

The first step in getting the data from the file is defining the stream configuration with the proper information, which is used when the data is extracted from the file. The configuration is completed by providing the number of features and the number of components in the one-hot label vector, as well as the names of the feature and label streams. The data file is attached at the end of the blog post, so the reader can see how the data is prepared for the MinibatchSource.
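For reference, a CNTK text-format file pairs each sample's features with its one-hot label on a single line, using the stream names defined above. The lines below are illustrative values, not rows copied from the attached file:

```
|features 5.1 3.5 1.4 0.2 |label 1 0 0
|features 6.4 3.2 4.5 1.5 |label 0 1 0
|features 5.9 3.0 5.1 1.8 |label 0 0 1
```

Each |features field carries inputDim (4) values, and each |label field carries numOutputClasses (3) one-hot components, which is exactly what the stream configuration below declares.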

The following code defines the stream configuration for the Iris data set.

C#
//stream configuration to distinct features and labels in the file
var streamConfig = new StreamConfiguration[]
  {
    new StreamConfiguration(featureStreamName, inputDim),
    new StreamConfiguration(labelsStreamName, numOutputClasses)
  };

Also, the features and label variables must be created by providing the stream names defined above.

C#
//define input and output variable and connecting to the stream configuration
var feature = Variable.InputVariable(new NDShape(1, inputDim), DataType.Float, featureStreamName);
var label = Variable.InputVariable(new NDShape(1, numOutputClasses), DataType.Float, labelsStreamName);

Now the input and output variables are connected to the data from the file, and the MinibatchSource can handle them.
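With the variables bound to the stream names, the MinibatchSource itself can be created from the training file. This is only a sketch of the step that the full program at the end of this post performs, using the dataPath and streamConfig defined above:

```
C#
//create the minibatch source over the training file; InfinitelyRepeat makes it
//sweep over the data repeatedly until training stops
var minibatchSource = MinibatchSource.TextFormatMinibatchSource(
    dataPath, streamConfig, MinibatchSource.InfinitelyRepeat, true);

//stream infos are used later to pick the features and label data out of each minibatch
var featureStreamInfo = minibatchSource.StreamInfo(featureStreamName);
var labelStreamInfo = minibatchSource.StreamInfo(labelsStreamName);
```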

Creating Feed Forward Neural Network Model

Once we have defined the streams and variables, we can define the network model. CNTK is implemented so that you can define any number of hidden layers with any activation function. For this demo, we are going to create a simple feed forward neural network with one hidden layer. The picture below shows the NN model.

Image 1

In order to implement the above NN model, we need to implement three methods:

  • C#
    static Function ApplyActivationFunction(Function layer, Activation actFun)
  • C#
    static Function simpleLayer(Function input, int outputDim, DeviceDescriptor device)
  • C#
    static Function CreateFFNN(Variable input, int hiddenLayerCount, 
    	int hiddenDim, int outputDim, Activation activation, string modelName, 
        DeviceDescriptor device)

The first method simply applies the specified activation function to the passed layer. The method is very simple and should look like this:

C#
static Function ApplyActivationFunction(Function layer, Activation actFun)
{
    switch (actFun)
    {
        default:
        case Activation.None:
            return layer;
        case Activation.ReLU:
            return CNTKLib.ReLU(layer);
        case Activation.Sigmoid:
            return CNTKLib.Sigmoid(layer);
        case Activation.Tanh:
            return CNTKLib.Tanh(layer);
    }
}
public enum Activation
{
    None,
    ReLU,
    Sigmoid,
    Tanh
}

The method takes a layer as an argument and returns the layer with the activation function applied.

The next method creates a simple layer with n weights and one bias. The method is shown in the following listing:

C#
static Function simpleLayer(Function input, int outputDim, DeviceDescriptor device)
{
    //prepare default parameters values
    var glorotInit = CNTKLib.GlorotUniformInitializer(
            CNTKLib.DefaultParamInitScale,
            CNTKLib.SentinelValueForInferParamInitRank,
            CNTKLib.SentinelValueForInferParamInitRank, 1);

    //create weight and bias vectors
    var inputVar = (Variable)input;
    var shape = new int[] { outputDim, inputVar.Shape[0] };
    var weightParam = new Parameter(shape, DataType.Float, glorotInit, device, "w");
    var biasParam = new Parameter(new NDShape(1, outputDim), 0, device, "b");

    //construct W * X + b matrix
    return CNTKLib.Times(weightParam, input) + biasParam;
}

After the parameters are initialized, the Function object is created from the number of output components and the previous layer (or the input variable). This chaining is how layers are composed in CNTK: each new layer takes the previous one as its input. With this strategy, the user can create a very complex NN model.
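As a minimal sketch of that chaining (assuming the simpleLayer method above and a 4-6-3 network such as the Iris model), stacking layers is just passing the previous Function as the next input:

```
C#
//the output Function of one layer becomes the input of the next
Function hidden = simpleLayer(feature, 6, device);  //4 inputs -> 6 hidden nodes
Function output = simpleLayer(hidden, 3, device);   //6 hidden nodes -> 3 outputs
```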

The last method performs the layer creation. It is called from the main method and can create an arbitrary feed forward neural network from the provided parameters.

C#
static Function CreateFFNN(Variable input, int hiddenLayerCount, 
int hiddenDim, int outputDim, Activation activation, string modelName, DeviceDescriptor device)
{
    //hidden layers creation
    //first hidden layer
    Function h = simpleLayer(input, hiddenDim, device);
    h = ApplyActivationFunction(h, activation);
    //2,3, ... hidden layers
    for (int i = 1; i < hiddenLayerCount; i++)
    {
        h = simpleLayer(h, hiddenDim, device);
        h = ApplyActivationFunction(h, activation);
    }
    //the last step is creation of the output layer
    var r = simpleLayer(h, outputDim, device);
    r.SetName(modelName);
    return r;
}

Now that we have implemented the methods for NN model creation, the next step is the training implementation.

The training process is iterative: the MinibatchSource feeds the trainer in each iteration.
The loss and the evaluation functions are calculated for each iteration and shown as the training progresses. The progress reporting is defined in a separate method, which looks like the following code listing:

C#
private static void PrintTrainingProgress(
    Trainer trainer, int minibatchIdx, int outputFrequencyInMinibatches)
{
    if ((minibatchIdx % outputFrequencyInMinibatches) == 0 && 
                      trainer.PreviousMinibatchSampleCount() != 0)
    {
        float trainLossValue = (float)trainer.PreviousMinibatchLossAverage();
        float evaluationValue = (float)trainer.PreviousMinibatchEvaluationAverage();
        Console.WriteLine($"Minibatch: {minibatchIdx} CrossEntropyLoss = {trainLossValue}, " +
                          $"EvaluationCriterion = {evaluationValue}");
    }
}

During training, the value of the loss function constantly decreases, indicating that the model is becoming better and better. Once the iteration process is completed, the model's accuracy on the training data is shown.

Full Program Implementation

The following listing shows the complete source code implementation of the Iris data set training with CNTK. At the beginning, several variables are defined to describe the structure of the NN model: the number of input and output variables, the number of hidden layers, and their dimensions. The main method also implements the iteration process in which the MinibatchSource feeds the trainer with the relevant data. More about that will follow in a separate blog post. Once the iteration process is completed, the model result is shown and the program terminates.

C#
public static void TrainIris(DeviceDescriptor device)
{
    var dataFolder = "";//files must be on the same folder as program
    var dataPath = Path.Combine(dataFolder, "iris_with_hot_vector.csv");
    var trainPath = Path.Combine(dataFolder, "iris_with_hot_vector_test.csv");

    var featureStreamName = "features";
    var labelsStreamName = "labels";

    //Network definition
    int inputDim = 4;
    int numOutputClasses = 3;
    int numHiddenLayers = 1;
    int hiddenLayerDim = 6;
    uint sampleSize = 130;

    //stream configuration to distinct features and labels in the file
    var streamConfig = new StreamConfiguration[]
        {
            new StreamConfiguration(featureStreamName, inputDim),
            new StreamConfiguration(labelsStreamName, numOutputClasses)
        };

    // build a NN model
    //define input and output variables and connect them to the stream configuration
    var feature = Variable.InputVariable(new NDShape(1, inputDim), DataType.Float, featureStreamName);
    var label = Variable.InputVariable(
                new NDShape(1, numOutputClasses), DataType.Float, labelsStreamName);

    //build a simple feed forward neural network model
    var ffnn_model = CreateFFNN(feature, numHiddenLayers, 
                     hiddenLayerDim, numOutputClasses, Activation.Tanh, "IrisNNModel", device);

    //Loss and error functions definition
    var trainingLoss = CNTKLib.CrossEntropyWithSoftmax(new Variable(ffnn_model), label, "lossFunction");
    var classError = CNTKLib.ClassificationError(new Variable(ffnn_model), label, "classificationError");

    // prepare the training data
    var minibatchSource = MinibatchSource.TextFormatMinibatchSource(
        dataPath, streamConfig, MinibatchSource.InfinitelyRepeat, true);
    var featureStreamInfo = minibatchSource.StreamInfo(featureStreamName);
    var labelStreamInfo = minibatchSource.StreamInfo(labelsStreamName);

    // set learning rate for the network
    var learningRatePerSample = new TrainingParameterScheduleDouble(0.001125, 1);

    //define learners for the NN model
    var ll = Learner.SGDLearner(ffnn_model.Parameters(), learningRatePerSample);

    //define trainer based on ffnn_model, loss and error functions , and SGD learner 
    var trainer = Trainer.CreateTrainer(ffnn_model, trainingLoss, classError, new Learner[] { ll });

    //Preparation for the iterative learning process
    //use 800 epochs/iterations; the batch size will be the same as the sample size since the data set is small
    int epochs = 800;
    int i = 0;
    while (epochs > -1)
    {
        var minibatchData = minibatchSource.GetNextMinibatch(sampleSize, device);
        //pass the current batch to the trainer, separated into features and label
        var arguments = new Dictionary<Variable, MinibatchData>
        {
            { feature, minibatchData[featureStreamInfo] },
            { label, minibatchData[labelStreamInfo] }
        };

        trainer.TrainMinibatch(arguments, device);

        Helper.PrintTrainingProgress(trainer, i++, 50);

        // MinibatchSource is created with MinibatchSource.InfinitelyRepeat, so
        // batching will not end on its own. Each time the minibatchSource completes
        // a sweep (epoch), the last minibatch is marked as the end of the sweep.
        // We use this flag to count the number of epochs.
        if (Helper.MiniBatchDataIsSweepEnd(minibatchData.Values))
        {
            epochs--;
        }
    }
    //Summary of training
    double acc = Math.Round((1.0 - trainer.PreviousMinibatchEvaluationAverage()) * 100, 2);
    Console.WriteLine("------TRAINING SUMMARY--------");
    Console.WriteLine($"The model trained with accuracy of {acc}%.");

    //// validate the model
    // this will be posted as separate blog post
}

The full source code with formatted Iris data set for training can be found here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Bahrudin Hrnjica
Software Developer (Senior)
Bosnia and Herzegovina
Bahrudin Hrnjica holds a Ph.D. degree in Technical Science/Engineering from University in Bihać.
Besides teaching at University, he has been in the software industry for more than two decades, focusing on development technologies such as .NET, Visual Studio, and Desktop/Web/Cloud solutions.

He works on the development and application of different ML algorithms. He has more than 10 years of experience in the development of ML-oriented solutions and modeling. His fields of interest include the development of predictive models with ML.NET and Keras; he also actively develops two ML-based .NET open source projects: GPdotNET, a genetic programming tool, and ANNdotNET, a deep learning tool on the .NET platform. He works in multidisciplinary teams with the mission of optimizing and selecting ML algorithms to build ML models.

He is the author of several books, and many online articles, writes a blog at http://bhrnjica.net, regularly holds lectures at local and regional conferences, User groups and Code Camp gatherings, and is also the founder of the Bihac Developer Meetup Group. Microsoft recognizes his work and awarded him with the prestigious Microsoft MVP title for the first time in 2011, which he still holds today.
