Sentiment Analysis using ANNdotNET

17 Oct 2018 | CPOL | 4 min read

The full working example used in the article can be downloaded from here.

 

The October 2018 issue of MSDN Magazine brings the article “Sentiment Analysis Using CNTK”, written by James McCaffrey. I wondered whether I could implement the same solution in ANNdotNET, as Dr. McCaffrey described it in the magazine. Indeed, I implemented the complete solution in less than 5 minutes.

In this blog post, I am going to walk you through the example from this very good and well-written MSDN article. I am not going to repeat the text of the MSDN article, so I recommend reading it first and then coming back here to implement the example in ANNdotNET. Since ANNdotNET is a GUI tool, it is interesting to see all the visualizations during model training and evaluation. ANNdotNET also provides a complete binary model evaluation, including the confusion matrix, ROC curve, and other binary performance parameters, which makes this example more interesting and valuable to read.

The whole example is implemented in five steps.

Step 1: Prepare Files and Folder Structure

First, we need to create several folders and files in order to create an empty annproject. This manual creation of folders is necessary because ANNdotNET v1.0 has no option to create an empty project. This will be added in the next version.

So first, create the following set of hierarchically ordered folders:

  • SentimentAnalysis
    • MovieReview
      • data

The following figure shows this set of folders:

[Figure: the folder structure listed above]
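The same folders can also be created with a short script instead of by hand. The following is a minimal Python sketch, not part of ANNdotNET; the function name create_project_folders is made up for illustration, and the middle folder is spelled MovieReview to match the project name used in the annproject file.

```python
from pathlib import Path


def create_project_folders(root: Path) -> Path:
    """Create SentimentAnalysis/MovieReview/data under root and return the data folder."""
    data_dir = root / "SentimentAnalysis" / "MovieReview" / "data"
    # parents=True creates the intermediate folders; exist_ok=True makes reruns harmless.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir
```

Calling create_project_folders(Path.cwd()) builds the structure in the current working directory.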

Step 2: Download Data Sets Used in the Example

The only thing we need from the MSDN article is the train and test data sets. The data can be downloaded from the MSDN sample: Code_McCaffreyTestRun1018.zip. Once the zip file is downloaded, unzip the sample and copy the files imdb_sparse_train_50w.txt and imdb_sparse_test_50w.txt to the data folder, as the image above shows.
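The unzip-and-copy step can be scripted as well. This is a hedged Python sketch: the internal folder layout of the zip file is not described here, so the code simply searches the whole archive for the two file names. The function name extract_data_files is hypothetical.

```python
import zipfile
from pathlib import Path, PurePosixPath

DATA_FILES = ("imdb_sparse_train_50w.txt", "imdb_sparse_test_50w.txt")


def extract_data_files(zip_path: Path, data_dir: Path) -> list[Path]:
    """Copy the train and test files from the zip into data_dir, ignoring zip subfolders."""
    data_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Zip member names always use forward slashes.
            name = PurePosixPath(member).name
            if name in DATA_FILES:
                target = data_dir / name
                target.write_bytes(archive.read(member))
                copied.append(target)
    missing = set(DATA_FILES) - {p.name for p in copied}
    if missing:
        raise FileNotFoundError(f"not found in archive: {sorted(missing)}")
    return copied
```

The function raises an error when either file is absent, so a wrong or incomplete download is noticed before training.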

Step 3: Create MovieReview.ann and LSTM-Net.mlconfig Files

  • Open Notepad and create a file with the following content:
project:|Name:MovieReview |Type:NoRawData |MLConfigs:LSTM-Net
data:|RawData:MovieReview_rawdata.txt
parser:|RowSeparator:rn |ColumnSeparator: ; |Header:0 |SkipLines:0

Save the file in the SentimentAnalysis folder as MovieReview.ann. The following picture shows the saved annproject file on disk.

[Figure: MovieReview.ann saved in the SentimentAnalysis folder]

Now open Notepad again and create a new empty file. This file becomes the mlconfig file, with the content shown below. Don’t worry about the content of the file, since all those details will be visible once we open it with ANNdotNET. If you want to know more about the structure of the mlconfig file, please refer to this wiki page of the ANNdotNET project.

configid:msdn-oct-2018-issue-sentiment-analysis-article
metadata:|Column02:y;Category;Label;Random;0;1
features:|x 129892 1
labels:|y 2 0
network:|Layer:Embedding 50 0 0 None 0 0 |Layer:LSTM 25 25 0 TanH 1 1 |Layer:Dense 2 0 0 Softmax 0 0
learning:|Type:AdamLearner |LRate:0.01 |Momentum:0.85 |Loss:CrossEntropyWithSoftmax |Eval:ClassificationAccuracy |L1:0 |L2:0
training:|Type:Default |BatchSize:250 |Epochs:400 |Normalization:0 |RandomizeBatch:0 |SaveWhileTraining:0 |FullTrainingSetEval:1 |ProgressFrequency:1 |ContinueTraining:0 |TrainedModel:
paths:|Training:data\imdb_sparse_train_50w.txt |Validation:data\imdb_sparse_test_50w.txt |Test:data\imdb_sparse_test_50w.txt |TempModels:temp_models |Models:models|Result:LSTM-Net_result.csv |Logs:log

Save the file in the MovieReview folder as LSTM-Net.mlconfig. The next image shows where the mlconfig file is stored.

[Figure: LSTM-Net.mlconfig saved in the MovieReview folder]
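To get a feel for how the mlconfig lines are organized, the following Python sketch splits them into sections and key/value entries. It is only a reading aid inferred from the sample above (a section name, then entries separated by | with Key:Value inside). It is not ANNdotNET's own parser, and the function name parse_mlconfig is made up for this illustration.

```python
def parse_mlconfig(text: str) -> dict[str, list[tuple[str, str]]]:
    """Split 'section:|Key:Value |Key:Value' lines into {section: [(key, value), ...]}."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first colon separates the section name from its entries.
        section, _, rest = line.partition(":")
        entries = []
        for item in rest.split("|"):
            item = item.strip()
            if not item:
                continue
            # Entries without a colon (e.g. "x 129892 1") get an empty key.
            key, sep, value = item.partition(":")
            entries.append((key, value) if sep else ("", item))
        config[section] = entries
    return config


# A few lines copied from the mlconfig file above.
SAMPLE = r"""
features:|x 129892 1
labels:|y 2 0
network:|Layer:Embedding 50 0 0 None 0 0 |Layer:LSTM 25 25 0 TanH 1 1 |Layer:Dense 2 0 0 Softmax 0 0
learning:|Type:AdamLearner |LRate:0.01 |Momentum:0.85 |Loss:CrossEntropyWithSoftmax |Eval:ClassificationAccuracy |L1:0 |L2:0
"""

config = parse_mlconfig(SAMPLE)
learning = dict(config["learning"])
print(learning["Type"], learning["LRate"], learning["Momentum"])  # AdamLearner 0.01 0.85
print([value.split()[0] for _, value in config["network"]])  # ['Embedding', 'LSTM', 'Dense']
```

Entries are kept as a list of pairs rather than a dictionary because the network section repeats the Layer key once per layer.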

Step 4: Open annproject File with ANNdotNET GUI Tool

Now we have set up everything needed to open and train the sentiment analysis example with ANNdotNET. Since ANNdotNET implements MLEngine, which is based on CNTK, the data sets are compatible and can be read by the trainer. To get better results, we changed the learning parameters slightly: instead of SGD, we use AdamLearner.
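Before opening the project, it can help to confirm that the files from steps 1 to 3 are where the project expects them. The Python sketch below checks the layout described in this post, relative to the SentimentAnalysis folder; the helper name missing_project_files is hypothetical and the check is not something ANNdotNET requires.

```python
from pathlib import Path

# Paths relative to the SentimentAnalysis folder, as described in steps 1 to 3.
EXPECTED = (
    "MovieReview.ann",
    "MovieReview/LSTM-Net.mlconfig",
    "MovieReview/data/imdb_sparse_train_50w.txt",
    "MovieReview/data/imdb_sparse_test_50w.txt",
)


def missing_project_files(project_root: Path) -> list[str]:
    """Return the expected files that are absent under project_root."""
    return [rel for rel in EXPECTED if not (project_root / rel).is_file()]
```

An empty list means all four files are in place; otherwise the returned names show which step to revisit.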

If you don’t have the ANNdotNET tool installed on your machine, go to the release section and download the latest version, or clone the GitHub repository and run it from Visual Studio. All information about how to run ANNdotNET as a standalone application or as a Visual Studio solution can be found on the GitHub page https://github.com/bhrnjica/anndotnet.

After unzipping the ANNdotNET binaries on your machine, run the tool by selecting the anndotnet.wnd.exe file. Once ANNdotNET is running, click the Open application command and select the MovieReview.ann file. In a second, the application loads the project with the corresponding mlconfig file. In the project explorer, click the LSTM-Net tree item, and content similar to the image below should appear.

[Figure: ANNdotNET showing the Network settings tab page for LSTM-Net]

Everything we have written into the mlconfig file is now shown in the Network settings tab page.

  1. Input layer with 129892 dimensions
  2. Output layer with 2 dimensions (binary problem)
  3. Learning parameters:
    1. AdamLearner, with a learning rate of 0.01 and momentum of 0.85
    2. Loss function is CrossEntropyWithSoftmax
    3. Evaluation function is ClassificationAccuracy
  4. Network Designer shows a typical LSTM recurrent network
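For readers unfamiliar with the loss and evaluation functions named in the list above, here is a small Python sketch of their textbook definitions for a two-class output. It only illustrates the math; ANNdotNET and CNTK compute these internally and their implementations may differ in detail. All function names and the toy numbers are made up for the example.

```python
import math


def cross_entropy_with_softmax(logits, one_hot):
    """Negative log of the softmax probability assigned to the true class."""
    # log(sum(exp(z))) computed stably by factoring out the largest logit.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    # For a one-hot target this reduces to log_sum - logit_of_true_class.
    return sum(t * (log_sum - z) for t, z in zip(one_hot, logits))


def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)


def classification_accuracy(batch_logits, batch_one_hot):
    """Fraction of samples whose highest score lands on the true class."""
    hits = sum(argmax(z) == argmax(t) for z, t in zip(batch_logits, batch_one_hot))
    return hits / len(batch_logits)


# Equal scores for both classes mean a 50/50 guess, so the loss is ln(2).
print(round(cross_entropy_with_softmax([0.0, 0.0], [1, 0]), 4))  # 0.6931
```

A confident correct prediction drives the loss toward 0, while a confident wrong one makes it grow without bound, which is why the loss keeps guiding training even after the accuracy stops changing.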

Step 5: Training and Evaluation of the Example

Now that we have reviewed the network settings, we can switch to the train tab page and review the training parameters. Since we already set up the training parameters in the mlconfig file, we don’t need to change anything.

Start the training process by clicking the Run application command. After some time, we should see the following result:

[Figure: training result in ANNdotNET]

If we switch to the Evaluation page, we can perform some statistical analysis to evaluate whether the model is good or not. Once the evaluation tab page is shown, click the Refresh button to evaluate the model against the training and validation data sets.

[Figure: evaluation page with statistics for the training and validation data sets]

The statistics on the left are for the training data set, and those on the right are for the validation data set. As can be seen, the model perfectly predicted all data from the training data set but reached only about 70% accuracy on the validation data set, a gap that suggests overfitting. Of course, the model is not as good as we would expect for production, but it is good enough for this demonstration. There are also two buttons that show the ROC curve and other binary performance parameters for both data sets, which the reader may try.
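The numbers on the evaluation page come from a confusion matrix. As a rough illustration of how such figures are derived, the Python sketch below counts the four cells for a binary problem and computes accuracy, precision, and recall. The labels are invented toy data, not the results of this model, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for labels 0 and 1."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["tn" if a == 0 else "fn"] += 1
    return counts


def binary_metrics(c):
    """Derive accuracy, precision, and recall from the four counts."""
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


# Toy labels only: 1 = positive review, 0 = negative review.
actual = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]

counts = confusion_counts(actual, predicted)
print(counts)  # {'tp': 3, 'fp': 1, 'tn': 4, 'fn': 2}
print(binary_metrics(counts))  # {'accuracy': 0.7, 'precision': 0.75, 'recall': 0.6}
```

Accuracy alone hides how the errors are split between the two classes, which is why the confusion matrix and the ROC curve are worth a look.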

That’s all that is needed to have a complete sentiment analysis example set up and running. If you want the complete ANNdotNET project, it can be downloaded from here.

History

  • 17th October, 2018: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
Bosnia and Herzegovina
Bahrudin Hrnjica holds a Ph.D. degree in Technical Science/Engineering from University in Bihać.
Besides teaching at the university, he has been in the software industry for more than two decades, focusing on development technologies such as .NET, Visual Studio, and Desktop/Web/Cloud solutions.

He works on the development and application of different ML algorithms and has more than 10 years of experience in developing ML-oriented solutions and models. His fields of interest include the development of predictive models with ML.NET and Keras, and he actively develops two ML-based .NET open source projects: GPdotNET, a genetic programming tool, and ANNdotNET, a deep learning tool on the .NET platform. He works in multidisciplinary teams with the mission of optimizing and selecting ML algorithms to build ML models.

He is the author of several books and many online articles, writes a blog at http://bhrnjica.net, regularly holds lectures at local and regional conferences, user groups, and Code Camp gatherings, and is also the founder of the Bihac Developer Meetup Group. Microsoft recognized his work and awarded him the prestigious Microsoft MVP title for the first time in 2011, which he still holds today.

Comments and Discussions

 
Praise: Yea! Finally got some good training results! (asiwel, 20-Oct-18 9:22)
Question: Second problem: Alert box says no AdamLearner (asiwel, 19-Oct-18 9:34)
Answer: Re: Second problem: Alert box says no AdamLearner (Bahrudin Hrnjica, 19-Oct-18 21:46)
General: Re: Second problem: Alert box says no AdamLearner (asiwel, 20-Oct-18 9:04)
General: Re: Second problem: Alert box says no AdamLearner (Bahrudin Hrnjica, 20-Oct-18 9:13)
Thank you for the great feedback.
Yeah, there are a lot of details that could be better implemented or improved, but I was focused on the main features.

Currently ANNdotNET runs on MLEngine, which is based on CNTK 2.5.1. I tried to update to CNTK 2.6, which brings .NET Core support and more new features, but unfortunately there is a major issue, and I cannot tell whether it is caused by my code or by CNTK. So I have decided to stay on CNTK 2.5.1 for now. I hope the issue will be solved by the next ANNdotNET version.
General: Re: Second problem: Alert box says no AdamLearner (Bahrudin Hrnjica, 22-Oct-18 20:39)
Question: A problem with the demo mlconfig file? (asiwel, 18-Oct-18 21:20)
Answer: Re: A problem with the demo mlconfig file? (Bahrudin Hrnjica, 19-Oct-18 4:58)
General: Re: A problem with the demo mlconfig file? (asiwel, 19-Oct-18 9:00)
