All the examples I find use a foreach loop. Before I added the other two classes, a foreach loop worked fine, but now I need to get the indexing to work right. Any suggestions?

C#
static void Main(string[] args)
        {
            DataTable table = new DataTable("IrisTrainingData");
            table.Columns.Add("SepalLength", typeof(double));
            table.Columns.Add("SepalWidth", typeof(double));
            table.Columns.Add("PedalLength", typeof(double));
            table.Columns.Add("PedalWidth", typeof(double));
            table.Columns.Add("Class", typeof(double));

            //training data
            table.Rows.Add(5.1, 3.5, 1.4, 0.2, 1);
            table.Rows.Add(4.9, 3.0, 1.4, 0.2, 1);
            table.Rows.Add(4.7, 3.2, 1.3, 0.2, 1);
            table.Rows.Add(4.6, 3.1, 1.5, 0.2, 1);
            table.Rows.Add(5.0, 3.6, 1.4, 0.2, 1);
            table.Rows.Add(5.4, 3.9, 1.7, 0.4, 1);
            table.Rows.Add(4.6, 3.4, 1.4, 0.3, 1);
            table.Rows.Add(5.0, 3.4, 1.5, 0.2, 1);
            table.Rows.Add(4.4, 2.9, 1.4, 0.2, 1);
            table.Rows.Add(4.9, 3.1, 1.5, 0.1, 1);
            table.Rows.Add(5.4, 3.7, 1.5, 0.2, 1);
            table.Rows.Add(4.8, 3.4, 1.6, 0.2, 1);
            table.Rows.Add(4.8, 3.0, 1.4, 0.1, 1);
            table.Rows.Add(4.3, 3.0, 1.1, 0.1, 1);
            table.Rows.Add(5.8, 4.0, 1.2, 0.2, 1);
            table.Rows.Add(5.7, 4.4, 1.5, 0.4, 1);
            table.Rows.Add(5.4, 3.9, 1.3, 0.4, 1);
            table.Rows.Add(5.1, 3.5, 1.4, 0.3, 1);
            table.Rows.Add(5.7, 3.8, 1.7, 0.3, 1);
            table.Rows.Add(5.1, 3.8, 1.5, 0.3, 1);
            table.Rows.Add(5.4, 3.4, 1.7, 0.2, 1);
            table.Rows.Add(5.1, 3.7, 1.5, 0.4, 1);
            table.Rows.Add(4.6, 3.6, 1.0, 0.2, 1);
            table.Rows.Add(5.1, 3.3, 1.7, 0.5, 1);
            table.Rows.Add(4.8, 3.4, 1.9, 0.2, 1);


            // Variables to total the values of the four columns
            double total1 = 0.0;
            double total2=0.0;
            double total3=0.0;
            double total4=0.0;
              
                      

           //Calculate the sum of training data
           for (int i = 0; i < 25; i++)
           {
               total1 += table.Rows[i]["SepalLength"]; // <- I know this is wrong

               total2 += (double)row["SepalWidth"];
               total3 += (double)row["PedalLength"];
               total4 += (double)row["PedalWidth"];
           }
Updated 1-May-14 9:19am
Comments
What's wrong here?

I played with this dataset quite a bit in Weka for clustering :)

A LINQ solution:

C#
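// Requires "using System.Data;" and "using System.Linq;" (AsEnumerable and Field<T>
// live in the System.Data.DataSetExtensions assembly on .NET Framework).
var tableFiltered = table.AsEnumerable().Where(p => p.Field<double>("Class") == 1);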
double total1 = tableFiltered.Sum(p => p.Field<double>("SepalLength"));
double total2 = tableFiltered.Sum(p => p.Field<double>("SepalWidth"));
double total3 = tableFiltered.Sum(p => p.Field<double>("PedalLength"));
double total4 = tableFiltered.Sum(p => p.Field<double>("PedalWidth"));
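
The filter above handles one class at a time. As a minimal sketch (not part of the original answer), the same kind of total could be computed for every class in one pass with `GroupBy`. The names `ClassTotals` and `SumByClass` are hypothetical; the column names follow the table in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ClassTotals
{
    // Hypothetical helper: for each distinct value in the "Class" column,
    // returns the sum of the given column over the rows of that class.
    public static Dictionary<double, double> SumByClass(DataTable table, string column)
    {
        return table.AsEnumerable()
            .GroupBy(row => row.Field<double>("Class"))
            .ToDictionary(
                group => group.Key,
                group => group.Sum(row => row.Field<double>(column)));
    }
}
```

Calling `ClassTotals.SumByClass(table, "SepalLength")[1]` would then give the class 1 total, and the keys 2 and 3 the other two classes, regardless of how the rows are ordered.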


To calculate the mean, variance, and standard deviation, you can add this extension-methods class to your project:

C#
using System;
using System.Collections.Generic;

public static class MyListExtensions
    {
        public static double Mean(this List<double> values)
        {
            return values.Count == 0 ? 0 : values.Mean(0, values.Count);
        }

        public static double Mean(this List<double> values, int start, int end)
        {
            double s = 0;

            for (int i = start; i < end; i++)
            {
                s += values[i];
            }

            return s / (end - start);
        }

        public static double Variance(this List<double> values)
        {
            return values.Variance(values.Mean(), 0, values.Count);
        }

        public static double Variance(this List<double> values, double mean)
        {
            return values.Variance(mean, 0, values.Count);
        }

        public static double Variance(this List<double> values, double mean, int start, int end)
        {
            double variance = 0;

            for (int i = start; i < end; i++)
            {
                variance += Math.Pow((values[i] - mean), 2);
            }

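            // Population variance: divide by the number of values in [start, end).
            // Divide by (n - 1) instead if you want the sample variance.
            int n = end - start;

            return variance / n;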
        }

        public static double StandardDeviation(this List<double> values)
        {
            return values.Count == 0 ? 0 : values.StandardDeviation(0, values.Count);
        }

        public static double StandardDeviation(this List<double> values, int start, int end)
        {
            double mean = values.Mean(start, end);
            double variance = values.Variance(mean, start, end);

            return Math.Sqrt(variance);
        }
    }


Then you can find the variance of a single column like this:

C#
var tableFiltered = table.AsEnumerable().Where(p => p.Field<double>("Class") == 1);
double varSepalLen = tableFiltered.Select(p => p.Field<double>("SepalLength")).ToList().Variance();
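
For comparison, here is a minimal sketch of the same statistic written directly against `IEnumerable<double>`, with the divisor made explicit. It is an illustration rather than the answer's code; `VarianceSketch`, `PopulationVariance`, and `SampleVariance` are hypothetical names. The population form divides by n, the sample form by n - 1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VarianceSketch
{
    // Population variance: sum of squared deviations from the mean, divided by n.
    public static double PopulationVariance(IEnumerable<double> values)
    {
        List<double> list = values.ToList();
        if (list.Count == 0) return 0; // assumption: treat empty input as 0, like Mean() above
        double mean = list.Average();
        return list.Sum(v => (v - mean) * (v - mean)) / list.Count;
    }

    // Sample variance: the same sum of squares, divided by n - 1.
    public static double SampleVariance(IEnumerable<double> values)
    {
        List<double> list = values.ToList();
        if (list.Count < 2) return 0; // assumption: return 0 when fewer than two values
        double mean = list.Average();
        return list.Sum(v => (v - mean) * (v - mean)) / (list.Count - 1);
    }
}
```

With the filtered rows from above, `VarianceSketch.SampleVariance(tableFiltered.Select(p => p.Field<double>("SepalLength")))` would give the sample variance of the class 1 sepal lengths. Which divisor is appropriate depends on whether the 25 rows are treated as the whole population or as a sample.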
 
Comments
Member 10659035 1-May-14 18:24pm    
Thanks, will try this.
Member 10659035 1-May-14 19:22pm    
It works, and yea, I'm trying to solve a clustering algorithm problem. Can you use this code to calculate variance as well? I got the mean values, but want to calculate the variance as well.
Emre Ataseven 1-May-14 19:33pm    
Updating answer.
Member 10659035 1-May-14 19:58pm    
Thanks, this will help me a lot.
C#
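// One slot per class; class labels are assumed to run from 1 to max(Class).
// The count must be an int, because array sizes and indexes cannot be double.
int nrClasses = Convert.ToInt32(table.Compute("max(Class)", string.Empty));
double[] total1 = new double[nrClasses];
double[] total2 = new double[nrClasses];
double[] total3 = new double[nrClasses];
double[] total4 = new double[nrClasses];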



Replace your loop with:
C#
foreach (DataRow row in table.Rows)
{
    // Class 1 maps to index 0, class 2 to index 1, and so on; a null class falls back to index 0.
    int nrClass = row["Class"] == DBNull.Value ? 0 : Convert.ToInt32(row["Class"]) - 1;
    total1[nrClass] += row["SepalLength"] == DBNull.Value ? 0 : Convert.ToDouble(row["SepalLength"]);
    total2[nrClass] += row["SepalWidth"] == DBNull.Value ? 0 : Convert.ToDouble(row["SepalWidth"]);
    total3[nrClass] += row["PedalLength"] == DBNull.Value ? 0 : Convert.ToDouble(row["PedalLength"]);
    total4[nrClass] += row["PedalWidth"] == DBNull.Value ? 0 : Convert.ToDouble(row["PedalWidth"]);
}
 
Comments
Member 10659035 1-May-14 15:28pm    
Thanks for the response. I think you misunderstood. That won't calculate the total of the first twenty-five rows; it calculates the total of the whole table. I have 3 sets of 25, so 75 rows in the table. I want to total the 3 sets individually for other calculations.
Herman<T>.Instance 1-May-14 15:40pm    
Oh, that is what you mean by class. I thought you meant a school class or something like that.
In that case you have to define each total as an array, as you can see in the altered code.
Member 10659035 1-May-14 15:55pm    
The fifth column is called "Class". So I have 25 rows of class 1, 25 of class 2, and 25 of class 3. If your code works for this then I'll certainly try it. I didn't know if there was a way to just iterate through the first twenty-five for class 1, then do another loop for the next 25, etc.
Member 10659035 1-May-14 16:22pm    
This doesn't work, anyone else understand what I'm trying to do?
Herman<T>.Instance 2-May-14 1:32am    
Why does it not work?
It iterates over all records, but adds each value to the array element that belongs to that row's class. That way the sort order of the data doesn't matter; the totals are grouped by class in the array. Do you know how arrays work?
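
For the index-based loop the question asks about, a minimal sketch follows, assuming the rows are stored in contiguous blocks of 25 per class as described in the comments. The indexer `table.Rows[i][column]` returns `object`, so the value needs a cast before `+=` will compile. `RangeTotals` and `SumRange` are hypothetical names.

```csharp
using System;
using System.Data;

public static class RangeTotals
{
    // Sums one column over the rows with index in [start, start + count).
    public static double SumRange(DataTable table, string column, int start, int count)
    {
        double total = 0.0;
        // Clamp to the table size so a table shorter than expected does not throw.
        int end = Math.Min(start + count, table.Rows.Count);
        for (int i = start; i < end; i++)
        {
            // The DataRow indexer returns object; the column holds doubles,
            // so unbox with a cast (this throws if the cell is DBNull).
            total += (double)table.Rows[i][column];
        }
        return total;
    }
}
```

With 75 rows in class order, `RangeTotals.SumRange(table, "SepalLength", 0, 25)` would cover class 1, `SumRange(table, "SepalLength", 25, 25)` class 2, and `SumRange(table, "SepalLength", 50, 25)` class 3. Unlike the class-indexed arrays, this approach depends on the rows staying in that order.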

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
