
Statistics

Great Reads

by Andy Allinger
Add features to k-means for missing data, mixed data, and choosing the number of clusters
by Darko Jurić
SIR Particle Filter brief tutorial with samples in C#
by Ata Amini
Implement Gauss-Newton algorithm in Java to solve non-linear least squares problems; i.e. to find minimum of a function.
by Jack Devey
This post delves into the perplexing Monty Hall paradox, examining the probabilities associated with sticking or switching doors in the game scenario.

Latest Articles

by Jack Devey
This post delves into the perplexing Monty Hall paradox, examining the probabilities associated with sticking or switching doors in the game scenario.
by Andy Allinger
Add features to k-means for missing data, mixed data, and choosing the number of clusters
by DrABELL
Statistical Outliers detection in Microsoft Excel worksheet using Median() and array formula
by Andy Allinger
Introduces data clustering and the k-means++ algorithm

All Articles

7 Jun 2023 by Andy Allinger
Add features to k-means for missing data, mixed data, and choosing the number of clusters
27 Apr 2015 by Darko Jurić
SIR Particle Filter brief tutorial with samples in C#
13 Mar 2017 by Ata Amini
Implement Gauss-Newton algorithm in Java to solve non-linear least squares problems; i.e. to find minimum of a function.
13 Jun 2023 by Jack Devey
This post delves into the perplexing Monty Hall paradox, examining the probabilities associated with sticking or switching doors in the game scenario.
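The stick-versus-switch probabilities can be checked empirically. The following is a minimal simulation sketch, not taken from the article; the function name `monty_hall_win_rate` and its parameters are illustrative assumptions, and it assumes the standard rules (three doors, the host always opens a goat door other than the player's pick).

```python
import random


def monty_hall_win_rate(switch, trials=100_000, seed=0):
    """Estimate the win rate for always switching or always sticking."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)   # door hiding the car
        pick = rng.randrange(3)  # player's first choice
        # The host opens a door that is neither the player's pick nor the car.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            # Move to the only door that is neither picked nor opened.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials


if __name__ == "__main__":
    print("switch:", monty_hall_win_rate(True))
    print("stick: ", monty_hall_win_rate(False))
```

With enough trials, the estimate for switching should settle near 2/3 and the estimate for sticking near 1/3.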
22 Nov 2016 by Miguel Diaz Kusztrich
Using R to explore complexity of time series generated by simple process
8 Dec 2016 by rerhart585
Using SQLite, leverage create_aggregate() and SQL's BETWEEN operator to create a Normal Probability Distribution Histogram, more commonly referred to as a Bell Curve.
15 Jan 2022 by NewbieAR
I have to call R from python and show the results graphically through R. For example, if we take Apache, to look at different scores in CVSS database; CVE no, the risk scores, severity, etc. components to use - R, python, and graphical console based tool What I have tried: Getting data in R...
13 May 2018 by Christian Graus
You should perhaps spend some time learning how to use the internet. No one knows what you're responding to, because you posted this as a question
24 May 2018 by OriginalGriff
"Zen and the art of motorcycle maintenance" "Enders Game" "A brief history of time" "Code Complete" "The Pragmatic Programmer" TBH, maths, stats, algorithms, none of that will help you: it's a mindset you need to develop, and that doesn't take "knowledge" so much as "skill" - and that comes...
9 Dec 2021 by Dave Kreskowiak
You don't have a choice here. Median values, by definition, require an ordered set to find them.
9 Dec 2021 by BillWoodruff
You could use the Median method in the open-source Math.NET Numerics library: [^]. For an interesting idea for an O(n) algorithm: [^] ... Note: I have not used it.
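One well-known way to get a median in expected O(n) time without sorting the caller's array is quickselect. Whether the linked idea is this algorithm is not known from the snippet, so treat the following as a sketch of quickselect only. It is written in Python rather than C#, and the names `median_quickselect` and `_select` are illustrative.

```python
import random


def _select(data, k):
    """Return the k-th smallest element (0-based) of data; data is not modified."""
    while True:
        pivot = random.choice(data)
        lows = [x for x in data if x < pivot]
        highs = [x for x in data if x > pivot]
        n_equal = len(data) - len(lows) - len(highs)
        if k < len(lows):
            data = lows                  # answer lies among the smaller values
        elif k < len(lows) + n_equal:
            return pivot                 # answer is the pivot itself
        else:
            k -= len(lows) + n_equal     # skip everything <= pivot
            data = highs


def median_quickselect(values):
    """Median in expected O(n) time; the input sequence keeps its order."""
    data = list(values)
    n = len(data)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    if n % 2 == 1:
        return _select(data, n // 2)
    # Even length: average the two middle order statistics.
    return (_select(data, n // 2 - 1) + _select(data, n // 2)) / 2


if __name__ == "__main__":
    xs = [7.0, 1.0, 5.0, 3.0]
    print(median_quickselect(xs), xs)  # xs is left untouched
```

This version trades memory for simplicity by building new lists at each step. An in-place partition on a single copy would use less memory at the cost of more intricate code.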
14 Jan 2024 by Code Artist
Consider this implementation: Downsampling Algorithm in MSChart Extension[^], which plots large data sets without losing trend details or hurting performance.
17 Jan 2015 by Peter Leow
Make use of one of these free data mining tools[^].
20 Mar 2015 by Dinuka Jayasuriya
I want to create a program that decides which option best suits a certain query based on previous queries and the suggestions proposed for them. Basically, 'training' the system to have an idea of which suggestion best suits a question. Example: In this scenario patients input a...
20 Mar 2015 by manchanx
I think the Apriori algorithm would be the right tool for your requirement. An excerpt from an article about that[^] here on CodeProject: The whole point of the algorithm (and data mining, in general) is to extract useful information from large amounts of data. For example, the information...
27 Apr 2015 by myriame
How do I use AIC and BIC to select an HMM model, and how do I calculate the number of parameters? Please help.
12 Jun 2015 by OriginalGriff
Wikipedia to the rescue! "SimilarWeb uses data extracted from four main sources: 1) A panel of web surfers made of millions of anonymous users equipped with a portfolio of apps, browser plugins, desktop extensions and software; 2) Global and Local ISPs; 3) Web traffic directly measured from a...
18 Jun 2015 by nizam qbixx
HTK outputs a result in a form like this: "*/2.rec" 0 1100000 h -1049.205078 HELLO 1100000 1500000 e -385.533966 1500000 2700000 l -1004.266296 2700000 3500000 l -586.160156 3500000 3800000 o -281.648132 So, as far as I know, the negative value (right side) is the log...
11 Nov 2015 by Asanka Perera
Hi, in the research paper below, the columns of Table 2 are named A, NP, PP, NR and PR. A represents Accuracy, but what do NP, PP, NR and PR stand for? I assume these are precision/recall values. Are NP and PP for negatively predicted and positively predicted...
11 Nov 2015 by OriginalGriff
Ask the author(s), not us: they can give you a definitive result. They give you their email addresses at the top of the paper...
11 Nov 2015 by Abhinav S
Alternatively, check whether the author provides a forum or blog where you can post questions about the paper so that they can respond.
3 Dec 2015 by Gun Gun Febrianza
Hello, I read this article to learn about naive Bayes: Naive Bayes Classifier[^]. So far I understand the theory, but I can't understand how to get this result: p(weight | male) = 5.9881e-06
3 Dec 2015 by George Jonsson
The same formula is used for all the calculations: $p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$, where x is the sample value (length, weight or foot size), μ is the mean value, and σ² is the variance. So in numbers, for the weight of the male it will be x = 130, μ = 176.25, σ² =...
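The normal density in this answer can be written as a small function. This is a minimal sketch, with `gaussian_pdf` as an illustrative name. Because the variance in the worked numbers is cut off in the snippet, it is left as a parameter and not guessed.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Normal density: exp(-(x - mean)^2 / (2 * variance)) / sqrt(2 * pi * variance)."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    # The whole term 2 * variance is the denominator of the exponent.
    exponent = -((x - mean) ** 2) / (2.0 * variance)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * variance)


if __name__ == "__main__":
    # Standard normal at its mean: 1 / sqrt(2 * pi)
    print(gaussian_pdf(0.0, 0.0, 1.0))
```

Note the parentheses around `2.0 * variance`: typing the plain-text form `/2*σ2` literally into most languages would divide by 2 and then multiply by the variance, which gives a different result.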
24 Feb 2016 by Fred Andres
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Windows.Forms.DataVisualization.Charting;
namespace ARIMA_simulator
{
    class modelClass
    {
        private double random_normal()
        {
            ...
7 Jan 2017 by Alberto Nuti
This could be a good starting point: var df = new List();var bs = new List();var percentiles = new List[100];for(int i = 0; i ();}var line_idx = 0;foreach(var line in enumerate(in1)){ ...
28 Feb 2017 by Patrice T
It is difficult to give advice with so few details. Quote: 1. I want to know if there are risks involved in this? There is no risk with software that is written well enough. Note that it is always possible to write a program so bad, so biased, that it will need more resources as time goes on and in...
20 Jan 2018 by David_Wimbley
So you pretty much need to google tutorials on how to use Python and R... unless you've actually attempted what you are trying to accomplish and have some code to show where you are stuck... a google search for tutorials is going to be the best place for you to start. Google[^] Now once you've...
3 Mar 2018 by schlebe
I have never posted an article on CodeProject, but at my work it is now recommended that we publish some. I will do that, but the first interesting article that I want to publish is a statistical article. Can I post it on CodeProject? This is an interesting analysis of risk calculation (about 40...
3 Mar 2018 by Richard MacCutchan
If your article is useful to the programming community and helps programmers resolve problems, or teaches them some particular feature, then yes, you can publish it here. If it is purely relating to statistics then probably this is not the right place.
28 Mar 2018 by Joe Doe234
hi, I have the following log file: 2018-03-20 16:28:09,333 INFO [1] Luhn_Check.ResultForm.btnexport_Click - guy1 : generated 50 transactions. 2018-03-20 16:29:09,333 INFO [1] Luhn_Check.ResultForm.btnexport_Click - guy2 : generated 50 transactions. 2018-03-20 16:30:09,333 INFO [1]...
28 Mar 2018 by Christiaan van Bergen
Hi there, A long time ago I wrote Converting-text-files-CSV-to-DataTable. Not many of us still make use of them anymore, but every now and again I find cases where it would still be applicable. You could use the code below as a console application and use it on your log files. The regular...
24 May 2018 by Chris Guyette
What are the best non-programming books/topics (e.g. on math, statistics, etc.) that can help you become a better programmer? What I have tried: If you have any advice, I would really appreciate it. Thanks
24 May 2018 by R. Giskard Reventlov
The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition (2nd Edition): Frederick P. Brooks Jr.: 8580001065793: Amazon.com: Books[^] CJ Date - Database Primer[^]
24 May 2018 by Patrice T
Quote: What are the best non-programming books/topics (e.g. on math, statistics, etc.) that can help you become a better programmer? Answer: None and all. It is like asking which is the best cookbook for learning to pilot the space shuttle. The answer is any cookbook because none will teach...
6 Aug 2018 by Member 13939747
I have data with 2 columns: Var1 (school classes) and Var2 (names of students in the class). What command can I use to make a new matrix that shows how many times each unique name in Var2 occurs for each unique Var1? Var1 Var2 9 Sarah 9 John 12 Sarah 11 Veronica 10 ...
29 Jan 2020 by OriginalGriff
We are more than willing to help those that are stuck: but that doesn't mean that we are here to do it all for you! We can't do all the work, you are either getting paid for this, or it's part of your grades and it wouldn't be at all fair for us to do it all for you. So we need you to do the...
29 Jan 2020 by OriginalGriff
Repost: deleted. Posting the same lack of a question repeatedly wastes the time and effort of volunteers, and that's rude. And doesn't change the solution.
9 Oct 2020 by Member 14960302
About 1 out of 1,000 items made on a production line is defective. There is a test to check whether the item is defective. The test is quite accurate. In particular, we know that the probability that the test result is positive (suggesting the...
9 Oct 2020 by OriginalGriff
We are more than willing to help those that are stuck: but that doesn't mean that we are here to do it all for you! We can't do all the work, you are either getting paid for this, or it's part of your grades and it wouldn't be at all fair for us...
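The defective-item question above has the shape of a Bayes' theorem problem. The snippet cuts off before the test's accuracy figures, so the sketch below leaves them as parameters. The name `posterior_defective` is illustrative, and the 0.99 and 0.05 rates in the demo are hypothetical placeholders, not values from the question. Only the 1-in-1,000 prior comes from the post.

```python
def posterior_defective(prior, true_positive_rate, false_positive_rate):
    """P(defective | positive test) by Bayes' theorem."""
    # Total probability of a positive result over both kinds of item.
    p_positive = (true_positive_rate * prior
                  + false_positive_rate * (1.0 - prior))
    if p_positive == 0:
        raise ValueError("a positive result has zero probability")
    return true_positive_rate * prior / p_positive


if __name__ == "__main__":
    prior = 1 / 1000  # from the question: about 1 in 1,000 items is defective
    # HYPOTHETICAL rates, for illustration only; the real ones are truncated.
    print(posterior_defective(prior, true_positive_rate=0.99,
                              false_positive_rate=0.05))
```

Because defects are rare, the posterior can stay small even when the test is accurate, since false positives from the many good items outnumber true positives.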
16 Oct 2020 by Rahul Betageri
I am developing a fuel related project where one of the tasks is to show a chart representing the fuel consumption of the vehicle (based on a data that is coming from the fuel sensor device) on node js. So, I get the raw fuel data like an...
16 Oct 2020 by OriginalGriff
Have you considered a moving average[^] - I use them to even out short-term variations and show the longer term trends.
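A simple moving average of the kind suggested here can be sketched as follows. The function name `moving_average` and the window size are illustrative assumptions. The question concerns Node.js, but the same logic ports directly; Python is used here only for brevity.

```python
from collections import deque


def moving_average(samples, window):
    """Trailing simple moving average; early values average what is available."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    recent = deque()  # the last `window` samples
    total = 0.0
    smoothed = []
    for value in samples:
        recent.append(value)
        total += value
        if len(recent) > window:
            total -= recent.popleft()  # drop the sample that left the window
        smoothed.append(total / len(recent))
    return smoothed


if __name__ == "__main__":
    raw_fuel = [50.0, 49.8, 52.5, 49.5, 49.3, 47.0, 49.0]  # made-up readings
    print(moving_average(raw_fuel, window=3))
```

A larger window smooths short-term sensor noise more strongly but makes the curve react more slowly to real changes such as refuelling.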
2 Aug 2021 by nikita agarwal Jun2021
I am trying to make a 3D plot of a galaxy catalog and have a large amount of x,y,z coordinates and data value (w4) stored in separate hdf5 files. Since the data content is huge, I have tried binning them. The output is however taking forever to...
2 Aug 2021 by Richard MacCutchan
You have already posted this question at Creating bins of 3D points (large dataset) with Python, taking long time to load[^]. Please do not repost.
2 Aug 2021 by Dave Kreskowiak
Well, there are a couple of bottlenecks. First, Python is an interpreted language so it's going to be slower than compiled languages. Second, try and read 5.5GB of data and do NOTHING with it. Just read the files and throw the content you read away....
9 Dec 2021 by Admin BTA
I would like an efficient and non-invasive way of finding a median of a large array (a double array of size 300,000). What I have tried: I have tried an inefficient method: double Median(double[] xs) { Array.Sort(xs); return xs[xs.Length...
22 Oct 2022 by Member 14991075
I would like to create a semivariogram, but my code does not work. The command window shows me: Error in validObject(.Object) : invalid class “SpatialPointsDataFrame” object: invalid object for slot "data" in class "SpatialPointsDataFrame": got class...
28 Jan 2023 by TejasviniMK
I'm plotting a graph using PictureBox. To fit all the points in the control width, I'm using a normal averaging technique. I want to understand whether this is the only way to do so, or whether there is another formula. I feel like I am losing the data trend (I can...
14 Jun 2017 by Andy Allinger
Introduces data clustering and the k-means++ algorithm
23 Mar 2019 by DrABELL
Statistical Outliers detection in Microsoft Excel worksheet using Median() and array formula
15 Jan 2015 by Darko Jurić
Discrete Kalman Filter brief tutorial with samples in C#
6 Aug 2018 by OriginalGriff
See the documentation: FileSystemWatcher Class (System.IO)[^] - it includes example code.
10 May 2023 by Dave Kreskowiak
Did you Google for "kruskal wallis C#" or "anova C#"?
30 Oct 2016 by Miguel Diaz Kusztrich
Complete algorithm for correspondence analysis to add to your own statistical class library
18 Sep 2014 by Brady Kelly
I have, e.g. a chart of voltages measured during the day, so I decided to make the minimum Y axis the lowest voltage, and the maximum the highest voltage. So I get hundreds (sampled per minute) of 240V, and one or two 0V. This means the whole graph is at the very top, all compressed, with a vast...
30 Aug 2016 by Peter_in_2780
If the article you are referring to is published here on Code Project, there is a discussion forum at the end of the article. Posting your message there will contact the author directly. Usually, any code in articles posted here is subject to the CPOL: Code Project Open License[^]
3 Dec 2015 by phil.o
Unfortunately, this user was last seen almost five years ago. You may have to go back to the Wikipedia article he is talking about, and get some clue about how the values are computed. You may also study the source code of said article to see how it works. I have read the article,...
10 May 2023 by Member 14007122
How do I do a 'Kruskal Wallis Test' in .NET/C#? I already tried to use some libraries (Accord.Net/MathNet.Numerics.Statics...). Is there any free library, not Extreme.Numerics/Meta.Numerics etc...? Please let me know about a free library for Kruskal...
6 Jan 2017 by dfarr1
Hey folks - I'm definitely not a Python guy but I am a C# guy. Can I get a hand trying to figure out how I can recreate this code snippet in C#? Background: I have a tab-delimited file that holds a p-value/critical value (for chi squared analysis). There are 9 columns of data. The first...
28 Feb 2017 by kuharan
A piece of software is going to run for almost a day. I do not want my personal home PC to blast off. Basically, the application will run and collect usage statistics and then store them in a database. The application will then retrieve values from the database to perform calculations on them. Based on...
6 Aug 2018 by Joe Doe234
Hi guys, I am trying to generate statistics in C# WinForms, and the data is stored in a txt file. How can I know when new data has been written to this txt file (to update the statistics)? Is it practical to do so? Maybe there are other ways which are easier? What I have tried: I cannot...
21 Jan 2022 by Member 15500952
Hello - I found TheCyberByte article "How to retrieve CVSS Scores with Bulk CVE Lookup in NIST NVD via Python" using Beautiful Soup particularly helpful. I used an Ubuntu VM (Ubuntu 20.04.3 LTS) and a basic text editor (vi) at the command line to...
17 Jan 2015 by Member 11382724
For example, I have the International Mathematical Olympiad (IMO) results of a country for the years 2013 and 2012. I want to statistically assess whether there is a significant improvement from 2012 to 2013, given Age, Gender, and GPA. What test should I use? Also, is this study a one shot case...