Hi guys,

I'm working on a project that involves extracting values from a file in .npd format (the format itself is not my problem) and then performing some calculations.
The file contains several columns of data without any headings. I have to give each column a name and do some calculations.

The raw file looks like this:
1 $Abc 21:00:55 20 33 56 1 $Abc 21:00:56 22 34 56 
2 $Abc 22:00:34 43 30 45 2 $Abc 22:00:35 44 36 45
3 $Abc 22:23:23 19 88 67 3 $Abc 22:23:24 12 82 63
4 $Abc 23:40:29 20 20 20 4 $Abc 23:40:30 26 26 24
5 $Abc 23:55:00 34 21 28 5 $Abc 23:55:01 37 26 29  


1) I have to perform calculations on the 2nd, 3rd and 4th columns and print the results to a file with headings (which I have just managed to do). For example, add 1 to each value and give the columns headings.
2) I have to delete the values starting from the 6th column! I cannot use the tokenise method with "$" as the delimiter, because there are 2 columns starting with the symbol "$". And I want the values starting
The last 6 columns must be completely removed.
Heading1 Heading 2 Heading 3
21       34        57         
44       31        46
20       89        68
21       21        21
35       22        29


3) Then find the mean of each column:
Mean of column 1:
Mean of column 2: 
Mean of column 3: 


Also, one of the columns contains 'N', and the column to be calculated is to the left of this column.
I know how to open the file, but sorting out the data is a bit confusing. Please help; examples would be very helpful!
[Update]

I used this code to get the value from the 4th column, but I cannot do any calculations on these values. For example:
C++
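#include <atlstr.h>
#include <stdio.h>
#include <tchar.h>

int main()
{
    CAtlString str(_T("%21:00:55 $GPGGA 210055 6102.00399 six"));
    CAtlString resToken;
    int curPos = 24;   // offset at which the 4th column starts in this sample line
    double val = 0.0;  // meant to hold the numeric value of the token (not used yet)
    resToken = str.Tokenize(_T(" "), curPos);
    _tprintf_s(_T("Resulting token: %s\n"), (LPCTSTR)resToken);
    return 0;
}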

This prints the 4th column value, but I have to do some calculation on that value, for which I have to convert it from string to double and then back to string to print it. Is this feasible?

[/Update]
Updated 24-May-12 22:31pm
Comments
Maximilien 22-May-12 8:57am    
What is an npd file?
from which software does it come from?
Is it ascii/binary?
Do you need to display the data to the user? (i assume so, if you have to add headers)
What platform/SDK are you using? Windows/MFC/STL ?
Sumal.V 22-May-12 9:06am    
NPD: Novell Printer Definition (NPD) files appear in the Printer Type list when you are creating a Printer Agent using the Novell Gateway's Print Device Subsystem (PDS).(Googled it : http://www.novell.com/fr-fr/documentation/nw51/docui/#../ndps_enu/data/hxiyy2w0.html)

Yes I'm working on windows and use visual C++. Yes I have to give a name to each column and display to the user,in a text format.
scottgp 23-May-12 11:28am    
In a case like this, would it make more sense to use a multi-purpose tool like R (http://www.r-project.org/), or some reporting tool, rather than writing a single use program?
Maciej Los 23-May-12 13:09pm    
Some text was moved from comment...

First off forget all this ATL/MFC rubbish, you're making stuff far more complicated than it needs to be. Just back away, hands where I can see them - reach for a CString and you're history.

The algorithm for this is pretty simple isn't it?
- Open an output file
- Output the column headings

- Open the input file
- Until you hit the end of file:
    - Extract a string (the date) and discard it
    - Read three ints
    - add them to three running totals
    - output them to the output file
    - update a line counter

- Divide each running total by the number of lines
- Output the means to the results file

So how would you do that in C++?

Well, the input and output files are just std::ifstream and std::ofstream objects. Outputting data is just operator<< on the output file. Reading the data is just operator>> on the input file; the rest is just arithmetic.
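The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: each record starts with an index, a "$" tag and a time, followed by the three wanted ints; the "add 1" calculation comes from the question; and the names Means, meanOf and processColumns are made up for this example. It reads line by line (my choice, not part of the algorithm above) so that whatever follows the first six fields is simply never looked at.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical result type: the three column means plus the row count.
struct Means {
    double col[3];
    long rows;
};

// Mean of one running total; only valid once at least one row was read.
double meanOf(long total, long rows)
{
    assert(rows > 0);
    return static_cast<double>(total) / rows;
}

// Reads records from `in`, writes headings, adjusted values and means to `out`.
Means processColumns(std::istream& in, std::ostream& out)
{
    out << "Heading1 Heading2 Heading3\n";

    long total[3] = {0, 0, 0};
    long rows = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string index, tag, time;  // leading columns: read, then ignored
        int v[3];
        if (!(fields >> index >> tag >> time >> v[0] >> v[1] >> v[2]))
            continue;                  // skip blank or malformed lines
        // The rest of the line (the repeated record) is never read.
        for (int i = 0; i < 3; ++i) {
            v[i] += 1;                 // the "add 1" example calculation
            total[i] += v[i];
        }
        out << v[0] << ' ' << v[1] << ' ' << v[2] << '\n';
        ++rows;
    }

    Means m = {{0.0, 0.0, 0.0}, rows};
    for (int i = 0; i < 3 && rows > 0; ++i) {
        m.col[i] = meanOf(total[i], rows);
        out << "Mean of column " << (i + 1) << ": " << m.col[i] << '\n';
    }
    return m;
}
```

With real files you would pass a std::ifstream opened on the input and a std::ofstream opened on the results file; the function itself only needs the stream interfaces.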
 
 
Comments
Sumal.V 23-May-12 11:49am    
And for the output file, is it better to use a .csv file? I have to save the values in columns and it's really confusing!
Aescleal 23-May-12 13:54pm    
That's just a detail change for the "output them to the output file".
Sumal.V 25-May-12 4:33am    
Yes, I managed to open the source file, read the values and print them to the destination file, with headings. But I have to discard the last set of values, and performing calculations on this is again a little confusing.
Aescleal 25-May-12 5:18am    
If you mean you have to discard the last line of data in the file, then you can delay writing the values to the file and updating the totals and line counter by one line.

Ah, hang on, I see what the problem is from the data file format. To get rid of the data in the columns you don't want, read them into variables but don't do anything with them.
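Reading the unwanted columns into variables and then ignoring them could look something like the sketch below. It assumes every record has exactly twelve whitespace-separated fields laid out as in the sample data (index, tag, time, three ints, then the same six again); readRow is a made-up name for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one 12-field record, keeps fields 4 to 6
// in `wanted`, and throws every other field away.
// Returns false at end of input or if a field does not parse.
bool readRow(std::istream& in, int (&wanted)[3])
{
    std::string skipText;  // receives text columns we do not need
    int skipInt = 0;       // receives numeric columns we do not need

    in >> skipText >> skipText >> skipText;     // index, $ tag, time
    in >> wanted[0] >> wanted[1] >> wanted[2];  // the three values we keep
    in >> skipText >> skipText >> skipText;     // repeated index, tag, time
    in >> skipInt >> skipInt >> skipInt;        // repeated values, ignored
    return static_cast<bool>(in);
}
```

No condition on the format of a column is needed here: the columns are told apart purely by their position in the record.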
Sumal.V 25-May-12 5:30am    
Exactly. To not read the data from the file I have to set some kind of condition, and that could be discarding a column based on its format, but this row has 2 variables of the same format.

And I really don't understand what you mean by delaying ..
For each column, use a std::vector<int> to store read values and then perform all calculations you like on such vectors' data.
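As a rough sketch of that idea, the mean of one stored column can be computed with std::accumulate. The function name mean is just chosen for this example, and it assumes the vector has already been filled with at least one value.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Mean of the values collected for one column.
// The 0.0 start value makes std::accumulate sum in double, not int.
double mean(const std::vector<int>& column)
{
    assert(!column.empty());
    return std::accumulate(column.begin(), column.end(), 0.0) / column.size();
}
```

Keeping the values in vectors costs a little memory compared with running totals, but it lets you run further calculations later without reading the file again.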


[update]
Member 8446342 wrote:
1 $Abc 21:00:55 20 33 56 1 $Abc 21:00:56 22 34 56
2 $Abc 22:00:34 43 30 45 2 $Abc 22:00:35 44 36 45
3 $Abc 22:23:23 19 88 67 3 $Abc 22:23:24 12 82 63
4 $Abc 23:40:29 20 20 20 4 $Abc 23:40:30 26 26 24
5 $Abc 23:55:00 34 21 28 5 $Abc 23:55:01 37 26 29


Since that is a pretty fixed format, you may quickly extract the needed values using sscanf, e.g.:
C++
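int col[3];
CStringA s = "1 $Abc 21:00:55 20 33 56 1 $Abc 21:00:56 22 34 56";
// %*s reads a field and discards it, so no temporary buffer is needed;
// discarded fields are not counted in sscanf's return value
if ( sscanf(s, "%*s %*s %*s %d %d %d", &col[0], &col[1], &col[2]) != 3 )
{
  // handle error
}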



[/update]
 
Comments
Sumal.V 22-May-12 9:54am    
But how can I separate the columns? There are no commas, or any other delimiters between each column..
Maximilien 22-May-12 10:17am    
"space" is a delimiter.
Sumal.V 22-May-12 10:56am    
In one piece of my code, in order to ignore the comment lines, I can write:
if (line.Find(_T("%")) == 0)
return;
//where line is a CString and "%" is used for comment lines in .csv files
But now I have to find many spaces, i.e. to get to the 4th column I have to skip past 3 spaces in a row, so how can I do that?
CPallini 23-May-12 4:49am    
You may use the CString::Tokenize method to extract the values from the input line. Have a look at the documentation:
http://msdn.microsoft.com/en-us/library/k4ftfkd2.aspx
Sumal.V 23-May-12 6:33am    
Thanks. With the tokenise function I can use space as the delimiter, but there is a space after every column in the file and I have to extract one particular column, say the 5th one. How can I do that?
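One possible way to pick out a particular column, sketched here with standard C++ streams rather than CString::Tokenize, is to read tokens one at a time and stop at the one you want. The helper names nthColumn and nthColumnAsDouble are invented for this example, and columns are counted from 1.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: copies the n-th (1-based) whitespace-separated
// column of `line` into `out`. Returns false if the line is too short.
bool nthColumn(const std::string& line, int n, std::string& out)
{
    std::istringstream fields(line);
    for (int i = 0; i < n; ++i)
        if (!(fields >> out))  // each read overwrites the previous token
            return false;
    return n > 0;
}

// Hypothetical helper: the same column converted to a double.
// Returns false if the column is missing or is not numeric.
bool nthColumnAsDouble(const std::string& line, int n, double& value)
{
    std::string token;
    if (!nthColumn(line, n, token))
        return false;
    std::istringstream conv(token);
    return static_cast<bool>(conv >> value);
}
```

Once the value is a double you can calculate with it directly, and writing it to an output stream with operator<< turns it back into text, so no manual conversion back to a string is needed.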

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


