Hi all,

1) In a little project I'm working on, I read a file, extract values, process them and output the final result. But this file format is not consistent.

That is, the files come from a seabed survey and contain latitude, longitude, height and GPS values. Sometimes there are two sets of GPS values and sometimes just one.

All of these files contain the same core variables that are used for the calculations. How can I make the program handle them consistently? At the moment I have to change the swscanf call every time I read a different file format.

2) Another question: because the calculations involve a mean, I have to read the file from the first row to the last several times:
1) After the first read: calculate the total and the count.
2) In the second read: use the average value for some calculations on every line in the file and output the results.
3) In the third read: do some calculations on the values obtained in the previous step...

I read my file 3 times, plus once more at the start to get the start and end times. Is there a way to avoid this and make it simpler?

Note:
The file size varies from 320 KB to 2100 KB. The format:-

With 2 GPS
Type 1:
21:00:55.035,$GPGGA,210055.00,6102.0039902,N,00107.5920998,E,2,09,1.2,23.57,M,47.57,M,10.0,1011*4E,$GPGGA,210054,6102.00388,N,00107.62266,E,2,10,1.0,025.33,M,047.57,M,13,0777*76,$HEHDT,000.4,T*2B,0.67,
Type 2: 16:39:09 $GPGGA 163908 5838.966652 N 131.910585 E 2 9 0.8 67.4 M 0 M 8.4 1006*42 $GPGGA 163909 5838.94843 N 131.89993 E 1 10 0.8 22.77 M 46.33 M 0000*78 $HEHDT 292.3 T*25 293

With 1 GPS
Type 3:$GPGGA,140823.00,7434.7372191,N,05713.8933749,W,1,06,1.2,48.77,M,21.98,M,,*4E
Type 4:$GPGGA,154349.00,7434.7372530,N,05713.8941497,W,2,07,1.7,45.04,M,21.98,M,45.2,0081*5C,

Even the time format differs. For the first type I can output the time as it is, but for type 3 I have to reformat it before display, because I use the times for plotting graphs.
Updated 13-Aug-12 4:18am
Comments
[no name] 13-Aug-12 9:40am    
Why don't you read the file once and do all the calculations once? Reading the file 4 times seems to be 4 times too many.
Sumal.V 13-Aug-12 9:49am    
Hey sorry, I've updated my question. That is because after I read the first time, I calculate the average and in the second read I apply this average value to every line in the file n so on ...
[no name] 13-Aug-12 9:53am    
And? Read the file once then you have all the data in memory... am I missing something here?
StianSandberg 13-Aug-12 9:50am    
or 3 times too many.. :)
[no name] 13-Aug-12 9:53am    
Was on purpose... :-)

Quote:
Sometimes there are 2 GPS's values and sometimes there is just one.

Why not just read the whole file and interpret what's in it? I'm not sure why your scanning of the file has to change.
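One way to "interpret what's in it" is to split each line into fields and look for the $GPGGA tag, instead of hard-coding one swscanf format per file type. The sketch below is only an illustration in standard C++ (no MFC). The names GgaFix, SplitFields, NmeaToDegrees and ExtractFixes are made up for this example, and it assumes that a line is either comma-separated or whitespace-separated, as in the four samples in the question.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// One position fix taken from a $GPGGA sentence (hypothetical struct).
struct GgaFix {
    std::string utcTime;   // hhmmss[.ss] exactly as it appears in the file
    double latitudeDeg;    // signed decimal degrees (south is negative)
    double longitudeDeg;   // signed decimal degrees (west is negative)
    double altitude;       // antenna altitude field, in metres
};

// Split on commas when the line has any (empty fields are kept),
// otherwise split on runs of whitespace.
std::vector<std::string> SplitFields(const std::string& line)
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    if (line.find(',') != std::string::npos) {
        while (std::getline(in, field, ','))
            fields.push_back(field);
    } else {
        while (in >> field)
            fields.push_back(field);
    }
    return fields;
}

// Convert NMEA ddmm.mmmm (or dddmm.mmmm) plus a hemisphere letter to degrees.
double NmeaToDegrees(double nmea, const std::string& hemisphere)
{
    assert(nmea >= 0.0);  // precondition: the sign comes from the hemisphere letter
    const double whole = std::floor(nmea / 100.0);
    const double minutes = nmea - whole * 100.0;
    const double degrees = whole + minutes / 60.0;
    return (hemisphere == "S" || hemisphere == "W") ? -degrees : degrees;
}

// Return one GgaFix per $GPGGA sentence found in the line (none, one or two).
std::vector<GgaFix> ExtractFixes(const std::string& line)
{
    const std::vector<std::string> f = SplitFields(line);
    std::vector<GgaFix> fixes;
    for (std::size_t i = 0; i + 9 < f.size(); ++i) {
        if (f[i] != "$GPGGA")
            continue;
        try {
            GgaFix fix;
            fix.utcTime = f[i + 1];
            fix.latitudeDeg = NmeaToDegrees(std::stod(f[i + 2]), f[i + 3]);
            fix.longitudeDeg = NmeaToDegrees(std::stod(f[i + 4]), f[i + 5]);
            fix.altitude = std::stod(f[i + 9]);
            fixes.push_back(fix);
        } catch (const std::exception&) {
            // std::stod throws on an empty or non-numeric field; skip that sentence.
        }
    }
    return fixes;
}
```

With this shape the caller only asks how many fixes came back for a line (one or two GPS units), so the read loop stays the same for every file type.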
Quote:
I read my file 3 times in total. Is there a way to get rid of this and make it simpler?

Yeah, do the calculations as you go, no need for you to open the file every time.

For example, calculate average weight (just example):
0. Read entry in file
1. Calculate: TotalWeight += EntryWeight;
2. Calculate: Entries += 1;
3. Calculate: AvgWeight = TotalWeight / Entries;
4. Go back to step 0 until you get to the end of file

...you just have to update your calculations to reflect where you are in a file, no need to read it over again.
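The steps above could be wrapped in a small accumulator, roughly like this. It is a minimal sketch in standard C++; the class name RunningMean is invented for the example and is not part of the original code.

```cpp
#include <cassert>
#include <cstddef>

// Keeps a running total and count, so the mean so far is available
// at any point inside the read loop.
class RunningMean {
public:
    void Add(double value)
    {
        total_ += value;
        ++count_;
    }

    std::size_t Count() const { return count_; }

    double Mean() const
    {
        assert(count_ > 0);  // the mean of zero entries is undefined
        return total_ / static_cast<double>(count_);
    }

private:
    double total_ = 0.0;
    std::size_t count_ = 0;
};
```

Call Add() once per line as it is read. Note that any calculation that needs the final mean of the whole file still has to wait until the last line has been added.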
 
 
Comments
Sumal.V 13-Aug-12 10:27am    
This is a sample code:
//1 read
if (csf.Open(fname, CFile::modeRead))
{
while (csf.ReadString(buffer) )

{
out = swscanf_s(buffer, _T("%[^,], %lf, %lf, %[^,], %lf, %[^,], %d, %d, %lf, %lf, %[^,], %lf, %[^,], , %[^,]"),
str1, 2046, &sTime, &db1, str4, 2046, &db2,str5, 2046, &in1, &in2, &db3, &db4, str6, 2046, &db5, str7, 2046,str10,2046);

double templat = int( lat1 / 100 ) + ( (lat1 / 100 - int( lat1 / 100 )) / 0.6 ); // latitude in degrees
double templon = int( lon1 / 100 ) + ( (lon1 / 100 - int( lon1 / 100 )) / 0.6 ); // longitude in degrees
//lat/lon calculations
sumLat = sumLat + templat;
count++; // int count = 0;
}
else
{
//error report
}

}
csf.Close();
}

meanLat = sumLat / count;

//Second Read
if (csf.Open(fname, CFile::modeRead))
{
while (csf.ReadString(buffer) )

{
out = swscanf_s(buffer, _T("%[^,], %lf, %lf, %[^,], %lf, %[^,], %d, %d, %lf, %lf, %[^,], %lf, %[^,], , %[^,]"),
str1, 2046, &sTime, &db1, str4, 2046, &db2,str5, 2046, &in1, &in2, &db3, &db4, str6, 2046, &db5, str7, 2046,str10,2046);
lattemp = db1;
lat = int( lattemp / 100 ) + ( (lattemp / 100 - int( lattemp / 100 )) / 0.6 );
latRad = lat * (ML::pi)/180;
deltaLat = latRad - meanLat ; //MeanLat is used for every single line.
.............
Albert Holguin 13-Aug-12 10:45am    
Why don't you just read the values into a list of some sort (or a vector)? Then you can do your delta calculations after the fact without reading from disk again... Alternatively, you can adjust your previously calculated deltas based on your changing mean (as you calculate it).
Sumal.V 14-Aug-12 8:28am    
Yeah I'm trying to work with lists as explained in the solution 2. Thanks :)
Create a class that can contain all of the data for one line.

C++
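class CSurveyLine
{
  public:
    CTime DateRecorded;
    CString DataType;
    // etc... one member per field in a survey line

  public:
    // One overload per input variant. The argument lists must differ,
    // otherwise give the methods different names.
    virtual bool SetFieldData(LPCSTR szDateRecorded, LPCSTR szDataType, etc...);
    virtual bool SetFieldData(LPCSTR szDateRecorded, LPCSTR szDataType, etc...);

  protected:
    // Field-level parsers that every SetFieldData variant can share.
    virtual bool ParseDateField(CTime & MemberDate, LPCSTR szDate);
    virtual bool SetDateRecorded(LPCSTR szDateRecorded) { return ParseDateField(DateRecorded, szDateRecorded); }
};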


Since you have different input variants, you can create different SetFieldData() methods. If they have different arguments then you can overload the name, otherwise you may need to give them different names to differentiate them.
You may find it convenient to create methods to set the field data too, if you have to parse different field formats.
And one step further, you could make that conversion generic...
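As an illustration of a field parser that copes with more than one format, here is a rough sketch for the time field. It accepts both the "21:00:55.035" layout and the "140823.00" layout from the question and returns seconds since midnight, which is convenient for plotting. The function name ParseTimeOfDay and the -1.0 error value are assumptions made for this example, and it uses plain std::string rather than CTime so that it compiles without MFC.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Returns seconds since midnight, or -1.0 if the text matches neither layout.
double ParseTimeOfDay(const std::string& text)
{
    int hours = 0;
    int minutes = 0;
    double seconds = 0.0;

    // "hh:mm:ss[.sss]" when a colon is present, otherwise "hhmmss[.ss]".
    const char* format =
        (text.find(':') != std::string::npos) ? "%d:%d:%lf" : "%2d%2d%lf";

    if (std::sscanf(text.c_str(), format, &hours, &minutes, &seconds) != 3)
        return -1.0;

    if (hours < 0 || hours > 23 || minutes < 0 || minutes > 59 ||
        seconds < 0.0 || seconds >= 60.0)
        return -1.0;

    return hours * 3600.0 + minutes * 60.0 + seconds;
}
```

A SetFieldData variant can then call the same parser whichever file type it is reading.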

Create a vector to hold the data.

C++
std::vector<CSurveyLine> SurveyTable;


Use push_back to add a new entry.
C++
{
   CSurveyLine SurveyLine;
   SurveyLine.SetFieldData(...);
   SurveyTable.push_back(SurveyLine);
}


Then walk through the table to calculate your stats,

C++
for (size_t Index = 0; Index < SurveyTable.size(); Index++)
{
   CSurveyLine & SurveyLine = SurveyTable[Index];

   /* Do Stuff to SurveyLine */
}
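For the mean and delta calculation from the question, the walk over the table could look roughly like the sketch below. SurveyRow is a cut-down, hypothetical stand-in for CSurveyLine so that the example compiles without MFC, and both the mean and the per-row latitude are kept in degrees so that the subtraction uses matching units.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CSurveyLine with only the fields used here.
struct SurveyRow {
    double latitudeDeg;  // latitude in decimal degrees
    double deltaLat;     // latitude minus the mean latitude of the whole table
};

// First walk: the mean needs every row, so it is computed from memory, not disk.
double MeanLatitude(const std::vector<SurveyRow>& table)
{
    assert(!table.empty());
    double sum = 0.0;
    for (const SurveyRow& row : table)
        sum += row.latitudeDeg;
    return sum / static_cast<double>(table.size());
}

// Second walk: apply the mean to every row.
void ApplyDeltas(std::vector<SurveyRow>& table)
{
    if (table.empty())
        return;
    const double mean = MeanLatitude(table);
    for (SurveyRow& row : table)
        row.deltaLat = row.latitudeDeg - mean;
}
```

The file is read from disk once to fill the table. Every later pass is a loop over the vector, which is cheap for files of a few megabytes.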
 
Comments
Eugen Podsypalnikov 14-Aug-12 3:50am    
Nice, without some keys "virtual" as well :)
Sumal.V 14-Aug-12 4:32am    
Thanks so much for the wonderful example. It looks like a complicated exam question with a lot of clues. Will give this a try :)
JackDingler 14-Aug-12 11:57am    
This is stuff I do everyday...

It'll become easier for you with practice.
JackDingler 14-Aug-12 11:53am    
Converting to a map wouldn't be too difficult. You could key off longitude and latitude, or any other criteria you like...
JackDingler 14-Aug-12 11:58am    
Oh, perhaps you're pointing out the lack of a constructor and virtual destructor?

Gotta leave something as an exercise for the reader. :)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


