Click here to Skip to main content
15,885,998 members
Please Sign up or sign in to vote.
2.60/5 (2 votes)
See more:
I want to sort a text file with a few million of these algorithms to use.(quick sort)

My programming language is C#.

Its structure txt file is as follows:

for instance          desired Result
------------          ---------------      
  723,80                 1,4   
  14,50                  1,5 
  723,2                  10,8
  1,5                   14,50 
  10,8                  723,2 
  1,4                   723,80    



At the same time, and memory is very important to me.

This algorithm is suitable for the job?

If appropriate, please give an explanation of this algorithm. And give an example

Thank you
Posted
Updated 27-Jan-14 7:43am
v4
Comments
Peter Leow 22-Dec-13 10:53am    
This seems to be a re-post of a similar question from yesterday.
Sergey Alexandrovich Kryukov 22-Dec-13 12:26pm    
If this is some real application, not just the experiment, you are already abusing technology by having such a big file, not something more suitable for working with bigger volumes of data.
—SA

If memeory is the critical factor, then look at External sorting algorithms[^] - but be aware - they will be slower than memory intensive ones.
 
Share this answer
 
Comments
log988 22-Dec-13 10:31am    
It takes a few seconds to sort the way you say?
Time is also important to me
OriginalGriff 22-Dec-13 11:21am    
You normally have a trade off between time and memory: the more you use of one, the less you use of the other...it's a bit like the postal service. You can have "fast" or you can have "cheap", but not the pair!
Hi Friend,

Can you do some kind of merge sort with some way where, you are going to do this :

1. Create two buffer variables : size depends on you.

2. Create a file handle to the input file. You may need to do some heavy temporary data file generation on the disk. So, it's actually lightweight on RAM (or physical memory) of the system.

3. Load first block in buffer one and create a new thread and sort this chunk in buffer one, meanwhile, load next chunk in next buffer and when the thread completes it's action, it will store this data on an external temporary file (you can number and name the files uniquely.

4. Then use the buffer two to do the same process on this chunk also and while doing so, load next chunk in buffer one. This way you have created a double buffering system.

5. Now, once all the break and sort individual part is done, start to do next process, this is called merging part and you have to start reading in from temporary files you created, resort them if needed and merge them in a new file. This way, ultimately, the complete file is sorted and merged again.

6. Now, the only last thing you need to do is, overwrite the previous file by this new file, do some garbage collection by cleaning up all the temporary files etc.

7. Finally, close the file handle, kill the thread and give the user a message that sorting is completed successfully.

8. At the moment, I can say that this is the only logical method I can see so far, though I have to think more to come up with, if possible, something better.

Hope that this helped. Though you should also have a look at the link posted by OriginalGriff in Solution 1. Maybe that link contain some better way to do this.

With Regards
Tushar Srivastava
 
Share this answer
 
5 million strings, each less than 10 characters long (from your sample) is not such a big deal for main memory (50 MB).

I suggest that you load the whole file in a large buffer, as contiguous null terminated strings, together with an array of indexes to the start of every string (this will consume 4 additional bytes per row). You will use quicksort, sorting the array of indexes (and not moving the string themselves). Then output the sorted file.

It seems that your file contains numerical values, 1 per row. If this is always the case, you can consider loading the values to an array of floating-points (single or double precision, i.e. just 4 or 8 bytes per row), and sort the array in-place. Take care of the format conversions when inputting and outputting the values.
 
Share this answer
 
v2

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900