Hi,

I have a directory that contains 26 folders, and each of those folders contains 26 subfolders. Each subfolder holds 50 text files. I need to open each text file and read one section of it to create a data structure.

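So the overall picture is:

C#
// Needs "using System.IO;". rootPath and the "*.txt" pattern are placeholders.
foreach (string folder in Directory.EnumerateDirectories(rootPath))           // 26 folders
{
  foreach (string subFolder in Directory.EnumerateDirectories(folder))        // 26 subfolders each
  {
    foreach (string textFile in Directory.EnumerateFiles(subFolder, "*.txt")) // 50 text files each
    {
      // Open the text file and read the section to create the data structure
    }
  }
}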


At the moment, this takes about 13 to 14 minutes. I tried doing the same with multithreading, but it still takes almost the same time :(

How can I do the same thing in much less time?

Any help would be appreciated.

Thanks,
Updated 17-Mar-12 15:46pm
Comments
Sergey Alexandrovich Kryukov 17-Mar-12 22:52pm    
Why?
--SA

The best thing to do to increase your performance in this case is rethink your design.
 
Comments
Sergey Alexandrovich Kryukov 17-Mar-12 22:51pm    
Agree; 33800 files in 676 directories sounds quite sloppy. My 5.
--SA
[no name] 17-Mar-12 22:54pm    
Thanks. If he had provided more information, we might have been able to suggest alternatives. But yes, this will probably give me nightmares.
Sergey Alexandrovich Kryukov 18-Mar-12 14:51pm    
Agree.
Cheers,
--SA
ProEnggSoft 18-Mar-12 1:47am    
Good advice. 5!

We can't really answer this because you haven't shown how you're reading the files or how you're processing the data. Those two things are your best opportunities for optimization.
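
Before changing anything, it may help to measure where the time actually goes. The following is only a minimal sketch, not the poster's code: `ReadTimer`, `Measure`, the `process` callback and the `*.txt` pattern are all assumed names standing in for whatever the real reading and section-processing code looks like. It times the disk reads and the processing step separately.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class ReadTimer
{
    // Walks every *.txt file under root and accumulates, separately, the time
    // spent reading from disk and the time spent in the processing callback.
    // "process" is a hypothetical stand-in for the section-processing code.
    public static (TimeSpan Read, TimeSpan Process) Measure(string root, Action<string> process)
    {
        var readTime = new Stopwatch();
        var processTime = new Stopwatch();

        foreach (string path in Directory.EnumerateFiles(root, "*.txt", SearchOption.AllDirectories))
        {
            readTime.Start();                      // Start() resumes; elapsed time accumulates
            string text = File.ReadAllText(path);  // disk I/O
            readTime.Stop();

            processTime.Start();
            process(text);                         // CPU work on the file content
            processTime.Stop();
        }

        return (readTime.Elapsed, processTime.Elapsed);
    }
}
```

If the read time dominates, the job is I/O bound. If the processing time dominates, the processing code is the place to look. Keep in mind that the operating system caches recently read files, so a second run over the same directory can look much faster than the first.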

Multithreading won't help you very much here because:

1) I/O operations (disk access) are slow, and the disk can physically read OR write only a single file at a time (a gross generalization here!). Disk I/O is probably also the slowest portion of your process.

2) Since the disk I/O is slower than your object creation (it SHOULD be, anyway!), all of your threads are stalled waiting on disk operations. If your code were more calculation dependent (referred to as 'compute bound') instead of waiting for the disk (referred to as 'I/O bound'), threading would improve performance, but usually only up to the point where the number of threads equals the number of cores in your CPU.
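
If the processing does turn out to be compute bound, the "no more threads than cores" rule from point 2 can be sketched as follows. This is an illustration under assumptions: `ParallelLoader`, `Load` and the `parse` callback are made-up names for the poster's unseen code, and the `*.txt` pattern is a guess.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

static class ParallelLoader
{
    // Reads and parses files in parallel, capped at one worker per CPU core.
    // "parse" is a hypothetical stand-in for the section-reading code.
    public static ConcurrentBag<T> Load<T>(string root, Func<string, T> parse)
    {
        var results = new ConcurrentBag<T>();   // thread-safe, unordered collection
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(
            Directory.EnumerateFiles(root, "*.txt", SearchOption.AllDirectories),
            options,
            path => results.Add(parse(File.ReadAllText(path))));

        return results;
    }
}
```

For an I/O-bound job this will probably not change much, for the reasons given in point 1. The results come back in no particular order, so the data structure should not depend on file order.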

What can you do, then? Well, since you're reading over 33,000 files, there is not much you can do to speed this up unless your file-reading code and section-processing code are badly written to start with. Since we have no idea what that code looks like, there isn't anything anyone can tell you to help you out. That is, unless you modify your post to include the relevant code snippets.

...but don't expect miracles. You're not going to get massive performance increases that will reduce the load time below, say, a few minutes. You're just processing way too many files...
 
Comments
ProEnggSoft 18-Mar-12 1:46am    
Nice explanation. 5!

How are you reading the text files?
I think that if you use the File.ReadAllText method
http://msdn.microsoft.com/en-us/library/ms143368.aspx
it may improve the performance.
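
For illustration, here is a rough sketch of that call next to a line-by-line alternative that stops once the wanted section has been read. The question does not say how a section is delimited, so the marker lines and the names `SectionReader`, `ReadWhole` and `ReadSection` are pure assumptions.

```csharp
using System.Collections.Generic;
using System.IO;

static class SectionReader
{
    // Whole-file read as suggested above: one call, entire content in memory.
    public static string ReadWhole(string path)
    {
        return File.ReadAllText(path);
    }

    // Lazy read: collects the lines between two marker lines and stops at the
    // end marker, so the rest of the file is never read.
    // The marker-line format is an assumption, not something from the question.
    public static List<string> ReadSection(string path, string startMarker, string endMarker)
    {
        var section = new List<string>();
        bool inside = false;

        foreach (string line in File.ReadLines(path))
        {
            if (!inside)
            {
                inside = line == startMarker;   // skip everything up to and including the start marker
                continue;
            }
            if (line == endMarker) break;       // done, stop enumerating the file
            section.Add(line);
        }

        return section;
    }
}
```

Which of the two is faster depends on the file sizes and on where the section sits in each file, so it is worth timing both on the real data.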
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


