Hey guys, I'm just wondering if you could help me and possibly give me some suggestions on how to achieve this. OK, we have a file transfer solution which moves files from A to B and then to C. Files take a long time to transfer from A to B because there are over 300,000 files on the remote server, so each request for files has to read the entire file system and return the newest files. What I want to do (if possible) is create a service which will grab, say, 30 files at a time from A and store them in memory, and then have two or three separate services read the files from the cache and transfer them to C. Any help is very much appreciated.

Thanks in Advance.

1 solution

It seems to me that what you really need is a faster way to get the list of files on the file system, rather than keeping 30 files or so in memory.

In fact, a better solution given what you've written here is to keep the PATHS of those 30 critical files in memory, or in a log file - and have your remote clients that need those 30 files pull their locations from the log.

However, you didn't give any code or information about how this system works, so I don't think people here are going to be able to offer much help.

Good luck,

-Pete
 
 
Comments
frostcox 5-Feb-14 11:39am    
Hey, thank you for your answer. There is just far too much code to post here, and I don't think it would help even if I did post it. We basically transfer 100,000+ files a day. Currently we do as you suggest and have each individual job pull the files from the remote server, but therein lies the problem. Say we have 10+ instances of the service running, and each one is reading the remote file system, which has 300,000+ files. We have no control over the number of files on the server and cannot delete them! We are just required to take the most recent. I'm thinking of having one service constantly poll the remote file system, so that each instance works off that pool of files instead of having to query the system every time.
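For illustration only, the "one poller, many workers" idea might look roughly like the Python sketch below. It assumes threads in a single process stand in for the separate service instances, and that `list_new_files` and `transfer` are hypothetical callables supplied by the caller (the real listing and transfer code is not shown in this thread). It makes one pass over the listing rather than polling forever, and the queue size of 30 simply echoes the batch size mentioned in the question.

```python
import queue
import threading


def poller(list_new_files, work_queue, worker_count):
    """Single producer: the only component that asks for the remote listing."""
    for path in list_new_files():
        work_queue.put(path)  # blocks while the pool is full
    for _ in range(worker_count):
        work_queue.put(None)  # one stop marker per worker


def worker(work_queue, transfer, done):
    """Consumer: takes paths from the shared pool, never lists the remote side."""
    while True:
        path = work_queue.get()
        if path is None:
            break
        transfer(path)
        done.append(path)


def run(list_new_files, transfer, worker_count=3):
    """Run one polling pass and return the paths that were transferred."""
    work_queue = queue.Queue(maxsize=30)  # bounded pool of pending paths
    done = []
    threads = [
        threading.Thread(target=worker, args=(work_queue, transfer, done))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    poller(list_new_files, work_queue, worker_count)
    for thread in threads:
        thread.join()
    return done


if __name__ == "__main__":
    fake_listing = ["file%d.dat" % i for i in range(10)]  # stand-in data
    moved = run(lambda: fake_listing, lambda path: None)
    print(len(moved), "files handed to workers")
```

The point of the design is that the expensive listing happens in one place only; the workers just pull paths from the bounded queue.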
Dave Kreskowiak 5-Feb-14 13:36pm    
It sounds like caching 30 files isn't going to do you any good at all. Frankly, I have no idea what you meant by that.

It sounds as though you need a service that scans the remote file system, sorts by last modified date, and takes the files whose last modified date is later than the date/time of the previous scan. Keep track of that scan time.
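A minimal Python sketch of that kind of incremental scan, assuming the remote file system is reachable as an ordinary directory tree (for example a mounted share). If it is only reachable over FTP or similar, the listing call would differ. The function name `scan_new_files` and its parameters are made up for this example.

```python
import os
import tempfile


def scan_new_files(root, last_scan_time):
    """Return (path, mtime) pairs modified after last_scan_time, oldest first."""
    found = []
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)  # descend into subfolders
                elif entry.is_file(follow_symlinks=False):
                    mtime = entry.stat().st_mtime
                    if mtime > last_scan_time:
                        found.append((entry.path, mtime))
    found.sort(key=lambda item: item[1])  # sort by last modified time
    return found


if __name__ == "__main__":
    # Demo on a throwaway folder with two files given fixed timestamps.
    with tempfile.TemporaryDirectory() as folder:
        for name, stamp in (("old.txt", 1000), ("new.txt", 2000)):
            path = os.path.join(folder, name)
            open(path, "w").close()
            os.utime(path, (stamp, stamp))
        for path, mtime in scan_new_files(folder, 1500):
            print(os.path.basename(path), mtime)
```

Note that this still walks every directory entry on each scan; it only reduces what gets handed on for transfer, not the cost of the listing itself.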

This service can then feed other services that ask it for filenames to process. Though, I question the validity of having multiple services trying to do file transfers simultaneously as the remote server only has X amount of bandwidth. Having multiple services hitting it will only saturate the pipe, not speed up transfers.

The file transfer service will have to "check out" a filename to process, do its thing, and then "check back in" that filename as "processed". The file scanning service can then tag that file as processed. The scanning service will also have to keep track of which filenames have been processed and which are still pending. The datetime of the last scan should never go beyond that of the file that was last successfully processed.
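As a rough sketch only, the check-out / check-in bookkeeping could be kept in a small in-memory tracker like the Python class below. The class name `FileTracker` and its methods are hypothetical, the state is not persisted anywhere, and `safe_watermark` is just one possible reading of the rule that the scan time must not move past unfinished work.

```python
import threading


class FileTracker:
    """Tracks which files are pending, checked out, or processed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}      # path -> mtime, waiting for a worker
        self._checked_out = {}  # path -> mtime, being transferred right now
        self._processed = {}    # path -> mtime of the version already sent

    def add(self, path, mtime):
        """Register a file found by a scan, skipping versions already handled."""
        with self._lock:
            if path in self._checked_out:
                return
            if self._processed.get(path, float("-inf")) >= mtime:
                return
            self._pending[path] = mtime

    def check_out(self):
        """Hand the oldest pending file to a worker, or None if none is waiting."""
        with self._lock:
            if not self._pending:
                return None
            path = min(self._pending, key=self._pending.get)
            self._checked_out[path] = self._pending.pop(path)
            return path

    def check_in(self, path, success=True):
        """Mark a checked-out file as processed, or put it back on failure."""
        with self._lock:
            mtime = self._checked_out.pop(path)
            if success:
                self._processed[path] = mtime
            else:
                self._pending[path] = mtime

    def safe_watermark(self, scan_time):
        """Time to rescan from (inclusive): never later than unfinished work."""
        with self._lock:
            unfinished = list(self._pending.values())
            unfinished += list(self._checked_out.values())
            return min(unfinished + [scan_time])
```

A worker would call `check_out()`, transfer the file, then call `check_in(path)` or `check_in(path, success=False)`. A real service would also need to persist this state so that a restart does not lose track of checked-out files.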
pdoxtader 5-Feb-14 13:42pm    
"It sounds like caching 30 files isn't going to do you any good at all. Frankly, I have no idea what you meant by that."

Dave, I took it to mean that he's got some code that searches the whole file system (or a large part of it) every time a remote machine needs a file. That's why he and I were talking about caching or log files with the names of the 30 (or so) most downloaded files.

-Pete
Dave Kreskowiak 5-Feb-14 16:42pm    
That's what I originally thought, but then he said the thing would be transferring 100,000 files a day. Kind of makes a 30 file cache a bit inadequate.
pdoxtader 5-Feb-14 13:47pm    
If you can install a service on the machine containing the files, you can just set up a FileSystemWatcher, something like this:

http://www.codeproject.com/Articles/3192/Watching-Folder-Activity-in-VB-NET

and then have your remote machines connect to that to get the paths of the files they need to transfer.

-Pete

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


