How to transfer many files (~10^7) quickly between drives

Jsplinter

Distinguished
Oct 20, 2010
I am copying many files (approx. 10^8 files, 150 GB) from one internal HDD to another. Both drives are 7200 RPM, but I'm getting very slow transfer rates (under 1.5 MB/s). I did not set the drives up as a RAID; I just plugged them in. The receiving disk will only have a couple hundred GB free after the transfer, so I guess fragmentation could account for some of the slowdown. CPU usage is between 10% and 15%, and RAM usage is around 45%.

I found this thread, which mentioned that changes to the file system could be causing the delay.

It also mentions copying the files at the block level would be faster. Is that something practical to do? As it is now, it is going to take an entire day to copy these files, and copying large amounts of data is something I will need to do many times in the upcoming months.

Thanks for any advice.
 

price_th

Distinguished
Jan 29, 2012
You could Ghost clone the drive. That is only an option if there is nothing else on the destination drive. I've found that copying large numbers of files is best done in batches of folders, because the directory structure is so extensive.
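For what it's worth, here is a rough sketch of the "batches of folders" idea in Python. It is only an illustration: the function name `copy_in_folder_batches` and the choice of one batch per top-level folder are my own assumptions, not something from a specific tool. It copies one top-level entry at a time, so an interrupted run can be resumed folder by folder.

```python
import shutil
from pathlib import Path


def copy_in_folder_batches(src_root, dst_root):
    """Copy each top-level entry of src_root into dst_root, one at a time.

    Hypothetical helper: treats every top-level folder as one batch.
    Returns the names of the entries that were copied, in order.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(src_root.iterdir()):
        target = dst_root / entry.name
        if entry.is_dir():
            # dirs_exist_ok (Python 3.8+) lets a partially copied batch be rerun
            shutil.copytree(entry, target, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, target)  # loose files at the top level
        copied.append(entry.name)
    return copied
```

Batching like this does not remove the per-file overhead; it mainly makes a long copy easier to track and restart.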
 

j2j663

Distinguished
Apr 29, 2011
Your problem is the number of files you are trying to transfer. You would be right to expect faster rates if you were transferring, say, a few HD movies. It all comes down to how data is read from a mechanical HDD. Large HD movies are stored sequentially, in that each block on the disk generally comes right after the one before it.

You, however, are transferring much smaller files. Individual files are not guaranteed to be stored one after another, and most of the time they aren't. Your guess about fragmentation is partially correct, but even defragmenting your HDD isn't going to help. You still have no guarantee that the files will be read sequentially.

After every file is read, your HDD has to wait for the head to reposition before it can read the next file. So what you are running into is the worst-case scenario for HDDs.
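To put rough numbers on that, here is a back-of-the-envelope estimate in Python. Everything in it is an assumption for illustration: about 10 ms per head reposition, about 100 MB/s sequential speed, and an average file size of 15 KB (150 GB spread over the ~10^7 files from the thread title). The model simply charges each file one reposition plus its own transfer time.

```python
def effective_rate_mb_s(file_size_kb, seek_ms, sequential_mb_s):
    """Estimated throughput when each file costs one head reposition
    plus the time to transfer its data at the sequential rate."""
    size_mb = file_size_kb / 1000.0
    seconds_per_file = seek_ms / 1000.0 + size_mb / sequential_mb_s
    return size_mb / seconds_per_file


# Assumed figures, not measurements: 10 ms reposition, 100 MB/s sequential.
small_files = effective_rate_mb_s(file_size_kb=15, seek_ms=10, sequential_mb_s=100)
one_big_file = effective_rate_mb_s(file_size_kb=1_000_000, seek_ms=10, sequential_mb_s=100)

print(f"15 KB files: {small_files:.2f} MB/s")   # roughly 1.48 MB/s
print(f"1 GB file:   {one_big_file:.1f} MB/s")  # roughly 99.9 MB/s
```

Under those assumptions the small-file case lands close to the "under 1.5 MB/s" reported in the question, which is consistent with the drive spending nearly all of its time repositioning the head rather than transferring data.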

Cloning the disk would be a faster way of doing it because there will be much less head movement and much more sequential reading.
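As a minimal sketch of what "block level" means here: the source is read as one long stream of fixed-size blocks, with no regard for where individual files begin or end. The function name `copy_blocks` and the 4 MiB block size are my own assumptions, and the example works on ordinary files such as a disk image. Real cloning tools do considerably more, and pointing something like this at a raw device needs administrator rights and overwrites everything on the destination.

```python
def copy_blocks(src_path, dst_path, block_size=4 * 1024 * 1024):
    """Copy src_path to dst_path sequentially in fixed-size blocks.

    Hypothetical helper for illustration; returns the number of bytes copied.
    """
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)  # one large sequential read
            if not block:  # empty read means end of source
                break
            dst.write(block)
            total += len(block)
    return total
```

Because the reads walk the source from start to finish, the head rarely has to jump around, which is where the speedup over a file-by-file copy comes from.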
 

Paperdoc

Polypheme
Ambassador
Yes, cloning might be faster. But by doing a file-by-file copy (which does take time!) you are effectively defragmenting the destination HDD, because the files will tend to be written to contiguous blocks of space. The result may be better performance in the long run.
 
Solution