copying millions of small files and millions of dirs

Charles Swiger cswiger at mac.com
Thu Aug 15 19:26:26 UTC 2013


On Aug 15, 2013, at 11:13 AM, aurfalien <aurfalien at gmail.com> wrote:
> Is there a faster way to copy files over NFS?

Probably.

> Currently breaking up a simple rsync over 7 or so scripts which copies 22 dirs having ~500,000 dirs or files each.

There's a maximum useful concurrency which depends on how many disk spindles and what flavor of RAID is in use; exceeding it will result in thrashing the disks and heavily reducing throughput due to competing I/O requests.  Try measuring aggregate performance when running fewer rsyncs at once and see whether it improves.
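For what it's worth, rather than hand-splitting the job into 7 scripts, you could drive the whole thing from one small script and vary the worker count between runs to find the sweet spot.  A rough sketch, with the source list and destination as placeholders for whatever your layout actually is:

#!/usr/bin/env python
# Run the per-directory rsyncs through a pool of N workers and report the
# wall-clock time, so different concurrency levels can be compared directly.
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor

SRC_DIRS = ["/export/data/dir%02d" % i for i in range(22)]   # placeholder paths
DEST = "nfsserver:/backup/data/"                             # placeholder target

def copy_one(src):
    # -a preserves permissions/times; each rsync handles one top-level dir.
    return subprocess.call(["rsync", "-a", src, DEST])

def run(workers):
    start = time.time()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        failures = sum(1 for rc in pool.map(copy_one, SRC_DIRS) if rc != 0)
    print("workers=%d elapsed=%.1fs failures=%d"
          % (workers, time.time() - start, failures))

if __name__ == "__main__":
    run(int(sys.argv[1]) if len(sys.argv) > 1 else 2)

Threads are fine here because all the real work happens in the child rsync processes; run it with 1, 2, 4, ... workers and watch where the elapsed time stops improving (or starts getting worse).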

Of course, putting half a million files into a single directory level is also a bad idea, even with dirhash support.  You'd do better to break them up into subdirs containing fewer than ~10K files apiece.
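If restructuring the tree is an option, a hash-bucket fan-out is the usual trick.  A minimal sketch, assuming source and destination live on the same filesystem so rename() is cheap; the paths and the two-character bucket width are just for illustration:

#!/usr/bin/env python
# Fan a flat directory out into hashed subdirectories so that no single
# directory ends up holding more than a few thousand entries.
import hashlib
import os

FLAT_DIR = "/export/data/flat"       # placeholder: the overfull directory
FANOUT_DIR = "/export/data/fanout"   # placeholder: the new layout

def bucket_for(name, width=2):
    # Two hex characters = 256 buckets; ~500,000 files / 256 is ~2,000 per dir.
    return hashlib.md5(name.encode("utf-8")).hexdigest()[:width]

for name in os.listdir(FLAT_DIR):
    sub = os.path.join(FANOUT_DIR, bucket_for(name))
    if not os.path.isdir(sub):
        os.makedirs(sub)
    os.rename(os.path.join(FLAT_DIR, name), os.path.join(sub, name))

Whatever reads the files afterwards has to apply the same bucket_for() mapping to find them again, of course.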

> Obviously reading all the meta data is a PITA.

Yes.

> Doin 10Gb/jumbos but in this case it don't make much of a hoot of a diff.

Yeah, probably not-- you're almost certainly I/O bound, not network bound.
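Quick back-of-envelope on why: with this many small files, per-file overhead (seeks, synchronous create/setattr round trips) swamps the time spent actually moving bytes, so a fatter pipe barely registers.  Every figure below except the file count from your mail is a guess:

# Illustrative arithmetic only -- all per-file and size figures are assumed.
total_files = 22 * 500000        # directory/file count from the original post
avg_file_bytes = 50 * 1024       # assumed ~50 KB average file size
per_file_overhead = 0.003        # assumed ~3 ms of seeks + synchronous NFS ops

wire_hours = total_files * avg_file_bytes / 1.0e9 / 3600    # ~1 GB/s usable 10GbE
overhead_hours = total_files * per_file_overhead / 3600

print("moving the bytes at wire speed:    %.1f hours" % wire_hours)
print("per-file overhead, single stream:  %.1f hours" % overhead_hours)

Which is why the concurrency question above matters a lot more than the MTU does.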

Regards,
-- 
-Chuck


