DFLTPHYS vs MAXPHYS
Matthew Dillon
dillon at apollo.backplane.com
Tue Jul 7 22:15:37 UTC 2009
:I will disagree with most of this
:- the amount of read-ahead/clustering is not very important. fs's already
: depend on the drive doing significant buffering, so that when the fs gets
: things and seeks around a lot, not all the seeks are physical. Locality
: is much more important.
Yes, I agree with you there to a point, but drive cache performance
tails off very quickly if things are not exactly sequential in each
zone being read, and it is fairly difficult to achieve exact
sequentiality in the filesystem layout.  Also, command latency really
starts to interfere if you have to go to the drive every few name
lookups / stats / whatever, since those operations take only a few
microseconds when the data is sitting in the buffer cache, while a
trip to the drive costs far more even if it's just hitting the HD's
on-drive cache.
The cluster code fixes both the command latency issue and the problem
of slight non-sequentialities in the access pattern (in each zone being
seek-read).  Without it, performance numbers wind up being all over
the map.  That makes it fairly important.
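
To make the effect concrete, here is a small userland illustration
(my own sketch, not the kernel cluster code; /dev/da0 and the sizes
are just placeholders).  It pulls in the same 64K first as eight
separate 8K commands and then as one clustered command; the second
form is what the clustering buys you.

#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    char buf[65536];
    int fd = open("/dev/da0", O_RDONLY);

    if (fd < 0) {
        perror("open");
        return 1;
    }

    /*
     * Unclustered: eight back-to-back 8K commands, each paying the
     * full per-command latency.
     */
    for (int i = 0; i < 8; ++i)
        pread(fd, buf + i * 8192, 8192, (off_t)i * 8192);

    /* Clustered: the same 64K issued as a single command. */
    pread(fd, buf, sizeof(buf), 0);

    close(fd);
    return 0;
}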
I have a nifty program to test that:
fetch http://apollo.backplane.com/DFlyMisc/zoneread.c
cc ...
(^C to stop test, use iostat to see the results)
./zr /dev/da0 16 16 1024 1
./zr /dev/da0 16 16 1024 2
./zr /dev/da0 16 16 1024 3
./zr /dev/da0 16 16 1024 4
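
For reference, here is a rough sketch of what such a zone-read test
looks like.  This is not the actual zoneread.c (fetch that from the
URL above); the argument meanings below (device, number of zones,
block size in KB, zone size in MB, and a mode where 1 = forward
sequential and 2 = reverse within each zone) are my assumptions for
illustration only.

#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    if (argc != 6) {
        fprintf(stderr, "usage: %s device nzones blkkb zonemb mode\n",
                argv[0]);
        return 1;
    }
    const char *dev = argv[1];
    int nzones = atoi(argv[2]);
    size_t blk = (size_t)atoi(argv[3]) * 1024;
    off_t zone = (off_t)atoi(argv[4]) * 1024 * 1024;
    int mode = atoi(argv[5]);

    int fd = open(dev, O_RDONLY);
    if (fd < 0) {
        perror(dev);
        return 1;
    }
    char *buf = malloc(blk);
    off_t *pos = calloc(nzones, sizeof(*pos));

    /*
     * Round-robin across the zones forever; ^C to stop and use
     * iostat to see the resulting transfer rate.
     */
    for (;;) {
        for (int z = 0; z < nzones; ++z) {
            off_t base = (off_t)z * zone;
            off_t cur = (mode == 2) ?
                zone - (off_t)blk - pos[z] :    /* reverse in zone */
                pos[z];                         /* forward sequential */
            if (pread(fd, buf, blk, base + cur) <= 0)
                pos[z] = 0;                     /* wrap on EOF/error */
            else
                pos[z] += blk;
            if (pos[z] + (off_t)blk > zone)
                pos[z] = 0;                     /* wrap at end of zone */
        }
    }
}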
If you play with it you will find that most drives can track around
16 zones as long as the reads within each zone are 100% sequential
and forward.  Any other access pattern severely degrades performance.
For example, if you read the data in reverse you can kiss performance
goodbye.  If you introduce slight non-linearities in the access
pattern, even when the seeks are within 16-32K of each other,
performance degrades very rapidly.
This is what I mean by drives not doing sane caching.  It was OK with
smaller drives, where the non-linearities ran up against the need to
do an actual head seek anyway, but the drive caches in today's huge
drives are just not tuned very well.
UFS does have a bit of an advantage here, but HAMMER does a fairly good
job too.  The problem HAMMER has is with its initial layout, due to
B-Tree node splits (which mess up linearity in the B-Tree).  Once
the reblocker cleans up the B-Tree, performance is recovered.  The
B-Tree is the biggest problem, but I can't fix the initial layout
without making incompatible media changes, so I'm holding off on
doing that for now.
-Matt