Where are userland read/write requests larger than MAXPHYS split?
Lev Serebryakov
lev at serebryakov.spb.ru
Fri Dec 10 15:26:27 UTC 2010
Hello, Alexander.
You wrote on 10 December 2010 at 17:45:20:
>> I'm digging through the GEOM/I/O code and cannot find the place where
>> requests from userland to read more than MAXPHYS bytes are split
>> into several "struct bio"s.
>> It seems that these child requests are issued one by one, not in
>> parallel. Am I right? Why? It breaks parallelism when the
>> underlying GEOM can process several requests simultaneously.
> AFAIK, requests from userland are first broken into MAXPHYS-sized pieces
> by physio() before entering GEOM. The requests are indeed serialized
> there, I suppose to limit the KVA a single thread can consume, but IMHO
> it could be reconsidered.
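To make that first split point concrete, here is a minimal illustrative
sketch of the serialized loop, written in the spirit of
sys/kern/kern_physio.c but not taken from it (physio_sketch is a
hypothetical name; buffer mapping, pbuf handling and error paths are
omitted):

	#include <sys/param.h>
	#include <sys/bio.h>
	#include <sys/conf.h>
	#include <sys/uio.h>
	#include <geom/geom.h>

	/* Sketch only: the real physio() lives in sys/kern/kern_physio.c. */
	static int
	physio_sketch(struct cdev *dev, struct uio *uio)
	{
		struct bio *bp;
		int error = 0;

		while (uio->uio_resid > 0 && error == 0) {
			bp = g_new_bio();
			bp->bio_cmd = (uio->uio_rw == UIO_READ) ?
			    BIO_READ : BIO_WRITE;
			bp->bio_offset = uio->uio_offset;
			/* Never hand more than MAXPHYS down in one bio. */
			bp->bio_length = MIN(uio->uio_resid, MAXPHYS);
			/* ... map the user pages into bio_data here ... */
			dev_strategy(dev, bp);
			/* Wait for this piece before issuing the next one:
			 * this biowait() is where the serialization is. */
			error = biowait(bp, "physio");
			uio->uio_offset += bp->bio_length;
			uio->uio_resid -= bp->bio_length;
			g_destroy_bio(bp);
		}
		return (error);
	}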
It is a good idea; maybe there should be a GEOM flag for this? For
example, stripe/raid3/raid5 code can process a series of reads much
faster in parallel than sequentially, when userland wants to read
blocks bigger than the stripe size. And a small stripe size is a bad
idea, because of the high fixed cost per transaction. Right now, when
an application reads files on RAID5 with big blocks (say, read() is
called with a 1MB buffer), the RAID5 GEOM sees 128KB read requests, one
by one, and with a stripe size of 128KB it performs like a single
disk :( I could add a pre-read for full-sized reads, but that is not a
generic solution; sending the BIOs from one (logical/userland)
read/write request without awaiting their completion is the generic
solution, as sketched below.
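As an illustration of that generic approach, a minimal hypothetical
sketch of a start routine that clones one incoming bio into per-stripe
children and issues them all up front, so the member disks can work in
parallel. The disk-offset math assumes a plain stripe (no parity) and
is deliberately simplified; error handling and the completion side
(g_std_done gathering the children back into the parent) are elided:

	#include <sys/param.h>
	#include <sys/bio.h>
	#include <geom/geom.h>

	static void
	raid_start_sketch(struct bio *bp, struct g_consumer **disks,
	    int ndisks, off_t stripesize)
	{
		struct bio *cbp;
		off_t offset = bp->bio_offset;
		off_t left = bp->bio_length;
		char *data = bp->bio_data;

		while (left > 0) {
			off_t len = MIN(left,
			    stripesize - (offset % stripesize));
			int disk = (offset / stripesize) % ndisks;

			cbp = g_clone_bio(bp);
			/* Translate the logical offset to the per-disk
			 * offset of this stripe chunk. */
			cbp->bio_offset =
			    (offset / stripesize / ndisks) * stripesize +
			    (offset % stripesize);
			cbp->bio_length = len;
			cbp->bio_data = data;
			cbp->bio_done = g_std_done;
			/* Issue and move on: no biowait() between the
			 * children, so all disks get work immediately. */
			g_io_request(cbp, disks[disk]);
			offset += len;
			left -= len;
			data += len;
		}
	}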
> One more split happens (when needed) in the geom_disk module, to honor
> the disk driver's maximal I/O size. There is no serialization there.
> Most ATA/SATA drivers in 8-STABLE support I/O up to at least
> min(512K, MAXPHYS), which is 128K by default. Many SCSI drivers are
> still limited to DFLTPHYS, i.e. 64K.
Yep, that is what I saw in my investigation.
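For reference, the split at that second point might look roughly like
this minimal sketch (illustrative names, simplified from the idea
behind sys/geom/geom_disk.c; clone failure and error handling are
elided; note again there is no waiting between the pieces):

	#include <sys/param.h>
	#include <sys/bio.h>
	#include <geom/geom.h>
	#include <geom/geom_disk.h>

	/* Sketch of geom_disk chopping one bio to the driver's d_maxsize. */
	static void
	g_disk_split_sketch(struct bio *bp, struct disk *dp)
	{
		struct bio *cbp;
		off_t done = 0;

		while (done < bp->bio_length) {
			cbp = g_clone_bio(bp);
			cbp->bio_offset = bp->bio_offset + done;
			cbp->bio_data = (char *)bp->bio_data + done;
			/* Honor the driver's maximal I/O size, e.g. 128K
			 * for most ATA/SATA drivers, DFLTPHYS (64K) for
			 * many SCSI drivers. */
			cbp->bio_length = MIN(bp->bio_length - done,
			    dp->d_maxsize);
			cbp->bio_done = g_std_done;
			/* Handed straight to the driver; the next piece
			 * is issued without waiting for this one. */
			dp->d_strategy(cbp);
			done += cbp->bio_length;
		}
	}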
--
// Black Lion AKA Lev Serebryakov <lev at serebryakov.spb.ru>