sglist(9)
Scott Long
scottl at samsco.org
Mon Nov 30 22:13:23 UTC 2009
On Nov 30, 2009, at 12:27 PM, John Baldwin wrote:
> On Sunday 29 November 2009 6:27:49 pm Scott Long wrote:
>>> On Wednesday 20 May 2009 2:49:30 pm Jeff Roberson wrote:
>>>> On Tue, 19 May 2009, John Baldwin wrote:
>>>>
>>>> 2) I worry that if all users do sglist_count() followed by a dynamic
>>>> allocation and then an _append() they will be very expensive.
>>>> pmap_kextract() is much more expensive than it may first seem to be.
>>>> Do you have a user of count already?
>>>
>>> The only one that does now is sglist_build() and nothing currently
>>> uses that.
>>
>> Kinda silly to have it then. I also don't see the point of it; if the
>> point of the sglist object is to avoid VA mappings, then why start
>> with a VA mapping? But that aside, Jeff is correct, sglist_build() is
>> horribly inefficient.
>
> It actually does get used by the nvidia driver, but so far in what I
> have done in my jhb_bio branch I have tried several different
> approaches, which is why the API is as verbose as it is.
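For readers following along, here is a minimal sketch of the count-then-append
pattern Jeff describes above, using the sglist(9) KPI; the helper name is made
up and error handling is abbreviated. Both sglist_count() and sglist_append()
walk the buffer a page at a time and call pmap_kextract() for each page, so
the two-pass pattern pays the VA-to-PA translation cost twice, which is
roughly the sequence sglist_build() wraps.

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/sglist.h>

/* Hypothetical helper, for illustration only. */
static struct sglist *
make_sglist_from_kva(void *buf, size_t len)
{
	struct sglist *sg;
	int nsegs;

	nsegs = sglist_count(buf, len);		/* first VA->PA walk */
	sg = sglist_alloc(nsegs, M_WAITOK);
	if (sglist_append(sg, buf, len) != 0) {	/* second VA->PA walk */
		sglist_free(sg);
		return (NULL);
	}
	return (sg);
}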
>
>>> VOP_GET/PUTPAGES would not need to do this since they could simply
>>> append the physical addresses extracted directly from vm_page_t's,
>>> for example. I'm not sure this will be used very much now, as
>>> originally I thought I would be changing all storage drivers to do
>>> all DMA operations using sglists, and this sort of thing would have
>>> been used for non-bio requests like firmware commands; however, as
>>> expounded on below, it actually appears better to still treat bio's
>>> separately from non-bio requests for bus_dma so that the non-bio
>>> requests can continue to use bus_dmamap_load_buffer() as they do now.
>>>
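As a hedged illustration of the pager case described above, a
GETPAGES/PUTPAGES path could fill an sglist straight from an array of
resident pages, with no pmap_kextract() calls at all. The helper name below
is made up, and it assumes an append-by-physical-address primitive along the
lines of sglist_append_phys():

#include <sys/param.h>
#include <sys/sglist.h>
#include <vm/vm.h>
#include <vm/vm_page.h>

static int
append_resident_pages(struct sglist *sg, vm_page_t *ma, int count)
{
	int error, i;

	for (i = 0; i < count; i++) {
		/* VM_PAGE_TO_PHYS() is just a read of m->phys_addr. */
		error = sglist_append_phys(sg, VM_PAGE_TO_PHYS(ma[i]),
		    PAGE_SIZE);
		if (error != 0)
			return (error);
	}
	return (0);
}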
>>
>> I completely disagree with your approach to busdma, but I'll get into
>> that later. What I really don't understand is, why have yet another
>> page description format? Whether the I/O is disk buffers being pushed
>> by a pager to a disk controller, or texture buffers being pushed to
>> video hardware, they already have vm objects associated with them,
>> no? Why translate to an intermediate format? I understand that
>> you're creating another vm object format to deal directly with the
>> needs of nvidia, but that's really a one-off case right now. What
>> about when we want the pagedaemon to push unmapped I/O? Will it have
>> to spend cycles translating its objects to sglists? This is really a
>> larger question that I'm not prepared to answer, but would like to
>> discuss.
>
> Textures do not already have objects associated, no. However, I did
> not design sglist with Nvidia in mind. I hacked on it in conjunction
> with phk@, gibbs@, and jeff@ to work on unmapped bio support. That was
> the only motivation for sglist(9). Only the OBJT_SG bits were specific
> to Nvidia and that was added as an afterthought because sglist(9)
> already existed in the jhb_bio branch.
>
> If you look at GETPAGES/PUTPAGES they already deal in terms of
> vm_page_t's, not VM objects, and vm_page_t's already provide a
> constant-time way of fetching the physical address (m->phys_addr), so
> generating an sglist for GETPAGES and PUTPAGES will be very cheap. One
> of the original proposals from phk@ was actually to pass around arrays
> of vm_page_t's to describe I/O buffers. When Poul, Peter, and I talked
> about it we figured we had a choice between passing either the
> physical address or the vm_page_t. However, not all physical addresses
> have vm_page_t's, and it was deemed that GEOM's and disk drivers did
> not need any properties of the vm_page_t aside from the physical
> address. For those reasons, sglist(9) uses physical addresses.
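The segment format that falls out of this decision is deliberately small:
a physical address plus a length per segment, with nothing tying a segment
back to a vm_page_t. The in-tree layout looks roughly like the following
(paraphrased from sys/sglist.h; exact field types may differ by branch):

struct sglist_seg {
	vm_paddr_t	ss_paddr;	/* physical address of segment */
	size_t		ss_len;		/* length of segment in bytes */
};

struct sglist {
	struct sglist_seg *sg_segs;	/* array of segments */
	int		sg_refs;	/* reference count */
	u_short		sg_nseg;	/* segments currently used */
	u_short		sg_maxseg;	/* segments allocated */
};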
>
>>>> In general I think this is a good idea. It'd be nice to work on
>>>> replacing the buf layer's implementation with something like this
>>>> that could be used directly by drivers. Have you considered a busdma
>>>> operation to load from a sglist?
>>>
>>> So in regards to the bus_dma stuff, I did work on this a while ago
>>> in my jhb_bio branch. I do have a bus_dmamap_load_sglist() and I had
>>> planned on using that in storage drivers directly. However, I ended
>>> up circling back to preferring a bus_dmamap_load_bio() and adding a
>>> new 'bio_start' field to 'struct bio' that is an offset into an
>>> attached sglist.
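Purely as a sketch of the proposal being described, and not committed code:
a bus_dmamap_load_bio() along these lines could hide the mapped/unmapped
branch inside busdma. The bus_dmamap_load_sglist() signature and the
bio_sglist/bio_start fields are guesses based only on the names used in
this thread.

#include <sys/param.h>
#include <sys/bio.h>
#include <sys/sglist.h>
#include <machine/bus.h>

static int
bus_dmamap_load_bio_sketch(bus_dma_tag_t dmat, bus_dmamap_t map,
    struct bio *bp, bus_dmamap_callback_t *callback, void *callback_arg,
    int flags)
{

	if (bp->bio_sglist != NULL)
		/* Unmapped bio: load from the attached sglist at bio_start. */
		return (bus_dmamap_load_sglist(dmat, map, bp->bio_sglist,
		    bp->bio_start, bp->bio_length, callback, callback_arg,
		    flags));
	/* Mapped bio: traditional KVA-based load of bio_data. */
	return (bus_dmamap_load(dmat, map, bp->bio_data, bp->bio_length,
	    callback, callback_arg, flags));
}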
>>
>> I strongly disagree with forcing busdma to have intimate knowledge of
>> bio's. All of the needed information can be stored in sglist headers.
>
> The alternative is to teach every disk driver to handle the difference
> as well as every GEOM module. Not only that, but it doesn't provide
> any easy transition path, since you can't change any of the top-level
> code to use an unmapped bio until all the lower levels have been
> converted to handle both ways. Jeff had originally proposed having a
> bus_dmamap_load_bio() and I tried to not use it but just have a
> bus_dmamap_load_sglist() instead, but when I started looking at the
> extra work that would have to be duplicated in every driver to handle
> both types of bios, I concluded bus_dmamap_load_bio() would actually
> be a lot simpler.
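From the driver side, the argument above is that with a load_bio entry point
the strategy path keeps its current shape and needs no per-driver knowledge
of unmapped bios. A hedged sketch with made-up driver names (the softc,
callback, and function names are illustrative only):

#include <sys/param.h>
#include <sys/bio.h>
#include <machine/bus.h>

/* Hypothetical driver state. */
struct foo_softc {
	bus_dma_tag_t	dmat;
	bus_dmamap_t	map;
};

static void
foo_dma_cb(void *arg, bus_dma_segment_t *segs, int nseg, int error)
{
	/* Program the controller's S/G entries from segs[] here. */
}

static int
foo_start(struct foo_softc *sc, struct bio *bp)
{

	/* One call whether or not the bio carries a KVA mapping. */
	return (bus_dmamap_load_bio(sc->dmat, sc->map, bp, foo_dma_cb,
	    sc, BUS_DMA_NOWAIT));
}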
You completely missed the part of my email where I talk about not
having to update drivers for these new APIs.

In any case, I still respectfully disagree with your approach to busdma
and bio handling, and ask that you let Attilio and me work on our
prototype. Once that's done, we can stop talking in hypotheticals.
Scott