aac(4) resource FIB starvation on BUS scan revisited
Jung-uk Kim
jkim at FreeBSD.org
Tue Dec 8 16:22:17 UTC 2009
On Monday 07 December 2009 11:04 pm, Scott Long wrote:
> On Dec 7, 2009, at 9:00 PM, Alexander Sack wrote:
> > On Mon, Dec 7, 2009 at 8:14 PM, Scott Long <scottl at samsco.org> wrote:
> >> On Dec 7, 2009, at 6:05 PM, Jung-uk Kim wrote:
> >>> On Monday 07 December 2009 07:47 pm, Scott Long wrote:
> >>>> On Dec 7, 2009, at 5:31 PM, Jung-uk Kim wrote:
> >>>>> On Monday 07 December 2009 05:30 pm, Alexander Sack wrote:
> >>>>>> On Mon, Dec 7, 2009 at 4:42 PM, Alexander Sack
> >>>>>> <pisymbol at gmail.com> wrote:
> >>>>>>> Folks:
> >>>>>>>
> >>>>>>> I posted a similar thread on freebsd-scsi only to realize
> >>>>>>> that scottl had fixed my first issue during some MP CAM
> >>>>>>> cleanup with respect to a race during resource allocation
> >>>>>>> issues on a later version of the driver we are using (I
> >>>>>>> believe we did the same thing to resolve a lock issue on
> >>>>>>> bootup).
> >>>>>>>
> >>>>>>> However on my RELENG_8 box with (2) Adaptec 5085s connected
> >>>>>>> to some JBODs (9TB each) I still have a FIB starvation
> >>>>>>> issue during the LUN scan:
> >>>>>>>
> >>>>>>> The number of FIBs allocated to this card is 512 (older
> >>>>>>> cards are 256). The max_target per bus is 287. On a six
> >>>>>>> channel controller with a BUS scan done in parallel I see a
> >>>>>>> lot of this:
> >>>>>>>
> >>>>>>> ...
> >>>>>>> (probe501:aacp1:0:214:0): Request Requeued
> >>>>>>> (probe501:aacp1:0:214:0): Retrying Command
> >>>>>>> (probe520:aacp1:0:233:0): Request Requeued
> >>>>>>> (probe520:aacp1:0:233:0): Retrying Command
> >>>>>>> (probe528:aacp1:0:241:0): Request Requeued
> >>>>>>> (probe528:aacp1:0:241:0): Retrying Command
> >>>>>>> (probe540:aacp1:0:253:0): Request Requeued
> >>>>>>> (probe540:aacp1:0:253:0): Retrying Command
> >>>>>>> (probe541:aacp1:0:254:0): Request Requeued
> >>>>>>> (probe541:aacp1:0:254:0): Retrying Command
> >>>>>>> ....
> >>>>>>>
> >>>>>>> I think the driver is much happier with the following
> >>>>>>> attached patch (with dmesg).
> >>>>>>
> >>>>>> Patch again but this time not base-64 encoded:
> >>>>>
> >>>>> [SNIP!]
> >>>>>
> >>>>> I want it to be a little conservative here, i.e.,
> >>>>> pre-allocating half of max_fibs. Will the attached patch
> >>>>> work for you?
> >>>>
> >>>> The FIB allocation scheme was written when it was common for
> >>>> machines to only have 64MB of RAM and proportionally less KVA,
> >>>> so 256KB or 512KB was a lot of RAM to wire down. Those days
> >>>> have probably passed.
> >>>
> >>> So, what would you do if you were hypothetically rewriting it
> >>> today? :-)
> >>
> >> Most hardware has mechanisms for probing its command queue
> >> depth. What I typically do these days is allocate a minimum
> >> number of commands so that this probing can be done, then do a
> >> single slab allocation based on the results. AAC doesn't have
> >> this capability, but the 256/512 size is pretty well understood.
> >> The page-by-page allocation of aac works, but adds extra
> >> bookkeeping and complication to the driver.
> >
> > Right Scott, that is what JK and I discussed this evening. I
> > figured the 128 macro was just historical cruft and your email
> > confirms it. So are we ALL okay with the original patch as it
> > stands for now? JK, I am fine with the divide-by-2 change but I
> > think raising it to 256 is really the way to go at this point!
> > :D
>
> If you're going to increase it, why not simply increase it to the
> max amount that is appropriate for each card?
My intention was to keep the impact as small as possible, i.e.,
old card: max fibs == 256, max fibs / 2 == 128, no change
new card: max fibs == 512, max fibs / 2 == 256, doubled
Old cards are most likely to be used on old systems with very little
RAM (if they are still in production). Hence, no change is
necessary for them. Anyway, I just committed the OP's patch (with a
minor comment tweak).
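
For illustration only, the "half of max fibs" arithmetic above can be
written as a tiny C helper. This is a standalone sketch of that
proposal, not the code that was committed; the name prealloc_fibs()
and its parameter are hypothetical and do not come from the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the aac(4) driver: number of
 * FIBs to pre-allocate under the "half of max fibs" proposal.
 */
uint32_t
prealloc_fibs(uint32_t max_fibs)
{
	/* 256 -> 128 (unchanged for old cards), 512 -> 256 (doubled). */
	return (max_fibs / 2);
}
```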
> One other thing I forgot to mention was contiguous memory. The
> page-by-page allocation in aac has another benefit, and that's to
> not tax contigmalloc with finding 256KB of contiguous memory.
> That's not a big deal at boot, but is a problem if you load the
> driver after the system has been running for a while. It's
> immensely useful during development, but it's never been clear to
> me how useful it is in real life.
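
To make the trade-off a bit more concrete, here is a rough userland
sketch of growing a FIB pool one page at a time. It is not the aac(4)
code: malloc() stands in for the kernel's DMA-capable allocation, and
struct fib_pool, fib_pool_grow(), fib_pool_count(), fib_pool_free(),
PAGE_BYTES, FIB_BYTES and MAX_PAGES are all hypothetical names with
assumed values.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_BYTES	4096u	/* assumed page size */
#define FIB_BYTES	512u	/* assumed FIB size, for illustration only */
#define FIBS_PER_PAGE	(PAGE_BYTES / FIB_BYTES)
#define MAX_PAGES	64u	/* 64 pages * 8 FIBs = 512 FIBs at most */

struct fib_pool {
	void	*pages[MAX_PAGES];	/* one entry per page-sized chunk */
	size_t	 npages;		/* chunks allocated so far */
	size_t	 max_fibs;		/* limit for this (hypothetical) card */
};

/* Number of FIBs currently backed by allocated pages. */
size_t
fib_pool_count(const struct fib_pool *p)
{
	return (p->npages * FIBS_PER_PAGE);
}

/* Grow by one page; returns 0 on success, -1 if full or out of memory. */
int
fib_pool_grow(struct fib_pool *p)
{
	void *pg;

	if (p->npages >= MAX_PAGES || fib_pool_count(p) >= p->max_fibs)
		return (-1);
	pg = malloc(PAGE_BYTES);	/* only one page needs to be contiguous */
	if (pg == NULL)
		return (-1);
	p->pages[p->npages++] = pg;
	return (0);
}

/* Release every page, newest first. */
void
fib_pool_free(struct fib_pool *p)
{
	while (p->npages > 0)
		free(p->pages[--p->npages]);
}
```

A single slab would replace the pages[] array and the grow/free loops
with one allocation, which is roughly the extra bookkeeping being
discussed, at the cost of needing one large contiguous region.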
Thanks for your review and comments!
Jung-uk Kim