cvs commit: src/sys/kern kern_intr.c src/sys/sys interrupt.h
John Baldwin
jhb at freebsd.org
Wed May 2 17:28:41 UTC 2007
On Wednesday 02 May 2007 12:36:57 pm Nate Lawson wrote:
> Nate Lawson wrote:
> > John Baldwin wrote:
> >> On Wednesday 02 May 2007 03:07:07 am Darren Reed wrote:
> >>> On Wed, May 02, 2007 at 06:15:13AM +0000, Nate Lawson wrote:
> >>>> njl 2007-05-02 06:15:13 UTC
> >>>>
> >>>> FreeBSD src repository
> >>>>
> >>>> Modified files: (Branch: RELENG_6)
> >>>> sys/kern kern_intr.c
> >>>> sys/sys interrupt.h
> >>>> Log:
> >>>> MFC: rate-check the interrupt storm message and bump the counter
> >>>> 500 -> 1000
> >>> Is this number, "500" or "1000" somehow "magical" for modern hardware?
> >>>
> >>> If I had 500MHz, 1GHz, 1.5GHz, 2GHz, and 2.5GHz machines, each with the
> >>> appropriate architecture, what would the correct value for this be?
> >>> Is it always 1000 or should it be calculated?
> >> It's a SWAG and tunable for machines where it doesn't work. In practice
> >> the old setting seemed to be a bit too trigger-happy as I know my printer
> >> always triggered it, for example.
> >>
> >
> > There's more to it than just your GHz number. It's a counter of the
> > number of times an interrupt has triggered while the previous one was
> > being serviced. The faster your kernel, the lower the number could be.
> >
> > I have a slow early SMP Celeron system with a dc(4) adapter with 4 ports
> > sharing an irq with my ata. At 3 am, the nightly script kicks off
> > enough IO that it triggers a bug in my dc(4) card that causes it to mask
> > the interrupt too long. Then the irq storm suppression logic kicks
> > in, causing ata to time out the request. The drive is on a mirror so I'd
> > lose half the mirror, then rebuild in the morning. With this value
> > bumped, I don't have that problem any more, but the real issue is why
> > dc(4) is being so quirky under heavy shared irq load.
> >
>
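To make the counter Nate describes concrete, here is a rough userspace
model in C. It is a sketch under assumptions, not the code in kern_intr.c:
storm_state, storm_check, and STORM_THRESHOLD are names invented for
illustration, the caller passes the clock in so the example is
deterministic, and treating a threshold of 0 as "detection disabled" is an
assumption of this model. The count goes up each time the interrupt fires
again while the previous one was being serviced, resets when the source
goes quiet, and the warning is rate-checked to at most once per second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical default; the commit in this thread bumps 500 -> 1000. */
#define STORM_THRESHOLD 1000

/* Hypothetical per-source state for this model (not a kernel structure). */
struct storm_state {
	int  count;      /* consecutive back-to-back triggers seen so far */
	bool warned;     /* has any warning been logged yet? */
	long last_warn;  /* whole second in which the last warning was logged */
	int  warnings;   /* total warnings logged */
};

/*
 * Model one pass of the interrupt thread.
 *
 * pending_again: the interrupt fired again while the previous one was
 *                being serviced.
 * now_sec:       current time in whole seconds, supplied by the caller.
 * threshold:     storm threshold; 0 disables detection (assumption).
 *
 * Returns true when the source should be throttled.
 */
bool
storm_check(struct storm_state *st, bool pending_again, long now_sec,
    int threshold)
{
	assert(st != NULL);

	if (!pending_again) {
		/* The source went quiet, so the streak is over. */
		st->count = 0;
		return (false);
	}
	if (threshold != 0 && st->count >= threshold) {
		/* Rate-check the message: log at most once per second. */
		if (!st->warned || now_sec != st->last_warn) {
			st->warned = true;
			st->last_warn = now_sec;
			st->warnings++;
			printf("interrupt storm detected; "
			    "throttling interrupt source\n");
		}
		return (true);
	}
	st->count++;
	return (false);
}
```

With a zeroed storm_state and the threshold at 1000, feeding this 1500
back-to-back triggers within the same second throttles the last 500 of them
and logs one warning; a further trigger in the next second logs a second
warning.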
> This is on 6.x btw. Is there any reason why our retry count is so low?
>
> sys/dev/ata/ata-disk.c: request->retries = 2;

At work we up the timeout from 5 to 30, but we leave retries at 2.

> Note that I still got a timeout but it succeeded without error. I think
> this is a combination of the dc(4) and highpoint hpt366 driver
> interaction. dc(4) is probably holding Giant or something too long and
> ata is being too sensitive to the slow hw.

Neither dc(4) nor ata(4) holds Giant, FWIW.

--
John Baldwin