em interrupt storm
Michael Vince
mv at roq.com
Thu Nov 24 01:40:27 GMT 2005
Kris Kennaway wrote:
>On Tue, Nov 22, 2005 at 08:54:49PM -0800, John Polstra wrote:
>
>
>>On 23-Nov-2005 Kris Kennaway wrote:
>>
>>
>>>I am seeing the em driver undergoing an interrupt storm whenever the
>>>amr driver receives interrupts. In this case I was running newfs on
>>>the amr array and em0 was not in use:
>>>
>>> 28 root 1 -68 -187 0K 8K CPU1 1 0:32 53.98% irq16: em0
>>> 36 root 1 -64 -183 0K 8K RUN 1 0:37 27.75% irq24: amr0
>>>
>>># vmstat -i
>>>interrupt total rate
>>>irq1: atkbd0 2 0
>>>irq4: sio0 199 1
>>>irq6: fdc0 32 0
>>>irq13: npx0 1 0
>>>irq14: ata0 47 0
>>>irq15: ata1 931 5
>>>irq16: em0 6321801 37187
>>>irq24: amr0 28023 164
>>>cpu0: timer 337533 1985
>>>cpu1: timer 337285 1984
>>>Total 7025854 41328
>>>
>>>When newfs finished (i.e. amr was idle), em0 stopped storming.
>>>
>>>MPTable: <INTEL SE7520BD22 >
>>>
>>>
>>This is the dreaded interrupt aliasing problem that several of us have
>>experienced with this chipset. High-numbered interrupts alias down to
>>interrupts in the range 16..19 (or maybe 16..23), a multiple of 8 less
>>than the original interrupt.
>>
>>Nobody knows what causes it, and nobody knows how to fix it.
>>
>>
>
>This would be good to document somewhere so that people don't either
>accidentally buy this hardware, or know what to expect when they run
>it.
>
>Kris
>
>
This is Intel's latest server chipset design, and Dell is putting that
chipset in all their servers.
Luckily I have not seen the problem on any of my Dell servers (as
long as I am reading this right).
This server has been running for a long time.
vmstat -i
interrupt total rate
irq1: atkbd0 6 0
irq4: sio0 23433 0
irq6: fdc0 10 0
irq8: rtc 2631238611 128
irq13: npx0 1 0
irq14: ata0 99 0
irq16: uhci0 1507608958 73
irq18: uhci2 42005524 2
irq19: uhci1 3 0
irq23: atapci0 151 0
irq46: amr0 41344088 2
irq64: em0 1513106157 73
irq0: clk 2055605782 99
Total 7790932823 379
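For reference, the aliasing rule John describes above (a high-numbered interrupt showing up a multiple of 8 lower, somewhere in 16..23 or perhaps only 16..19) can be written out as plain arithmetic. This is only a sketch of that rule of thumb, not of anything the chipset is documented to do; the function name alias_candidates and its default range bounds are made up for illustration.

```python
def alias_candidates(irq, low=16, high=23):
    """Return the IRQs in [low, high] that lie a positive multiple of 8 below irq.

    Assumption: this encodes only the rule of thumb quoted in this thread
    ("a multiple of 8 less than the original interrupt"), nothing more.
    """
    return [t for t in range(low, high + 1) if t < irq and (irq - t) % 8 == 0]


# IRQ numbers taken from the vmstat -i listings in this thread.
for irq in (24, 46, 64):
    print(irq, alias_candidates(irq))
```

Under that rule irq24 maps to irq16, which matches Kris's report (amr0 on irq24, em0 on irq16). On the machines in this message irq46 would map to irq22 and irq64 to irq16; the arithmetic alone says nothing about whether aliasing actually happens on them.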
This one just transferred over 8 GB of data in 77 seconds with around
1000 simultaneous TCP connections under a load of 35. Both seem OK.
vmstat -i
interrupt total rate
irq4: sio0 315 0
irq13: npx0 1 0
irq14: ata0 47 0
irq16: uhci0 2894669 2
irq18: uhci2 977413 0
irq23: ehci0 3 0
irq46: amr0 883138 0
irq64: em0 2890414 2
cpu0: timer 2763566717 1999
cpu3: timer 2763797300 1999
cpu1: timer 2763551479 1999
cpu2: timer 2763797870 1999
Total 11062359366 8004
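To make checks like this less of an eyeball exercise, here is a minimal sketch that parses vmstat -i style output and lists the irq lines whose rate is unusually high. The helper names parse_vmstat_i and busy_sources and the threshold of 10000 interrupts per second are assumptions picked for illustration; the right cutoff depends on the hardware and the workload.

```python
import re

# Matches lines such as "irq16: em0 6321801 37187" or "cpu0: timer 337533 1985".
# The header line and the "Total" line have no colon, so they are skipped.
LINE = re.compile(r"^(\S+):\s+(.+?)\s+(\d+)\s+(\d+)$")


def parse_vmstat_i(text):
    """Return (source, device, total, rate) tuples from vmstat -i style text."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            source, device, total, rate = m.groups()
            rows.append((source, device, int(total), int(rate)))
    return rows


def busy_sources(rows, threshold=10000):
    """Keep only irq lines at or above threshold (an arbitrary, assumed cutoff)."""
    return [r for r in rows if r[0].startswith("irq") and r[3] >= threshold]


# A few lines copied from Kris's listing quoted above.
sample = """\
interrupt total rate
irq16: em0 6321801 37187
irq24: amr0 28023 164
cpu0: timer 337533 1985
Total 7025854 41328
"""

for source, device, total, rate in busy_sources(parse_vmstat_i(sample)):
    print(source, device, rate)
```

With the sample lines from Kris's listing only irq16 (em0) is reported; none of the irq lines in the two listings in this message reach that cutoff.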
Mike