regression: msk0 watchdog timeout and interrupt storm
Yonghyeon PYUN
pyunyh at gmail.com
Tue Jan 7 08:49:52 UTC 2014
On Mon, Jan 06, 2014 at 10:20:40AM -0500, Curtis Villamizar wrote:
>
[...]
> Here are some relevant parts of dmesg. Is there anything else you want?
>
> real memory = 2147483648 (2048 MB)
> avail memory = 2061438976 (1965 MB)
> Event timer "LAPIC" quality 400
> ACPI APIC Table: <LENOVO TC-9I >
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
>
> pcib2: <ACPI PCI-PCI bridge> irq 19 at device 7.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> on pci1
> pcib2: <ACPI PCI-PCI bridge> irq 19 at device 7.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> mskc0: <Marvell Yukon 88E8057 Gigabit Ethernet> port 0xe800-0xe8ff mem
> 0xfebfc000-0xfebfffff irq 19 at device 0.0 on pci2
> msk0: <Marvell Technology Group Ltd. Yukon Ultra 2 Id 0xba Rev 0x00>
> on mskc0
> msk0: Ethernet address: c8:9c:dc:56:38:ef
> miibus0: <MII bus> on msk0
> e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
> e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX,
> 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master,
> auto, auto-flow
>
Thank you for the info.
> The computer is a Lenovo ThinkCentre (small tower) and not an uncommon
> machine, so others are likely to run into this.
>
> > > Please let me know what I could do to help debug this.
> > >
> >
> > If you have more than 4GB memory, try reducing the amount of
> > memory (e.g. 3G) in /boot/loader.conf and let me know whether that
> > makes any difference for you.
> > Note, in order to test this you have to back out your local
> > changes.
>
> Only have 2 GB memory.
>
Ok, that means my wild guess was not right. :-(
[...]
> > I'm under the impression that the controller may have an additional
> > DMA addressing limitation where TX/RX and status LEs should have
> > the same high DMA address. Due to the lack of documentation I'm
> > not sure about that. If the issue does not happen with 3GB memory,
> > we have to use 32-bit DMA addressing.
>
> We have 2 GB memory so the problem with the original code does happen
> with less than 4 GB memory. Everything has the same high address of
> zero.
>
Right.
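
To make the suspected limitation concrete, here is a minimal sketch of
the check it implies: whether the bus addresses of the TX, RX and status
rings all share the same upper 32 bits. The helper names are
hypothetical and not part of msk(4); the example addresses are made up
for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: upper 32 bits of a 64-bit bus address. */
static uint32_t
dma_hi32(uint64_t paddr)
{
	return ((uint32_t)(paddr >> 32));
}

/*
 * Hypothetical helper: returns 1 if every ring base address in
 * paddr[0..n-1] has the same high 32 bits, 0 otherwise.
 * An empty or single-entry list trivially satisfies the condition.
 */
static int
rings_share_high_addr(const uint64_t *paddr, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++) {
		if (dma_hi32(paddr[i]) != dma_hi32(paddr[0]))
			return (0);
	}
	return (1);
}
```

On a machine with 2 GB of memory every ring sits below 4 GB, so the
high half is zero for all of them and the check passes, which is why
this guess cannot explain the problem here.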
> Is there anything else you want me to try?
msk(4) uses 4KB alignment for status/TX/RX rings. Your local change
reduces the number of status LEs to 1024. Stock msk(4) uses 2048
status LEs, and you said the cons variable is stuck at 1024 in that
case. At the moment I have no idea how this can happen.
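
As a rough illustration of the numbers above, the sketch below works
out the ring sizes and the wrap-around of a consumer index. It assumes
a status LE is two 32-bit words (8 bytes); the struct layout and the
helper names are assumptions for illustration, not code taken from
msk(4).

```c
#include <stddef.h>
#include <stdint.h>

#define RING_ALIGN	4096u	/* 4KB ring alignment mentioned above */

/* Assumed layout of one status list element: two 32-bit words. */
struct stat_le {
	uint32_t status;
	uint32_t control;
};

/* Size in bytes of a status ring with 'count' entries. */
static size_t
ring_bytes(unsigned count)
{
	return ((size_t)count * sizeof(struct stat_le));
}

/* Returns 1 if the ring size is a whole multiple of the alignment. */
static int
ring_is_aligned_multiple(unsigned count)
{
	return (ring_bytes(count) % RING_ALIGN == 0);
}

/* Advance a consumer index by one entry, wrapping at 'count'. */
static unsigned
ring_next(unsigned idx, unsigned count)
{
	return ((idx + 1) % count);
}
```

Under that assumption a 2048-entry ring occupies 16KB and a 1024-entry
ring 8KB. In a 2048-entry ring, index 1024 is exactly the halfway
point; in a 1024-entry ring the index wraps to 0 and never reaches
1024.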
Did msk(4) ever work on your box? If the answer is yes, would you
back out both r258780 and your local change?
I have a small local diff which was made after seeing r258780. But
I'm not sure whether it makes any difference.
>
> Curtis
>
> btw - I added someone from Marvell on the Bcc in case he wants to join
> in on the conversation or give us a hint in private email.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msk.type.diff
Type: text/x-diff
Size: 907 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20140107/0f36a8d8/attachment.diff>