Watchdog timeout em driver 8.2-R
Lars Wilke
lw at lwilke.de
Wed Apr 18 18:15:14 UTC 2012
Hi,
I first posted the following to the -stable list but got no
reply. Maybe someone here has some advice for me.
Switch: HP ProCurve 2910al
The switch does passive LACP
Motherboard: Supermicro X8DTN+-F
NIC: quad-port card, e.g. em1:
em1@pci0:6:0:1: class=0x020000 card=0x125e15d9 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'HP NC360T PCIe DP Gigabit Server Adapter (n1e5132)'
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xfb9e0000, size 131072, enabled
    bar [14] = type Memory, range 32, base 0xfb9c0000, size 131072, enabled
    bar [18] = type I/O Port, range 32, base 0xcc00, size 32, enabled
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 256(256) link x4(x4)
    ecap 0001[100] = AER 1 0 fatal 1 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 002590ffff0484d8
I use CAT 6 cables and the switch and server are in the same cabinet.
OS: FreeBSD 8.2-RELEASE
rc.conf:
ifconfig_em0="up"
ifconfig_em1="up"
ifconfig_em2="up"
ifconfig_em3="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 laggport em2 laggport em3"
ipv4_addrs_lagg0="192.168.80.20/24"
Hm, what sysctls might be interesting?
I use:
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=131072
kern.ipc.nmbclusters=230400
kern.maxvnodes=250000
kern.maxfiles=65536
kern.maxfilesperproc=32768
vfs.read_max=32
loader.conf: contains only ZFS-related settings.
Except for swap, the whole system uses ZFS; swap is on a GEOM mirror.
Once in a while I see these messages in /var/log/messages:
Apr 13 08:53:07 san02 kernel: em1: Watchdog timeout -- resetting
Apr 13 08:53:07 san02 kernel: em1: Queue(0) tdh = 232, hw tdt = 190
Apr 13 08:53:07 san02 kernel: em1: TX(0) desc avail = 31,Next TX to Clean = 221
Apr 13 08:53:07 san02 kernel: em1: Link is Down
Apr 13 08:53:07 san02 kernel: em1: link state changed to DOWN
Sometimes there is nothing for days, sometimes it happens under high network
load (NFSv3), and sometimes several times a day. I always see these messages
and this behaviour on the same two of the four interfaces (em1 and em3).
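To put numbers on how often each port resets, the log can be tallied per interface. This is only a minimal sketch: it assumes the syslog line format shown above, and count_watchdog_resets is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: count "Watchdog timeout" resets per interface.
# Reads syslog lines on stdin and prints "<interface> <count>", sorted by name.
# count_watchdog_resets is a hypothetical helper, not part of FreeBSD.
count_watchdog_resets() {
    awk '
        / Watchdog timeout -- resetting/ {
            for (i = 2; i <= NF; i++) {
                if ($i == "Watchdog") {
                    name = $(i - 1)        # field before "Watchdog", e.g. "em1:"
                    sub(/:$/, "", name)    # strip the trailing colon
                    count[name]++
                    break
                }
            }
        }
        END {
            for (name in count) print name, count[name]
        }
    ' | sort
}

# Usage on the server (not run here):
#   count_watchdog_resets < /var/log/messages
```

The function only reads stdin, so it can also be fed rotated logs that have been decompressed first.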
After that the NIC no longer has the ACTIVE flag; an ifconfig em1 up
solves the issue. But why does it lose the ACTIVE state, and why does the
NIC reset itself in the first place?
On the switch I see that the port matching em1 on the server has left
the trunk, so the missing ACTIVE flag is not lying 8-/
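Until the cause is found, the manual ifconfig workaround could be automated. The following is a minimal sketch that only treats the symptom: it assumes that ifconfig lagg0 prints one line per member in the form "laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>", and inactive_lagg_ports is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list lagg member ports that lack the ACTIVE flag.
# Reads `ifconfig lagg0` output on stdin. Assumes member lines look like
#   laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
# inactive_lagg_ports is a hypothetical helper, not part of FreeBSD.
inactive_lagg_ports() {
    awk '$1 == "laggport:" && $3 !~ /[<,]ACTIVE[,>]/ { print $2 }'
}

# Possible use from cron on the server (not run here):
#   for port in $(ifconfig lagg0 | inactive_lagg_ports); do
#       logger "lagg0: $port lost ACTIVE, bringing it up again"
#       ifconfig "$port" up
#   done
```

Logging each recovery with logger keeps a record in /var/log/messages next to the watchdog messages, so the two can be correlated later.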
Googling turned up many postings about the same problem. One site suggested
that this might be an ACPI problem, but offered nothing concrete, and the
postings I found were mostly about FreeBSD 7 and older.
Any pointers would be appreciated.
Thank you
--lars
_______________________________________________
freebsd-stable at freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"