recent bge(4) changes causing problems

Pyun YongHyeon pyunyh at gmail.com
Mon Oct 11 23:17:49 UTC 2010


On Mon, Oct 11, 2010 at 03:53:31PM -0700, Steve Kargl wrote:
> It seems recent changes to the bge driver are causing
> some problems with my hardware where the watchdog is
> now timing out.
> 
> /var/log/messages contains
> 
> 14:23:14 kernel: SMP: AP CPU #1 Launched!
> 14:23:14 kernel: Trying to mount root from ufs:/dev/ad6s1a
> 14:23:15 kernel: bge1: link state changed to UP
> 14:23:15 lpd[1190]: lpd startup: logging=0
> 14:23:15 ntpd[1224]: ntpd 4.2.4p5-a (1)
> 14:23:15 kernel: bge0: link state changed to UP
> 14:23:24 ntpd[1225]: time reset -0.677316 s
> 14:23:24 ntpd[1225]: kernel time sync status change 2001
> 14:31:01 kernel: bge0: watchdog timeout -- resetting
> 14:31:01 kernel: bge0: link state changed to DOWN
> 14:31:02 kernel: Limiting icmp unreach response from 613 to 200 packets/sec
> 14:31:04 ntpd[1225]: sendto(140.142.2.8) (fd=22): No route to host
> 14:31:04 kernel: bge0: link state changed to UP
> 14:31:30 kernel: Limiting icmp unreach response from 205 to 200 packets/sec
> 14:31:31 kernel: Limiting icmp unreach response from 203 to 200 packets/sec
> 15:40:11 su: kargl to root on /dev/pts/0
> 15:40:35 kernel: bge0: link state changed to DOWN
> 15:40:38 kernel: bge0: link state changed to UP
> 
> The last 2 bge messages are from me manually using 
> ifconfig to bring my net connect back to life.
> 
> troutmask:kargl[206] sysctl -a | grep bge.0
> dev.bge.0.%desc: Broadcom Gigabit Ethernet Controller, ASIC rev. 0x002100
> dev.bge.0.%driver: bge
> dev.bge.0.%location: slot=9 function=0 handle=\_SB_.PCI0.GOLA.GLAN
> dev.bge.0.%pnpinfo: vendor=0x14e4 device=0x1648 subvendor=0x14e4 subdevice=0x1644 class=0x020000
> dev.bge.0.%parent: pci2
> dev.bge.0.forced_collapse: 0
> dev.bge.0.forced_udpcsum: 0
> dev.bge.0.stats.FramesDroppedDueToFilters: 0
> dev.bge.0.stats.DmaWriteQueueFull: 0
> dev.bge.0.stats.DmaWriteHighPriQueueFull: 0
> dev.bge.0.stats.NoMoreRxBDs: 0
> dev.bge.0.stats.InputDiscards: 0
> dev.bge.0.stats.InputErrors: 0
> dev.bge.0.stats.RecvThresholdHit: 325
> dev.bge.0.stats.DmaReadQueueFull: 0
> dev.bge.0.stats.DmaReadHighPriQueueFull: 0
> dev.bge.0.stats.SendDataCompQueueFull: 0
> dev.bge.0.stats.RingSetSendProdIndex: 469
> dev.bge.0.stats.RingStatusUpdate: 330
> dev.bge.0.stats.Interrupts: 330
> dev.bge.0.stats.AvoidedInterrupts: 0
> dev.bge.0.stats.SendThresholdHit: 0
> dev.bge.0.stats.rx.ifHCInOctets: 569452
> dev.bge.0.stats.rx.Fragments: 0
> dev.bge.0.stats.rx.UnicastPkts: 497
> dev.bge.0.stats.rx.MulticastPkts: 1
> dev.bge.0.stats.rx.FCSErrors: 0
> dev.bge.0.stats.rx.AlignmentErrors: 0
> dev.bge.0.stats.rx.xonPauseFramesReceived: 0
> dev.bge.0.stats.rx.xoffPauseFramesReceived: 0
> dev.bge.0.stats.rx.ControlFramesReceived: 0
> dev.bge.0.stats.rx.xoffStateEntered: 0
> dev.bge.0.stats.rx.FramesTooLong: 0
> dev.bge.0.stats.rx.Jabbers: 0
> dev.bge.0.stats.rx.UndersizePkts: 0
> dev.bge.0.stats.rx.inRangeLengthError: 0
> dev.bge.0.stats.rx.outRangeLengthError: 0
> dev.bge.0.stats.tx.ifHCOutOctets: 39056
> dev.bge.0.stats.tx.Collisions: 0
> dev.bge.0.stats.tx.XonSent: 0
> dev.bge.0.stats.tx.XoffSent: 0
> dev.bge.0.stats.tx.flowControlDone: 0
> dev.bge.0.stats.tx.InternalMacTransmitErrors: 0
> dev.bge.0.stats.tx.SingleCollisionFrames: 0
> dev.bge.0.stats.tx.MultipleCollisionFrames: 0
> dev.bge.0.stats.tx.DeferredTransmissions: 0
> dev.bge.0.stats.tx.ExcessiveCollisions: 0
> dev.bge.0.stats.tx.LateCollisions: 0
> dev.bge.0.stats.tx.UnicastPkts: 468
> dev.bge.0.stats.tx.MulticastPkts: 0
> dev.bge.0.stats.tx.BroadcastPkts: 1
> dev.bge.0.stats.tx.CarrierSenseErrors: 0
> dev.bge.0.stats.tx.Discards: 0
> dev.bge.0.stats.tx.Errors: 0
> dev.bge.0.wake: 0
> 
> In the time that it's taken me to compose this message
> the timeout has fired again.
> 
> 15:47:01 kernel: Limiting icmp unreach response from 662 to 200 packets/sec
> 15:47:02 kernel: Limiting icmp unreach response from 446 to 200 packets/sec
> 15:47:03 kernel: Limiting icmp unreach response from 436 to 200 packets/sec
> 15:47:04 kernel: Limiting icmp unreach response from 464 to 200 packets/sec
> 15:47:05 kernel: Limiting icmp unreach response from 438 to 200 packets/sec
> 15:47:06 kernel: Limiting icmp unreach response from 445 to 200 packets/sec
> 15:47:07 kernel: bge0: watchdog timeout -- resetting
> 15:47:07 kernel: bge0: link state changed to DOWN
> 15:47:07 kernel: Limiting icmp unreach response from 439 to 200 packets/sec
> 15:47:08 kernel: Limiting icmp unreach response from 330 to 200 packets/sec
> 15:47:11 kernel: bge0: link state changed to UP
> 15:47:12 kernel: Limiting icmp unreach response from 214 to 200 packets/sec
> 15:47:13 kernel: Limiting icmp unreach response from 202 to 200 packets/sec
> 15:47:14 kernel: Limiting icmp unreach response from 238 to 200 packets/sec
> 15:49:42 kernel: bge0: link state changed to DOWN
> 15:49:44 kernel: bge0: link state changed to UP
> 
> I have not seen these icmp unreach response messages before.
> 

The icmp unreach messages have nothing to do with bge(4). Check
whether a server that listens on a UDP port is still alive on your
box. What worries me is the bge(4) watchdog timeouts. It looks like
your controller is a BCM5704. I also have a bge(4) regression report
from marius on sparc64. He said r213945 seemed to cause the issue,
and I'm working on it. Could you also try the attached patch?
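
A rough sketch of how the two checks above might be scripted. sockstat(1)
is in the FreeBSD base system; the function names and the default log
path are only illustrative assumptions and are not part of bge(4) or of
the attached patch.

```sh
#!/bin/sh
# List listening IPv4 UDP sockets. If an expected daemon is missing from
# this list, packets sent to its port are answered with ICMP port
# unreachable, which the kernel rate-limits and logs.
list_udp_listeners() {
    sockstat -4 -l -P udp
}

# Count bge watchdog resets recorded in a log file.
# The default path /var/log/messages is an assumption.
count_watchdog() {
    grep -c 'bge[0-9]*: watchdog timeout' "${1:-/var/log/messages}"
}
```

Running count_watchdog before and after applying a patch gives a quick
way to tell whether the timeouts still occur.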
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bge.rxprod.patch
Type: text/x-diff
Size: 875 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-current/attachments/20101011/718291cb/bge.rxprod.bin

