new sk driver [was: nve timeout (and down) regression?]
Pieter de Goeje
pieter at degoeje.nl
Tue Mar 28 14:38:23 UTC 2006
On Tuesday 28 March 2006 12:40, you wrote:
<snip>
> you probably do not have enough traffic to make the box crash, or you have
> less than 1/2GB of RAM in use
The box has 1GB RAM. Traffic is approx. 2-3Mbit/s.
>
> in fact the problem does not happen on UP machines, only sometimes a
> device timeout, which only occasionally causes rx/tx to stop
>
> The problem is appearing on SMP machines
>
> when you have less than 2GB of RAM the problem occurs once a day or so and
> seems to depend on memory use and amount of traffic
>
> as soon as the traffic reaches more than 1Mbit/s the crash is predictable
> and you can wait to see it
The box has actually crashed once, but I am not sure it was because of the
NIC.
~> uptime
4:19PM up 3 days, 9:59, 1 user, load averages: 1.38, 1.20, 1.03
>
> on machines with 4GB of RAM and more traffic the crash is immediate and
> worse: when the box has crashed under load (4-6Mbit/s) and comes back, the
> high demand hits it and it crashes within minutes, or immediately, as soon
> as the network is up
>
> so probably mpsafenet may help by not processing concurrent packets but
> this is a workaround not a solution (for me)
Agreed.
>
> last time I checked, mpsafenet=0 cut almost 1Mbit/s of traffic and the
> overall performance/response was bad, higher HZ did not resolve anything
> and disabling polling made it even worse (I have other NICs installed),
> the machines are working as GW
I can't really tell if the performance is impaired by mpsafenet=0, because the
box is mostly busy doing userland stuff. Typical traffic looks like this:
~> netstat -w 1
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
      1186     0      97134       1302     0     276430     0
      1206     0      97484       1382     0     264315     0
      1193     0      97048       1366     0     278901     0
      1198     0      98251       1403     0     273428     0
      1205     0      99283       1393     0     270364     0
      1162     0      94746       1376     0     265909     0
      1162     0      93011       1420     0     258514     0
      1187     0      94366       1467     0     263162     0
      1178     0      93441       1441     0     248875     0
      1176     0      93116       1484     0     266285     0
      1146     0      91615       1424     0     256180     0
      1222     0      96597       1560     0     432862     0
      1222     0      93796       1591     0     444466     0
This is all UDP. The traffic generates around 2000 interrupts/sec on sk.
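As a rough cross-check of the "approx. 2-3Mbit/s" figure, the byte counters in the sample above can be averaged and converted to bits. This is only a minimal sketch: the sample values are copied from the netstat output, the function names are made up for illustration, and 1 Mbit is assumed to mean 10^6 bits.

```python
# Per-second samples copied from the `netstat -w 1` output above:
# (input bytes, output bytes) for each one-second interval.
SAMPLES = [
    (97134, 276430),
    (97484, 264315),
    (97048, 278901),
    (98251, 273428),
    (99283, 270364),
    (94746, 265909),
    (93011, 258514),
    (94366, 263162),
    (93441, 248875),
    (93116, 266285),
    (91615, 256180),
    (96597, 432862),
    (93796, 444466),
]


def mbit_per_s(byte_counts):
    """Average bytes-per-second samples and convert to Mbit/s (assuming 10^6 bits)."""
    avg_bytes = sum(byte_counts) / len(byte_counts)
    return avg_bytes * 8 / 1_000_000


def summarize(samples):
    """Return (input, output, combined) throughput in Mbit/s (hypothetical helper)."""
    rx = mbit_per_s([s[0] for s in samples])
    tx = mbit_per_s([s[1] for s in samples])
    return rx, tx, rx + tx


if __name__ == "__main__":
    rx, tx, total = summarize(SAMPLES)
    print(f"in: {rx:.2f} Mbit/s  out: {tx:.2f} Mbit/s  total: {total:.2f} Mbit/s")
```

For this sample it works out to roughly 0.8 Mbit/s in and 2.3 Mbit/s out, so about 3 Mbit/s combined.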
>
> until January the machines didn't crash, there were only timeouts and
> rx/tx stops
> I used Pyun's driver and the timeouts went away, thanks again!
>
> so then I got confused by some if_sk talk on stable and thought the driver
> was committed, and the boxes started crashing until I got it last week and
> switched back to Pyun's driver, and my sk problems are gone again, the
> machines have been stable for 4/5 days now
I'm going to test the new driver to see if I can disable mpsafenet. To be
specific about the NIC:
skc0@pci0:10:0: class=0x020000 card=0x811a1043 chip=0x432011ab rev=0x13 hdr=0x00
    vendor   = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device   = '88E8001/8003/8010 Gigabit Ethernet Controller with Integrated PHY (copper)'
    class    = network
    subclass = ethernet
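For anyone comparing against their own hardware, the chip= and card= values in the output above can be split into their two 16-bit IDs. This is a small sketch, assuming the usual layout of these fields (vendor ID in the low 16 bits, device ID in the high 16 bits); the helper name is made up for illustration.

```python
def split_pci_id(value):
    """Split a 32-bit pciconf chip=/card= value into (vendor, device).

    Assumes the low 16 bits hold the (sub)vendor ID and the high
    16 bits hold the (sub)device ID.
    """
    vendor = value & 0xFFFF
    device = (value >> 16) & 0xFFFF
    return vendor, device


if __name__ == "__main__":
    # Values copied from the pciconf output above.
    for label, raw in (("chip", 0x432011AB), ("card", 0x811A1043)):
        vendor, device = split_pci_id(raw)
        print(f"{label}: vendor=0x{vendor:04x} device=0x{device:04x}")
```

Under that assumption, chip=0x432011ab is vendor 0x11ab with device 0x4320, and card=0x811a1043 is subvendor 0x1043 with subdevice 0x811a.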
Pieter de Goeje