Odd network issue ... *very* slow scp between two servers
Marc G. Fournier
scrappy at hub.org
Sat Mar 6 11:05:55 PST 2004
I have two servers on the same network switch, sitting one on top of the
other ... one is running an em (Dual-Xeon 2.4GHz) device, the other an fxp
(Dual-PIII 1.3GHz) device ...
Doing a straight (not sftp/scp) ftp between the two servers, of a 1Meg
file, shows:
1038785 bytes received in 85.91 seconds (11.81 KB/s)
Going between two servers, same switch, both running fxp devices, for the
exact same file, shows:
1038785 bytes received in 0.09 seconds (10.64 MB/s)
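Just as a sanity check on those ftp figures (a small sketch, nothing more): the reported rates follow from bytes/time, keeping in mind that the 0.09 s figure is itself rounded, so back-computing from it lands slightly above the 10.64 MB/s ftp printed:

```python
def rate_kbs(nbytes: int, seconds: float) -> float:
    """Transfer rate in KB/s (1 KB = 1024 bytes), as ftp reports it."""
    return nbytes / seconds / 1024

# The slow em<->fxp transfer: 1038785 bytes in 85.91 s
slow = rate_kbs(1038785, 85.91)
# The fast fxp<->fxp transfer of the exact same file
fast_mbs = rate_kbs(1038785, 0.09) / 1024  # in MB/s

print(round(slow, 2))      # 11.81 -- matches the ftp output above
print(round(fast_mbs, 2))  # 11.01 -- ~10.64 MB/s once the rounded 0.09 s is accounted for
```

That's roughly a 950x gap for the same file over the same switch, which is far too large to blame on raw CPU or NIC differences alone.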
Now, I have ipaudit running on all the servers, to monitor bandwidth ...
the server with the fxp device on it, that I just downloaded to from
another fxp server @ 10.64MB/s, did 11535.73M of traffic total yesterday
... the one with the em device did 11766.46M ...
Now, in my /var/log/messages file, I am getting the RST lines:
Mar 6 12:35:38 neptune /kernel: Limiting open port RST response from 700 to 200 packets per second
Mar 6 12:35:39 neptune /kernel: Limiting open port RST response from 636 to 200 packets per second
Mar 6 12:35:41 neptune /kernel: Limiting open port RST response from 523 to 200 packets per second
Mar 6 12:35:46 neptune /kernel: Limiting open port RST response from 386 to 200 packets per second
Mar 6 12:35:55 neptune /kernel: Limiting open port RST response from 238 to 200 packets per second
Mar 6 13:34:25 neptune /kernel: Limiting open port RST response from 799 to 200 packets per second
Mar 6 13:34:27 neptune /kernel: Limiting open port RST response from 637 to 200 packets per second
Mar 6 13:34:28 neptune /kernel: Limiting open port RST response from 503 to 200 packets per second
Mar 6 13:34:32 neptune /kernel: Limiting open port RST response from 343 to 200 packets per second
Mar 6 13:34:42 neptune /kernel: Limiting open port RST response from 206 to 200 packets per second
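If it helps to quantify how hard the limiter is clipping, here's a throwaway parser (hypothetical helper name, assuming the exact log format quoted above) that totals the attempted vs. allowed RSTs per limited second:

```python
import re

# Matches the kernel's rate-limit lines quoted above.
LIMIT_RE = re.compile(
    r"Limiting open port RST response from (\d+) to (\d+) packets per second"
)

def rst_burst_stats(lines):
    """Return (limited seconds seen, total RSTs attempted, total RSTs allowed)."""
    hits = attempted = allowed = 0
    for line in lines:
        m = LIMIT_RE.search(line)
        if m:
            hits += 1
            attempted += int(m.group(1))
            allowed += int(m.group(2))
    return hits, attempted, allowed

sample = [
    "Mar 6 12:35:38 neptune /kernel: Limiting open port RST response from 700 to 200 packets per second",
    "Mar 6 12:35:39 neptune /kernel: Limiting open port RST response from 636 to 200 packets per second",
]
print(rst_burst_stats(sample))  # (2, 1336, 400)
```

Run over the whole messages file, the burst shape (starting near 700-800/s and decaying over ~20 s) looks more like something scanning or hammering closed ports than like anything the ftp transfer itself would generate.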
And it seems to be quite regular:
neptune# gzcat /var/log/messages.0.gz | grep RST | wc -l
95
where 0.gz is from Mar 5 14:47:28 -> Mar 6 11:30:52
but, shouldn't:
net.inet.tcp.blackhole: 0 -> 2
help? Or did I read the man page wrong? Even with it set, I'm still only
getting ~13k/s on that same file ...
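For what it's worth (a sketch of the relevant sysctls, assuming FreeBSD 4.x defaults): blackhole only changes how the stack answers segments aimed at *closed* ports, so it would quiet the RST log noise but shouldn't be expected to change the throughput of an established ftp/scp connection either way.

```shell
# Current settings (names as in FreeBSD 4.x sysctl(8)):
sysctl net.inet.tcp.blackhole    # 2 = silently drop segments to closed TCP ports, no RST
sysctl net.inet.udp.blackhole    # 1 = same idea for UDP, no ICMP unreachable
sysctl net.inet.icmp.icmplim     # the 200/s cap behind the "Limiting ... RST" messages

# Example only: raise the limiter cap rather than suppress the responses.
# Neither knob touches an already-open connection's data path.
sysctl -w net.inet.icmp.icmplim=300
```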
there is nothing else in messages to indicate a problem, either with
processes, or drives, or anything, and load on the machine, right now, is
only 1.3 ...
vmstat -i shows a high rate of interrupts for the em device:
neptune# uptime
1:43PM up 57 days, 3:08, 5 users, load averages: 1.38, 1.32, 0.97
neptune# vmstat -i
interrupt                   total       rate
ahd0 irq16                     15          0
ahd1 irq17              932228686        188
em0 irq18              1205773331        244
clk irq0                493596903         99
rtc irq8                631819522        128
Total                  3263418457        661
vs
mars# uptime
1:43PM up 77 days, 9:50, 3 users, load averages: 7.44, 7.73, 6.28
mars# vmstat -i
interrupt                   total       rate
fxp0 irq5               499794285         74
ahc0 irq11                     15          0
ahc1 irq15              915710622        136
fdc0 irq6                       4          0
clk irq0                668800403         99
rtc irq8                856196939        128
Total                  2940502268        439
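One caveat on reading those numbers: vmstat -i's rate column is just total/uptime, a lifetime average, so it can't distinguish a steady 244/s from a recent interrupt storm. A quick cross-check against the uptimes quoted above (hypothetical helper, simple arithmetic only):

```python
def avg_irq_rate(total_interrupts: int, days: int, hours: int, minutes: int) -> float:
    """Lifetime-average interrupt rate, i.e. what vmstat -i's 'rate' column shows."""
    uptime_s = days * 86400 + hours * 3600 + minutes * 60
    return total_interrupts / uptime_s

# neptune: up 57 days 3:08, em0 total 1205773331
print(round(avg_irq_rate(1205773331, 57, 3, 8)))   # 244 -- matches vmstat
# mars: up 77 days 9:50, fxp0 total 499794285
print(avg_irq_rate(499794285, 77, 9, 50))          # ~74.7, vmstat truncates to 74
```

To see whether em0 is interrupting heavily *right now*, the thing to do is take two vmstat -i snapshots a known interval apart and difference the totals.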
the fxp device is running:
media: Ethernet autoselect (100baseTX <full-duplex>)
the em device is running:
media: Ethernet 100baseTX <full-duplex>
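One thing worth ruling out before the reboot (an assumption on my part, not something the logs above confirm): the fxp side shows "autoselect" while the em side is *forced* to 100baseTX full-duplex. If the switch port is left autonegotiating against a forced NIC, autonegotiation fails and the switch typically falls back to half-duplex, and that mismatch produces exactly this signature: a 100 Mb link that moves data at tens of KB/s, with collisions and errors on the counters.

```shell
# Diagnostic sketch, assuming FreeBSD 4.x ifconfig(8)/netstat(1):
ifconfig em0            # check the 'media:' and 'status:' lines

# A duplex mismatch shows up as nonzero error/collision counters
# that climb during a transfer:
netstat -i -I em0       # watch the Ierrs/Oerrs/Coll columns

# The fix is to match both ends: either force BOTH the NIC and the
# switch port, or set BOTH to autoselect, e.g.:
ifconfig em0 media autoselect
```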
and, finally, the em server was last upgraded:
4.9-STABLE #4: Tue Jan 6 00:59:37 AST 2004
while the fxp server is almost ancient:
4.9-PRERELEASE #2: Sat Sep 20 14:42:25 ADT 2003
I'm going to do a reboot on the server Monday, when a tech is easily
accessible in case of a problem ... but, before I do that, is there
anything I can do to possibly debug this? Maybe something I can look at
that would show a 'leak'?
Thanks ...
----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy at hub.org Yahoo!: yscrappy ICQ: 7615664
More information about the freebsd-net mailing list