amd64/186114: MPD5.7 umtxn
Yury
hawk256 at yandex.ru
Sun Jan 26 02:20:01 UTC 2014
>Number: 186114
>Category: amd64
>Synopsis: MPD5.7 umtxn
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-amd64
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: update
>Submitter-Id: current-users
>Arrival-Date: Sun Jan 26 02:20:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator: Yury
>Release: FreeBSD 10.0
>Organization:
GreenLine
>Environment:
FreeBSD gw01.comteks.biz 10.0-STABLE FreeBSD 10.0-STABLE #0 r261173: Sun Jan 26 03:58:09 MSK 2014 hawk at gw01.comteks.biz:/usr/obj/usr/src/sys/Hawk amd64
>Description:
I run a BRAS on FreeBSD. It was on 9.2-STABLE. I tried to update it to 10.0-RELEASE and later to 10.0-STABLE. Both show the same problem.
For a short time after start, around 5 minutes, it works normally. But after 100-150 users have connected through PPPoE (MPD 5.7), the mpd5 process stops in state umtxn.
After that no new users can connect, but users who are already connected keep working.
last pid: 17712; load averages: 1.16, 0.65, 0.27 up 0+00:01:51 05:28:23
50 processes: 1 running, 49 sleeping
CPU: 0.0% user, 0.0% nice, 1.0% system, 0.9% interrupt, 98.1% idle
Mem: 1162M Active, 56M Inact, 400M Wired, 145M Buf, 2274M Free
Swap: 4096M Total, 4096M Free
  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
 2535 root        1  20    0   201M   184M select  3   1:14  10.69% zebra
 2476 _pflogd     1  20    0 14600K  2200K bpf     0   0:12   0.00% pflogd
 2541 root        1  20    0   224M   206M select  2   0:07   0.00% bgpd
 9803 root        1  20    0 78624K 44092K select  2   0:02   0.00% bsnmpd
 3462 root        3  20    0 56736K  9164K umtxn   0   0:01   0.00% mpd5
 7243 mysql      17  32    0  6958M   636M uwait   1   0:01   0.00% mysqld
 6095 bind        7  20    0   129M 76864K kqread  1   0:01   0.00% named
 3872 root        1  20    0 61124K  6808K select  1   0:00   0.00% nmbd
 8644 root        3  20    0 47332K  6216K select  1   0:00   0.00% utm5_rfw
procstat -k 3462
PID TID COMM TDNAME KSTACK
3462 100113 mpd5 - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
3462 100115 mpd5 - mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait sys_poll amd64_syscall Xfast_syscall
3462 100512 mpd5 - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
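The stacks above show two of the three mpd5 threads sleeping in the kernel umtx mutex path (umtxq_sleep, do_lock_umutex), while the third waits in poll. A minimal sketch of picking such threads out of saved `procstat -k` output follows. The function names are hypothetical helpers, and the parsing assumes the TDNAME column is a single token, as it is in this report.

```python
# Kernel frames that indicate a thread is sleeping on a userland mutex (umtx).
UMTX_FRAMES = ("umtxq_sleep", "do_lock_umutex")

# Output of `procstat -k 3462`, copied from this report.
SAMPLE = """\
PID TID COMM TDNAME KSTACK
3462 100113 mpd5 - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
3462 100115 mpd5 - mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait sys_poll amd64_syscall Xfast_syscall
3462 100512 mpd5 - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex amd64_syscall Xfast_syscall
"""


def parse_procstat_k(text):
    """Return a list of (tid, frames) tuples from `procstat -k` output."""
    threads = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line that does not start with numeric PID and TID.
        if len(fields) < 5 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Columns are PID TID COMM TDNAME KSTACK...; assumes TDNAME has no spaces.
        threads.append((int(fields[1]), fields[4:]))
    return threads


def blocked_on_umtx(threads):
    """Return the TIDs whose kernel stack contains every umtx lock frame."""
    return [tid for tid, frames in threads
            if all(frame in frames for frame in UMTX_FRAMES)]


if __name__ == "__main__":
    for tid in blocked_on_umtx(parse_procstat_k(SAMPLE)):
        print("thread", tid, "is waiting for a userland mutex")
```

This only identifies which threads are waiting for a mutex. It does not show which thread holds the mutex; that would need a userland backtrace of mpd5.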
/var/log/mpd.log
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPCP: Up event
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPCP: state change Starting --> Req-Sent
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPCP: SendConfigReq #1
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPADDR 10.10.0.1
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPV6CP: Up event
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPV6CP: state change Starting --> Req-Sent
Jan 26 05:28:13 gw01 mpd: [B_ppp-46] IPV6CP: SendConfigReq #1
Jan 26 05:28:13 gw01 mpd: [vlan6-107] LCP: rec'd Terminate Request #240 (Opened)
Jan 26 05:28:13 gw01 mpd: [vlan6-107] LCP: state change Opened --> Stopping
Jan 26 05:28:13 gw01 mpd: [vlan6-107] Link: Leave bundle "B_ppp-46"
It always stops with the same last 3 lines in the log.
Jan 26 05:52:38 gw01 kernel: sonewconn: pcb 0xfffff80007757c40: Listen queue overflow: 4 already in queue awaiting acceptance
Jan 26 05:53:09 gw01 last message repeated 60 times
Jan 26 05:53:34 gw01 last message repeated 51 times
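Because syslog compresses duplicates into "last message repeated N times", the number of overflow events is larger than the number of log lines. A rough sketch of counting them follows. `count_messages` is a hypothetical helper, and it assumes each syslog entry occupies one line, so the wrapped sonewconn entry is rejoined in the sample.

```python
import re

# syslogd's marker for compressed duplicate messages.
REPEAT_RE = re.compile(r"last message repeated (\d+) times?")

# Kernel messages copied from this report, with the wrapped entry rejoined.
SAMPLE = [
    "Jan 26 05:52:38 gw01 kernel: sonewconn: pcb 0xfffff80007757c40: "
    "Listen queue overflow: 4 already in queue awaiting acceptance",
    "Jan 26 05:53:09 gw01 last message repeated 60 times",
    "Jan 26 05:53:34 gw01 last message repeated 51 times",
]


def count_messages(lines, needle):
    """Count occurrences of needle, including syslog 'repeated N times' lines."""
    total = 0
    last_matched = False  # whether the most recent real message contained needle
    for line in lines:
        repeat = REPEAT_RE.search(line)
        if repeat:
            # A repeat line refers to the most recent real message.
            if last_matched:
                total += int(repeat.group(1))
            continue
        last_matched = needle in line
        if last_matched:
            total += 1
    return total


if __name__ == "__main__":
    print(count_messages(SAMPLE, "Listen queue overflow"))
```

For the three entries above this gives 1 + 60 + 51 = 112 overflow events in under a minute.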
Kernel conf:
GENERIC +
device ipmi
device coretemp
device smbus
device lagg
device netmap
options IPI_PREEMPTION
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPDIVERT
options DUMMYNET
options IPFIREWALL_NAT
options LIBALIAS
device pf
device pflog
device pfsync
options ALTQ
options ALTQ_CBQ # Class Based Queuing (CBQ)
options ALTQ_RED # Random Early Detection (RED)
options ALTQ_RIO # RED In/Out
options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC)
options ALTQ_PRIQ # Priority Queuing (PRIQ)
options ALTQ_NOPCC # Required for SMP build
options NETGRAPH
options NETGRAPH_BPF
options NETGRAPH_CAR
options NETGRAPH_ETHER
options NETGRAPH_IPFW
options NETGRAPH_IFACE
options NETGRAPH_KSOCKET
options NETGRAPH_PPP
options NETGRAPH_PPTPGRE
options NETGRAPH_PPPOE
options NETGRAPH_SOCKET
options NETGRAPH_TCPMSS
options NETGRAPH_TEE
options NETGRAPH_VJC
options NETGRAPH_MPPC_ENCRYPTION
options NETGRAPH_NETFLOW
CPU: Intel(R) Xeon(R) CPU X3470 @ 2.93GHz (2933.36-MHz K8-class CPU)
Origin = "GenuineIntel" Id = 0x106e5 Family = 0x6 Model = 0x1e Stepping = 5
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x98e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
AMD Features2=0x1<LAHF>
TSC: P-state invariant, performance statistics
real memory = 4294967296 (4096 MB)
avail memory = 4052344832 (3864 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <INTEL S3420GPC>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 2
cpu2 (AP): APIC ID: 4
cpu3 (AP): APIC ID: 6
I tried to get a ktrace dump, but I could not open it:
ktrdump: kvm_nlist: No such file or directory
I think something is wrong with the netgraph subsystem.
>How-To-Repeat:
Update a BRAS running MPD 5.7 from FreeBSD 9.2-STABLE to 10.0 and let 100-150 PPPoE users connect.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: