amd64/180885: panic: kmem_map too small at heavy packet traffic

Sat Jul 27 07:30:02 UTC 2013

>Number:         180885
>Category:       amd64
>Synopsis:       panic: kmem_map too small at heavy packet traffic
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-amd64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jul 27 07:30:01 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     tugrul
>Release:        10.0/Head
>Organization:
>Environment:
FreeBSD test 10.0-CURRENT FreeBSD 10.0-CURRENT #0: Thu Jul 18 17:07:53 EEST 2013     root at test:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
I am running 10.0-CURRENT on an Intel(R) Xeon(R) E5620 with 16GB of RAM. I get the
"panic: kmem_malloc(-548663296): kmem_map too small: 539459584 total allocated"
message with the configuration below:
[root@ ~]# sysctl vm.kmem_size_min vm.kmem_size_max vm.kmem_size vm.kmem_size_scale
vm.kmem_size_min: 0
vm.kmem_size_max: 329853485875
vm.kmem_size: 16686845952
vm.kmem_size_scale: 1
[root@ ~]# sysctl hw.physmem hw.usermem hw.realmem
hw.physmem: 17151787008
hw.usermem: 8282652672
hw.realmem: 18253611008
[root@ ~]# sysctl hw.pagesize hw.pagesizes hw.availpages
hw.pagesize: 4096
hw.pagesizes: 4096 2097152 0
hw.availpages: 4187448

When I compare the vmstat and netstat output taken at boot time with later output, the major difference is in:
pf_temp 0 0K - 79309736 128 | pf_temp 1077640 134705K - 84330076 128
and in the core dump taken after the panic, the major vmstat difference is:
temp 110 15K - 76212305 16,32,64,128,256 | temp 117 6742215K - 655115 16,32,64,128,2
Specifically, I get this panic during an IP spoofing attack while syn-proxy is active. The relevant sysctl output is below:

kern.malloc_count: 315
vm.md_malloc_wait: 0
vfs.bufmallocspace: 0
vfs.maxmallocbufspace: 86269952
vm.kmem_size: 16686845952
vm.kmem_size_min: 0
vm.kmem_size_max: 329853485875
vm.kmem_size_scale: 1
vm.kmem_map_size: 543973376
vm.kmem_map_free: 15974895616
kern.maxvnodes: 350097
kern.minvnodes: 87524
vfs.numvnodes: 112329
vfs.wantfreevnodes: 87524
vfs.freevnodes: 87502

[root@ ~]# pfctl -si
No ALTQ support in kernel
ALTQ related functions disabled
Status: Enabled for 0 days 00:17:39           Debug: Urgent

State Table                          Total             Rate
  current entries                  5142886
  searches                        26982141        25478.9/s
  inserts                         29055053        27436.3/s
  removals                        24218654        22869.4/s
Counters
  match                           24901305        23514.0/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                             18            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                         0            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                        29378439        27741.7/s

[root@ ~]# panic: kmem_malloc(-1 814 425 600): kmem_map too small: 543 956 992 total allocated
cpuid = 8
Uptime: 1d18h2m14s
(ada0:ahcich1:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
(ada0:ahcich1:0:0:0): CAM status: CCB request is in progress
(ada0:ahcich1:0:0:0): Error 5, Retries exhausted
(ada0:ahcich1:0:0:0): Synchronize cache failed
(ada1:ahcich2:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
(ada1:ahcich2:0:0:0): CAM status: CCB request is in progress
(ada1:ahcich2:0:0:0): Error 5, Retries exhausted
(ada1:ahcich2:0:0:0): Synchronize cache failed
Dumping 8243 out of 16357 MB:..1%..11%

Reading the kernel source (vm_kern.c and vm_map.c), I see that the panic can occur in the following cases:

* the malloc size parameter is negative

* the requested size is larger than the free space between the kmem_map min_offset and max_offset values

* an allocation is attempted when the root entry of the map is its rightmost entry

* the requested size is larger than the map's max_free value

I think the panic occurs during mbuf creation, when malloc() fails to allocate memory, but I do not understand why one of these panic cases is triggered. Memory is almost entirely free, yet the system reports that kmem_map is too small while only about 0.5GB is in use.
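
The "about 0.5GB" figure can be checked against the sysctl values quoted earlier. The following is a minimal sketch of that arithmetic only; the constants are copied from the vm.kmem_size and vm.kmem_map_size lines above, and the helper names to_mib and used_fraction are hypothetical, introduced just for this illustration.

```python
# Values copied from the sysctl output in this report.
KMEM_SIZE = 16686845952      # vm.kmem_size
KMEM_MAP_SIZE = 543973376    # vm.kmem_map_size


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (hypothetical helper)."""
    return num_bytes / (1 << 20)


def used_fraction(used: int, total: int) -> float:
    """Fraction of the map that is in use (hypothetical helper)."""
    return used / total


if __name__ == "__main__":
    print(f"kmem_map in use: {to_mib(KMEM_MAP_SIZE):.1f} MiB")
    print(f"kmem_size:       {to_mib(KMEM_SIZE):.1f} MiB")
    print(f"fraction used:   {used_fraction(KMEM_MAP_SIZE, KMEM_SIZE):.2%}")
```

With these numbers the map holds roughly 519 MiB out of about 15914 MiB, a little over 3%, which matches the observation that the map is far from full when the panic is reported.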

The negative value printed is the size parameter passed to malloc (after some page-size alignment rounding). This parameter is declared as "unsigned long" but is printed with "%ld", that is, as a signed long. So if the size is extremely large (2^63 or more on amd64), signed printing treats the top bit as the sign bit and prints a minus sign. But I do not think the size can really be that large, so this looks like a bug: either the size parameter arrives as a negative value, or the rounding functions corrupt it.
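
To make the signed/unsigned reading concrete, the sketch below reinterprets the two negative sizes from the panic messages as unsigned bit patterns. It is plain two's-complement arithmetic, not kernel code; the helper names as_unsigned and as_signed are hypothetical. The 32-bit column is included only because both values happen to fit in a 32-bit signed integer. Whether a 32-bit truncation actually happens anywhere on the allocation path is an assumption that this report does not establish.

```python
# Negative sizes copied from the two panic messages in this report.
REPORTED_SIZES = [-548663296, -1814425600]


def as_unsigned(value: int, bits: int) -> int:
    """Return the two's-complement bit pattern of value, read as unsigned."""
    return value & ((1 << bits) - 1)


def as_signed(value: int, bits: int) -> int:
    """Read an unsigned bit pattern of the given width as a signed integer."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


if __name__ == "__main__":
    for printed in REPORTED_SIZES:
        print(f"printed with %ld:    {printed}")
        print(f"  as 64-bit unsigned: {as_unsigned(printed, 64)}")
        # Assumption: only meaningful if the size passed through a 32-bit int.
        print(f"  as 32-bit unsigned: {as_unsigned(printed, 32)}")
```

Read as a 64-bit unsigned long, -548663296 is 18446744073160888320, far beyond any plausible request. Read as a 32-bit pattern it is 3746304000 bytes (about 3.5 GiB), and -1814425600 is 2480541696 bytes (about 2.3 GiB).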
>How-To-Repeat:
At every IP spoofing attack while pf syn-proxy is active. At the same time, while the attack continues, I collect information with "pfctl -ss" from cron at 2-second intervals. After 15 seconds the panic occurs.
>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted: