amd64/134757: 32 bit processes on 64 bit platforms occasionally drop core with bad ds reg

Stephen Sanders ssanders at opnet.com
Wed May 20 16:00:13 UTC 2009


>Number:         134757
>Category:       amd64
>Synopsis:       32 bit processes on 64 bit platforms occasionally drop core with bad ds reg
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-amd64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed May 20 16:00:12 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Stephen Sanders
>Release:        6.3 Release amd64
>Organization:
OPNET
>Environment:
FreeBSD alt-4100-2.lab.opnet.com 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Tue Mar 31 14:11:07 PDT 2009     pmai at focus7.networkphysics.com:/u1/builds/ping/NP/FreeBSD/package/NPbabkernel/bld-tmp/sys/amd64/compile/NPBAB  amd64
>Description:
With fair regularity, 32 bit processes drop core on our 64 bit systems, perl and bash in particular.

Our system is definitely a hybrid, but that does not appear to be the issue.  The system works properly more often than not.

I have attached a file containing 2 gdb sessions. One session is looking at a core that bash left behind and the other is looking at a bash session with no core.

In the core case, the core drop occurs when the instruction "cmpl $0x0,0x80d41d4" is executed.  Checking the registers, one will see that

ss == ds == es == fs == gs == 0x23 (35), with cs == 0x1b (27)

In the non-core case, I halted execution on execute_command() and found that 

ss == 0x23 (35)

ds == es == fs == gs == 0x0
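
For anyone comparing the two register dumps, the selector values can be split into their architectural fields (descriptor index, table indicator, requested privilege level). The helper below is only an illustrative sketch: the names selector_fields and decode_selector are made up for this note and do not come from the FreeBSD sources. It decodes the numbers and says nothing about which descriptors the 6.3 kernel installs at those indices.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (hypothetical helper type). */
struct selector_fields {
    unsigned index; /* bits 15..3: index into the descriptor table */
    unsigned ti;    /* bit 2: table indicator, 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level */
};

/* Split a 16 bit selector value into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1u;
    f.rpl = sel & 3u;
    return f;
}
```

With this decoding, 0x23 is GDT index 4 at RPL 3, 0x1b is GDT index 3 at RPL 3, and 0x0 is the null selector.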

This sounds suspiciously like a bug that was fixed in 7.1.  I believe the issue was in cpuswitch.S.

Porting the 32 bit processes to 64 bits is not currently an option.
>How-To-Repeat:
Fork a 32 bit process on a 64 bit FreeBSD 6.3 machine often enough and for long enough, for example once a minute.

Alternatively, fork a large number of 32 bit processes at boot time.
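
The repeat steps above could be driven by a small harness along the following lines. This is a minimal sketch and not the program we actually run: the function name count_signal_deaths is made up for illustration, and the caller is assumed to pass the path of a 32 bit binary (for example the /usr/local/bin/bash from this report) as argv[0]. It forks and execs the binary repeatedly and counts how many children were terminated by a signal.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] `iterations` times, sleeping `delay_seconds` between runs.
 * Returns the number of children terminated by a signal, or -1 if
 * fork() or waitpid() fails.  (Hypothetical helper, not from the report.)
 */
static int count_signal_deaths(char *const argv[], int iterations,
                               unsigned delay_seconds)
{
    int deaths = 0;

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }
        if (pid == 0) {
            /* Child: replace ourselves with the program under test. */
            execv(argv[0], argv);
            _exit(127); /* only reached if execv() failed */
        }

        int status;
        if (waitpid(pid, &status, 0) < 0) {
            perror("waitpid");
            return -1;
        }
        if (WIFSIGNALED(status)) {
            deaths++;
            fprintf(stderr, "run %d: pid %ld killed by signal %d\n",
                    i, (long)pid, WTERMSIG(status));
        }
        if (delay_seconds > 0 && i + 1 < iterations)
            sleep(delay_seconds);
    }
    return deaths;
}
```

Calling it with a delay of 60 seconds matches the "once a minute" rate; a delay of 0 and a large iteration count approximates the boot time burst. Any nonzero return value means at least one child died on a signal and is worth inspecting with gdb.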

>Fix:
None.

Patch attached with submission follows:

The following is gdb output from debugging a /usr/local/bin/bash 
core drop. Note that ds == ss.
==============================================================================
(gdb) 
(gdb) bt
#0  0x080759fe in kill_pid ()
#1  0x08074dc8 in wait_for ()
#2  0x08067d18 in execute_command_internal ()
#3  0x08068af1 in execute_command_internal ()
#4  0x08068cb3 in execute_command_internal ()
#5  0x08067ef5 in execute_command_internal ()
#6  0x08097317 in parse_and_execute ()
#7  0x0807cadc in command_substitute ()
#8  0x080801bc in pat_subst ()
#9  0x0807a6cc in cond_expand_word ()
#10 0x0807a76d in cond_expand_word ()
#11 0x0807a7c6 in expand_string_unsplit ()
#12 0x0807a44c in string_rest_of_args ()
#13 0x08079e7c in strip_trailing_ifs_whitespace ()
#14 0x0807a018 in do_assignment ()
#15 0x0808175e in expand_words_shellexp ()
#16 0x080811a4 in expand_words ()
#17 0x0806a95c in execute_command_internal ()
#18 0x08067cab in execute_command_internal ()
#19 0x08067796 in execute_command ()
#20 0x08068c63 in execute_command_internal ()
#21 0x08067ef5 in execute_command_internal ()
#22 0x08067796 in execute_command ()
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) info frame
Stack level 0, frame at 0xffffca40:
 eip = 0x80759fe in kill_pid; saved eip 0x8074dc8
 called by frame at 0xffffca90
 Arglist at 0xffffca38, args: 
 Locals at 0xffffca38, Previous frame's sp is 0xffffca40
 Saved registers:
  ebx at 0xffffca2c, ebp at 0xffffca38, esi at 0xffffca30, edi at 0xffffca34,
  eip at 0xffffca3c
(gdb) disassemble 0x80759fe
Dump of assembler code for function kill_pid:
0x0807579c <kill_pid+0>:	push   %ebp
0x0807579d <kill_pid+1>:	mov    %esp,%ebp
0x0807579f <kill_pid+3>:	push   %edi
0x080757a0 <kill_pid+4>:	push   %esi
0x080757a1 <kill_pid+5>:	push   %ebx
0x080757a2 <kill_pid+6>:	sub    $0x3c,%esp
0x080757a5 <kill_pid+9>:	mov    0xc(%ebp),%edi
0x080757a8 <kill_pid+12>:	movl   $0x0,0xffffffc0(%ebp)
0x080757af <kill_pid+19>:	cmpl   $0x0,0x10(%ebp)
0x080757b3 <kill_pid+23>:	je     0x8075948 <kill_pid+428>
0x080757b9 <kill_pid+29>:	sub    $0xc,%esp
0x080757bc <kill_pid+32>:	lea    0xffffffd8(%ebp),%esi
0x080757bf <kill_pid+35>:	push   %esi
0x080757c0 <kill_pid+36>:	call   0x8059908 <_init+316>
0x080757c5 <kill_pid+41>:	add    $0x8,%esp
0x080757c8 <kill_pid+44>:	push   $0x14
0x080757ca <kill_pid+46>:	push   %esi
0x080757cb <kill_pid+47>:	call   0x8059b78 <_init+940>
0x080757d0 <kill_pid+52>:	lea    0xffffffc8(%ebp),%ebx
0x080757d3 <kill_pid+55>:	mov    %ebx,(%esp)
0x080757d6 <kill_pid+58>:	call   0x8059908 <_init+316>
0x080757db <kill_pid+63>:	add    $0xc,%esp
0x080757de <kill_pid+66>:	push   %ebx
0x080757df <kill_pid+67>:	push   %esi
0x080757e0 <kill_pid+68>:	push   $0x1
0x080757e2 <kill_pid+70>:	call   0x805a2b8 <close+112>
0x080757e7 <kill_pid+75>:	add    $0xc,%esp
0x080757ea <kill_pid+78>:	lea    0xffffffc4(%ebp),%eax
0x080757ed <kill_pid+81>:	push   %eax
0x080757ee <kill_pid+82>:	push   $0x0
0x080757f0 <kill_pid+84>:	pushl  0x8(%ebp)
0x080757f3 <kill_pid+87>:	call   0x8073d4c <kill_current_pipeline+20>
0x080757f8 <kill_pid+92>:	mov    %eax,%ebx
0x080757fa <kill_pid+94>:	add    $0x10,%esp
0x080757fd <kill_pid+97>:	cmpl   $0xffffffff,0xffffffc4(%ebp)
0x08075801 <kill_pid+101>:	je     0x807591c <kill_pid+384>
0x08075807 <kill_pid+107>:	mov    0xffffffc4(%ebp),%edx
0x0807580a <kill_pid+110>:	mov    0x80d3d60,%eax
0x0807580f <kill_pid+115>:	mov    (%eax,%edx,4),%eax
0x08075812 <kill_pid+118>:	andl   $0xfffffffd,0x10(%eax)
0x08075816 <kill_pid+122>:	mov    0xffffffc4(%ebp),%edx
0x08075819 <kill_pid+125>:	mov    0x80d3d60,%eax
0x0807581e <kill_pid+130>:	mov    (%eax,%edx,4),%edx
0x08075821 <kill_pid+133>:	mov    0x8(%edx),%eax
0x08075824 <kill_pid+136>:	cmp    0x80cd618,%eax
0x0807582a <kill_pid+142>:	jne    0x8075878 <kill_pid+220>
0x0807582c <kill_pid+144>:	mov    0x4(%edx),%ebx
0x0807582f <kill_pid+147>:	nop    
0x08075830 <kill_pid+148>:	sub    $0x8,%esp
0x08075833 <kill_pid+151>:	push   %edi
0x08075834 <kill_pid+152>:	pushl  0x4(%ebx)
0x08075837 <kill_pid+155>:	call   0x8059ca8 <_init+1244>
0x0807583c <kill_pid+160>:	add    $0x10,%esp
0x0807583f <kill_pid+163>:	cmpl   $0x0,0xc(%ebx)
0x08075843 <kill_pid+167>:	jne    0x8075860 <kill_pid+196>
0x08075845 <kill_pid+169>:	cmp    $0xf,%edi
0x08075848 <kill_pid+172>:	je     0x807584f <kill_pid+179>
0x0807584a <kill_pid+174>:	cmp    $0x1,%edi
0x0807584d <kill_pid+177>:	jne    0x8075860 <kill_pid+196>
0x0807584f <kill_pid+179>:	sub    $0x8,%esp
0x08075852 <kill_pid+182>:	push   $0x13
0x08075854 <kill_pid+184>:	pushl  0x4(%ebx)
0x08075857 <kill_pid+187>:	call   0x8059ca8 <_init+1244>
0x0807585c <kill_pid+192>:	add    $0x10,%esp
0x0807585f <kill_pid+195>:	nop    
0x08075860 <kill_pid+196>:	mov    (%ebx),%ebx
0x08075862 <kill_pid+198>:	mov    0xffffffc4(%ebp),%eax
0x08075865 <kill_pid+201>:	mov    0x80d3d60,%edx
0x0807586b <kill_pid+207>:	mov    (%edx,%eax,4),%eax
0x0807586e <kill_pid+210>:	cmp    %ebx,0x4(%eax)
0x08075871 <kill_pid+213>:	jne    0x8075830 <kill_pid+148>
0x08075873 <kill_pid+215>:	jmp    0x8075930 <kill_pid+404>
0x08075878 <kill_pid+220>:	sub    $0x8,%esp
0x0807587b <kill_pid+223>:	push   %edi
0x0807587c <kill_pid+224>:	mov    0xffffffc4(%ebp),%eax
0x0807587f <kill_pid+227>:	mov    0x80d3d60,%edx
0x08075885 <kill_pid+233>:	mov    (%edx,%eax,4),%eax
0x08075888 <kill_pid+236>:	pushl  0x8(%eax)
0x0807588b <kill_pid+239>:	call   0x8059e88 <unlink+160>
0x08075890 <kill_pid+244>:	mov    %eax,0xffffffc0(%ebp)
0x08075893 <kill_pid+247>:	add    $0x10,%esp
0x08075896 <kill_pid+250>:	test   %ebx,%ebx
0x08075898 <kill_pid+252>:	je     0x8075930 <kill_pid+404>
0x0807589e <kill_pid+258>:	mov    0xffffffc4(%ebp),%eax
0x080758a1 <kill_pid+261>:	mov    0x80d3d60,%edx
0x080758a7 <kill_pid+267>:	mov    (%edx,%eax,4),%eax
0x080758aa <kill_pid+270>:	cmpl   $0x1,0xc(%eax)
0x080758ae <kill_pid+274>:	jne    0x80758d6 <kill_pid+314>
0x080758b0 <kill_pid+276>:	cmp    $0xf,%edi
0x080758b3 <kill_pid+279>:	je     0x80758ba <kill_pid+286>
0x080758b5 <kill_pid+281>:	cmp    $0x1,%edi
0x080758b8 <kill_pid+284>:	jne    0x80758d6 <kill_pid+314>
0x080758ba <kill_pid+286>:	sub    $0x8,%esp
0x080758bd <kill_pid+289>:	push   $0x13
0x080758bf <kill_pid+291>:	mov    0xffffffc4(%ebp),%eax
0x080758c2 <kill_pid+294>:	mov    0x80d3d60,%edx
0x080758c8 <kill_pid+300>:	mov    (%edx,%eax,4),%eax
0x080758cb <kill_pid+303>:	pushl  0x8(%eax)
0x080758ce <kill_pid+306>:	call   0x8059e88 <unlink+160>
0x080758d3 <kill_pid+311>:	add    $0x10,%esp
0x080758d6 <kill_pid+314>:	test   %ebx,%ebx
0x080758d8 <kill_pid+316>:	je     0x8075930 <kill_pid+404>
0x080758da <kill_pid+318>:	mov    0xffffffc4(%ebp),%edx
0x080758dd <kill_pid+321>:	mov    0x80d3d60,%eax
0x080758e2 <kill_pid+326>:	mov    (%eax,%edx,4),%eax
0x080758e5 <kill_pid+329>:	cmpl   $0x1,0xc(%eax)
0x080758e9 <kill_pid+333>:	jne    0x8075930 <kill_pid+404>
0x080758eb <kill_pid+335>:	cmp    $0x13,%edi
0x080758ee <kill_pid+338>:	jne    0x8075930 <kill_pid+404>
0x080758f0 <kill_pid+340>:	push   %edx
0x080758f1 <kill_pid+341>:	call   0x807543c <reap_dead_jobs+580>
0x080758f6 <kill_pid+346>:	mov    0xffffffc4(%ebp),%edx
0x080758f9 <kill_pid+349>:	mov    0x80d3d60,%eax
0x080758fe <kill_pid+354>:	mov    (%eax,%edx,4),%eax
0x08075901 <kill_pid+357>:	andl   $0xfffffffe,0x10(%eax)
0x08075905 <kill_pid+361>:	mov    0xffffffc4(%ebp),%edx
0x08075908 <kill_pid+364>:	mov    0x80d3d60,%eax
0x0807590d <kill_pid+369>:	mov    (%eax,%edx,4),%eax
0x08075910 <kill_pid+372>:	orl    $0x2,0x10(%eax)
0x08075914 <kill_pid+376>:	add    $0x4,%esp
0x08075917 <kill_pid+379>:	jmp    0x8075930 <kill_pid+404>
0x08075919 <kill_pid+381>:	lea    0x0(%esi),%esi
0x0807591c <kill_pid+384>:	sub    $0x8,%esp
0x0807591f <kill_pid+387>:	push   %edi
0x08075920 <kill_pid+388>:	pushl  0x8(%ebp)
0x08075923 <kill_pid+391>:	call   0x8059e88 <unlink+160>
0x08075928 <kill_pid+396>:	mov    %eax,0xffffffc0(%ebp)
0x0807592b <kill_pid+399>:	add    $0x10,%esp
0x0807592e <kill_pid+402>:	mov    %esi,%esi
0x08075930 <kill_pid+404>:	sub    $0x4,%esp
0x08075933 <kill_pid+407>:	push   $0x0
0x08075935 <kill_pid+409>:	lea    0xffffffc8(%ebp),%eax
0x08075938 <kill_pid+412>:	push   %eax
0x08075939 <kill_pid+413>:	push   $0x3
0x0807593b <kill_pid+415>:	call   0x805a2b8 <close+112>
0x08075940 <kill_pid+420>:	add    $0x10,%esp
0x08075943 <kill_pid+423>:	jmp    0x807595a <kill_pid+446>
0x08075945 <kill_pid+425>:	lea    0x0(%esi),%esi
0x08075948 <kill_pid+428>:	sub    $0x8,%esp
0x0807594b <kill_pid+431>:	push   %edi
0x0807594c <kill_pid+432>:	pushl  0x8(%ebp)
0x0807594f <kill_pid+435>:	call   0x8059ca8 <_init+1244>
0x08075954 <kill_pid+440>:	mov    %eax,0xffffffc0(%ebp)
0x08075957 <kill_pid+443>:	add    $0x10,%esp
0x0807595a <kill_pid+446>:	mov    0xffffffc0(%ebp),%eax
0x0807595d <kill_pid+449>:	lea    0xfffffff4(%ebp),%esp
0x08075960 <kill_pid+452>:	pop    %ebx
0x08075961 <kill_pid+453>:	pop    %esi
0x08075962 <kill_pid+454>:	pop    %edi
0x08075963 <kill_pid+455>:	leave  
0x08075964 <kill_pid+456>:	ret    
0x08075965 <kill_pid+457>:	lea    0x0(%esi),%esi
0x08075968 <kill_pid+460>:	push   %ebp
0x08075969 <kill_pid+461>:	mov    %esp,%ebp
0x0807596b <kill_pid+463>:	push   %ebx
0x0807596c <kill_pid+464>:	sub    $0x4,%esp
0x0807596f <kill_pid+467>:	call   0x8059d58 <_init+1420>
0x08075974 <kill_pid+472>:	mov    (%eax),%ebx
0x08075976 <kill_pid+474>:	incl   0x80d41d4
0x0807597c <kill_pid+480>:	cmpl   $0x0,0x80d41d8
0x08075983 <kill_pid+487>:	jne    0x8075994 <kill_pid+504>
0x08075985 <kill_pid+489>:	sub    $0x8,%esp
0x08075988 <kill_pid+492>:	push   $0x0
0x0807598a <kill_pid+494>:	push   $0xffffffff
0x0807598c <kill_pid+496>:	call   0x80759a0 <kill_pid+516>
0x08075991 <kill_pid+501>:	add    $0x10,%esp
0x08075994 <kill_pid+504>:	call   0x8059d58 <_init+1420>
0x08075999 <kill_pid+509>:	mov    %ebx,(%eax)
0x0807599b <kill_pid+511>:	mov    0xfffffffc(%ebp),%ebx
0x0807599e <kill_pid+514>:	leave  
0x0807599f <kill_pid+515>:	ret    
0x080759a0 <kill_pid+516>:	push   %ebp
0x080759a1 <kill_pid+517>:	mov    %esp,%ebp
0x080759a3 <kill_pid+519>:	push   %edi
0x080759a4 <kill_pid+520>:	push   %esi
0x080759a5 <kill_pid+521>:	push   %ebx
0x080759a6 <kill_pid+522>:	sub    $0x1c,%esp
0x080759a9 <kill_pid+525>:	mov    $0x0,%edi
0x080759ae <kill_pid+530>:	movl   $0x0,0xffffffe8(%ebp)
0x080759b5 <kill_pid+537>:	movl   $0xffffffff,0xffffffe4(%ebp)
0x080759bc <kill_pid+544>:	cmpl   $0x0,0x80cd634
0x080759c3 <kill_pid+551>:	je     0x80759d3 <kill_pid+567>
0x080759c5 <kill_pid+553>:	mov    $0x6,%esi
0x080759ca <kill_pid+558>:	cmpl   $0x0,0x80d5454
0x080759d1 <kill_pid+565>:	je     0x80759d8 <kill_pid+572>
0x080759d3 <kill_pid+567>:	mov    $0x0,%esi
0x080759d8 <kill_pid+572>:	cmpl   $0x0,0x80d41d4
0x080759df <kill_pid+579>:	jne    0x80759e7 <kill_pid+587>
0x080759e1 <kill_pid+581>:	cmpl   $0x0,0xc(%ebp)
0x080759e5 <kill_pid+585>:	jne    0x80759ea <kill_pid+590>
0x080759e7 <kill_pid+587>:	or     $0x1,%esi
0x080759ea <kill_pid+590>:	sub    $0x4,%esp
0x080759ed <kill_pid+593>:	push   %esi
0x080759ee <kill_pid+594>:	lea    0xfffffff0(%ebp),%eax
0x080759f1 <kill_pid+597>:	push   %eax
0x080759f2 <kill_pid+598>:	push   $0xffffffff
0x080759f4 <kill_pid+600>:	call   0x8059838 <_init+108>
0x080759f9 <kill_pid+605>:	mov    %eax,%ebx
0x080759fb <kill_pid+607>:	add    $0x10,%esp
0x080759fe <kill_pid+610>:	cmpl   $0x0,0x80d41d4
0x08075a05 <kill_pid+617>:	jle    0x8075a15 <kill_pid+633>

(gdb) info registers
eax            0x369	873
ecx            0x0	0
edx            0x0	0
ebx            0x369	873
esp            0xffffca10	0xffffca10
ebp            0xffffca38	0xffffca38
esi            0x0	0
edi            0x0	0
eip            0x80759fe	0x80759fe
eflags         0x10282	66178
cs             0x1b	27
ss             0x23	35
ds             0x23	35
es             0x23	35
fs             0x23	35
gs             0x23	35

===============================================================================
The following is a gdb session that is not a core drop.  Here ds != ss
===============================================================================

alt-4100-2[1036] # gdb /usr/local/bin/bash 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
(gdb) break execute_command
Breakpoint 1 at 0x8067755
(gdb) run
Starting program: /usr/local/bin/bash 
bash-3.00# ls

Breakpoint 1, 0x08067755 in execute_command ()
(gdb) bt
#0  0x08067755 in execute_command ()
#1  0x0805c749 in reader_loop ()
#2  0x0805aba0 in main ()
(gdb) info frame
Stack level 0, frame at 0xffffdc00:
 eip = 0x8067755 in execute_command; saved eip 0x805c749
 called by frame at 0xffffdc30
 Arglist at 0xffffdbf8, args: 
 Locals at 0xffffdbf8, Previous frame's sp is 0xffffdc00
 Saved registers:
  ebx at 0xffffdbf0, ebp at 0xffffdbf8, esi at 0xffffdbf4, eip at 0xffffdbfc
(gdb) info registers
eax            0x80dd5a0	135124384
ecx            0x0	0
edx            0xffffd400	-11264
ebx            0x0	0
esp            0xffffdbf0	0xffffdbf0
ebp            0xffffdbf8	0xffffdbf8
esi            0x0	0
edi            0xffffdcd4	-9004
eip            0x8067755	0x8067755
eflags         0x292	658
cs             0x1b	27
ss             0x23	35
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0
(gdb) 



>Release-Note:
>Audit-Trail:
>Unformatted:

