amd64/134757: 32 bit processes on 64 bit platforms occasionally
drop core with bad ds reg
Stephen Sanders
ssanders at opnet.com
Wed May 20 16:00:13 UTC 2009
>Number: 134757
>Category: amd64
>Synopsis: 32 bit processes on 64 bit platforms occasionally drop core with bad ds reg
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: freebsd-amd64
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Wed May 20 16:00:12 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator: Stephen Sanders
>Release: 6.3 Release amd64
>Organization:
OPNET
>Environment:
FreeBSD alt-4100-2.lab.opnet.com 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Tue Mar 31 14:11:07 PDT 2009 pmai at focus7.networkphysics.com:/u1/builds/ping/NP/FreeBSD/package/NPbabkernel/bld-tmp/sys/amd64/compile/NPBAB amd64
>Description:
With fair regularity, 32-bit processes drop core on our 64-bit systems, perl and bash in particular.
Our system is definitely a hybrid, but that does not appear to be the issue. The system works properly more often than not.
I have attached a file containing two gdb sessions. One examines a core that bash left behind; the other examines a running bash session with no core.
In the core case, the core drop occurs when the instruction "cmpl $0x0,0x80d41d4" is executed. The registers show that
ss == ds == es == fs == gs == 0x23 (35), while cs == 0x1b (27)
In the non-core case, I halted execution at execute_command() and found that
ss == 0x23 (35), cs == 0x1b (27)
ds == es == fs == gs == 0x0
This sounds suspiciously like a bug that was fixed in 7.1. I believe the issue was in cpuswitch.S.
Porting the 32-bit processes to 64 bits is not currently an option.
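For readers comparing the two register dumps, the selector values can be split into their architectural fields. The following is only a sketch: the bit layout (index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0) is the standard x86 selector format, but the names decode_selector and selector_fields are made up for illustration, and which descriptor each index refers to depends on the kernel's descriptor tables, which this report does not show.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (names are illustrative only). */
struct selector_fields {
    unsigned index; /* descriptor table index, bits 15..3 */
    unsigned ti;    /* table indicator, bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* requested privilege level, bits 1..0 */
};

/* Split a raw selector value, as printed by "info registers", into fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1;
    f.rpl = sel & 3;
    return f;
}
```

With this decoding, 0x23 is index 4 in the GDT at RPL 3, 0x1b is index 3 in the GDT at RPL 3, and 0x0 is the null selector.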
>How-To-Repeat:
Fork a 32-bit process on a 64-bit FreeBSD 6.3 machine often enough and for long enough, for example once a minute.
Alternatively, fork a large number of 32-bit processes at boot time.
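One way to automate the first approach is a small driver that repeatedly forks, execs a 32-bit binary, and counts the children that die from a signal. This is a rough sketch, not the harness used to produce the attached sessions: the function names are hypothetical, and the caller is assumed to supply the path to a 32-bit executable (for example the bash binary from the report) in argv[0].

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, exec argv[0] in the child, and wait for it.
 * Returns 0 if the child exited on its own, the signal number if a
 * signal killed it (as in a core drop), or -1 on error.
 */
static int spawn_and_check(char *const argv[])
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}

/*
 * Run `count` children, `interval` seconds apart, and return how many
 * of them were killed by a signal.
 */
static int count_signaled(char *const argv[], int count, unsigned interval)
{
    int killed = 0;

    for (int i = 0; i < count; i++) {
        if (spawn_and_check(argv) > 0)
            killed++;
        if (i + 1 < count)
            sleep(interval);
    }
    return killed;
}
```

Calling count_signaled() with an interval of 60 matches the "once a minute" rate; a nonzero return value means at least one child was killed by a signal and is worth inspecting in gdb.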
>Fix:
None.
Patch attached with submission follows:
The following is gdb output from debugging a /usr/local/bin/bash
core drop. Note that ds == ss.
==============================================================================
(gdb)
(gdb) bt
#0 0x080759fe in kill_pid ()
#1 0x08074dc8 in wait_for ()
#2 0x08067d18 in execute_command_internal ()
#3 0x08068af1 in execute_command_internal ()
#4 0x08068cb3 in execute_command_internal ()
#5 0x08067ef5 in execute_command_internal ()
#6 0x08097317 in parse_and_execute ()
#7 0x0807cadc in command_substitute ()
#8 0x080801bc in pat_subst ()
#9 0x0807a6cc in cond_expand_word ()
#10 0x0807a76d in cond_expand_word ()
#11 0x0807a7c6 in expand_string_unsplit ()
#12 0x0807a44c in string_rest_of_args ()
#13 0x08079e7c in strip_trailing_ifs_whitespace ()
#14 0x0807a018 in do_assignment ()
#15 0x0808175e in expand_words_shellexp ()
#16 0x080811a4 in expand_words ()
#17 0x0806a95c in execute_command_internal ()
#18 0x08067cab in execute_command_internal ()
#19 0x08067796 in execute_command ()
#20 0x08068c63 in execute_command_internal ()
#21 0x08067ef5 in execute_command_internal ()
#22 0x08067796 in execute_command ()
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) info frame
Stack level 0, frame at 0xffffca40:
eip = 0x80759fe in kill_pid; saved eip 0x8074dc8
called by frame at 0xffffca90
Arglist at 0xffffca38, args:
Locals at 0xffffca38, Previous frame's sp is 0xffffca40
Saved registers:
ebx at 0xffffca2c, ebp at 0xffffca38, esi at 0xffffca30, edi at 0xffffca34,
eip at 0xffffca3c
(gdb) disassemble 0x80759fe
Dump of assembler code for function kill_pid:
0x0807579c <kill_pid+0>: push %ebp
0x0807579d <kill_pid+1>: mov %esp,%ebp
0x0807579f <kill_pid+3>: push %edi
0x080757a0 <kill_pid+4>: push %esi
0x080757a1 <kill_pid+5>: push %ebx
0x080757a2 <kill_pid+6>: sub $0x3c,%esp
0x080757a5 <kill_pid+9>: mov 0xc(%ebp),%edi
0x080757a8 <kill_pid+12>: movl $0x0,0xffffffc0(%ebp)
0x080757af <kill_pid+19>: cmpl $0x0,0x10(%ebp)
0x080757b3 <kill_pid+23>: je 0x8075948 <kill_pid+428>
0x080757b9 <kill_pid+29>: sub $0xc,%esp
0x080757bc <kill_pid+32>: lea 0xffffffd8(%ebp),%esi
0x080757bf <kill_pid+35>: push %esi
0x080757c0 <kill_pid+36>: call 0x8059908 <_init+316>
0x080757c5 <kill_pid+41>: add $0x8,%esp
0x080757c8 <kill_pid+44>: push $0x14
0x080757ca <kill_pid+46>: push %esi
0x080757cb <kill_pid+47>: call 0x8059b78 <_init+940>
0x080757d0 <kill_pid+52>: lea 0xffffffc8(%ebp),%ebx
0x080757d3 <kill_pid+55>: mov %ebx,(%esp)
0x080757d6 <kill_pid+58>: call 0x8059908 <_init+316>
0x080757db <kill_pid+63>: add $0xc,%esp
0x080757de <kill_pid+66>: push %ebx
0x080757df <kill_pid+67>: push %esi
0x080757e0 <kill_pid+68>: push $0x1
0x080757e2 <kill_pid+70>: call 0x805a2b8 <close+112>
0x080757e7 <kill_pid+75>: add $0xc,%esp
0x080757ea <kill_pid+78>: lea 0xffffffc4(%ebp),%eax
0x080757ed <kill_pid+81>: push %eax
0x080757ee <kill_pid+82>: push $0x0
0x080757f0 <kill_pid+84>: pushl 0x8(%ebp)
0x080757f3 <kill_pid+87>: call 0x8073d4c <kill_current_pipeline+20>
0x080757f8 <kill_pid+92>: mov %eax,%ebx
0x080757fa <kill_pid+94>: add $0x10,%esp
0x080757fd <kill_pid+97>: cmpl $0xffffffff,0xffffffc4(%ebp)
0x08075801 <kill_pid+101>: je 0x807591c <kill_pid+384>
0x08075807 <kill_pid+107>: mov 0xffffffc4(%ebp),%edx
0x0807580a <kill_pid+110>: mov 0x80d3d60,%eax
0x0807580f <kill_pid+115>: mov (%eax,%edx,4),%eax
0x08075812 <kill_pid+118>: andl $0xfffffffd,0x10(%eax)
0x08075816 <kill_pid+122>: mov 0xffffffc4(%ebp),%edx
0x08075819 <kill_pid+125>: mov 0x80d3d60,%eax
0x0807581e <kill_pid+130>: mov (%eax,%edx,4),%edx
0x08075821 <kill_pid+133>: mov 0x8(%edx),%eax
0x08075824 <kill_pid+136>: cmp 0x80cd618,%eax
0x0807582a <kill_pid+142>: jne 0x8075878 <kill_pid+220>
0x0807582c <kill_pid+144>: mov 0x4(%edx),%ebx
0x0807582f <kill_pid+147>: nop
0x08075830 <kill_pid+148>: sub $0x8,%esp
0x08075833 <kill_pid+151>: push %edi
0x08075834 <kill_pid+152>: pushl 0x4(%ebx)
0x08075837 <kill_pid+155>: call 0x8059ca8 <_init+1244>
0x0807583c <kill_pid+160>: add $0x10,%esp
0x0807583f <kill_pid+163>: cmpl $0x0,0xc(%ebx)
0x08075843 <kill_pid+167>: jne 0x8075860 <kill_pid+196>
0x08075845 <kill_pid+169>: cmp $0xf,%edi
0x08075848 <kill_pid+172>: je 0x807584f <kill_pid+179>
0x0807584a <kill_pid+174>: cmp $0x1,%edi
0x0807584d <kill_pid+177>: jne 0x8075860 <kill_pid+196>
0x0807584f <kill_pid+179>: sub $0x8,%esp
0x08075852 <kill_pid+182>: push $0x13
0x08075854 <kill_pid+184>: pushl 0x4(%ebx)
0x08075857 <kill_pid+187>: call 0x8059ca8 <_init+1244>
0x0807585c <kill_pid+192>: add $0x10,%esp
0x0807585f <kill_pid+195>: nop
0x08075860 <kill_pid+196>: mov (%ebx),%ebx
0x08075862 <kill_pid+198>: mov 0xffffffc4(%ebp),%eax
0x08075865 <kill_pid+201>: mov 0x80d3d60,%edx
0x0807586b <kill_pid+207>: mov (%edx,%eax,4),%eax
0x0807586e <kill_pid+210>: cmp %ebx,0x4(%eax)
0x08075871 <kill_pid+213>: jne 0x8075830 <kill_pid+148>
0x08075873 <kill_pid+215>: jmp 0x8075930 <kill_pid+404>
0x08075878 <kill_pid+220>: sub $0x8,%esp
0x0807587b <kill_pid+223>: push %edi
0x0807587c <kill_pid+224>: mov 0xffffffc4(%ebp),%eax
0x0807587f <kill_pid+227>: mov 0x80d3d60,%edx
0x08075885 <kill_pid+233>: mov (%edx,%eax,4),%eax
0x08075888 <kill_pid+236>: pushl 0x8(%eax)
0x0807588b <kill_pid+239>: call 0x8059e88 <unlink+160>
0x08075890 <kill_pid+244>: mov %eax,0xffffffc0(%ebp)
0x08075893 <kill_pid+247>: add $0x10,%esp
0x08075896 <kill_pid+250>: test %ebx,%ebx
0x08075898 <kill_pid+252>: je 0x8075930 <kill_pid+404>
0x0807589e <kill_pid+258>: mov 0xffffffc4(%ebp),%eax
0x080758a1 <kill_pid+261>: mov 0x80d3d60,%edx
0x080758a7 <kill_pid+267>: mov (%edx,%eax,4),%eax
0x080758aa <kill_pid+270>: cmpl $0x1,0xc(%eax)
0x080758ae <kill_pid+274>: jne 0x80758d6 <kill_pid+314>
0x080758b0 <kill_pid+276>: cmp $0xf,%edi
0x080758b3 <kill_pid+279>: je 0x80758ba <kill_pid+286>
0x080758b5 <kill_pid+281>: cmp $0x1,%edi
0x080758b8 <kill_pid+284>: jne 0x80758d6 <kill_pid+314>
0x080758ba <kill_pid+286>: sub $0x8,%esp
0x080758bd <kill_pid+289>: push $0x13
0x080758bf <kill_pid+291>: mov 0xffffffc4(%ebp),%eax
0x080758c2 <kill_pid+294>: mov 0x80d3d60,%edx
0x080758c8 <kill_pid+300>: mov (%edx,%eax,4),%eax
0x080758cb <kill_pid+303>: pushl 0x8(%eax)
0x080758ce <kill_pid+306>: call 0x8059e88 <unlink+160>
0x080758d3 <kill_pid+311>: add $0x10,%esp
0x080758d6 <kill_pid+314>: test %ebx,%ebx
0x080758d8 <kill_pid+316>: je 0x8075930 <kill_pid+404>
0x080758da <kill_pid+318>: mov 0xffffffc4(%ebp),%edx
0x080758dd <kill_pid+321>: mov 0x80d3d60,%eax
0x080758e2 <kill_pid+326>: mov (%eax,%edx,4),%eax
0x080758e5 <kill_pid+329>: cmpl $0x1,0xc(%eax)
0x080758e9 <kill_pid+333>: jne 0x8075930 <kill_pid+404>
0x080758eb <kill_pid+335>: cmp $0x13,%edi
0x080758ee <kill_pid+338>: jne 0x8075930 <kill_pid+404>
0x080758f0 <kill_pid+340>: push %edx
0x080758f1 <kill_pid+341>: call 0x807543c <reap_dead_jobs+580>
0x080758f6 <kill_pid+346>: mov 0xffffffc4(%ebp),%edx
0x080758f9 <kill_pid+349>: mov 0x80d3d60,%eax
0x080758fe <kill_pid+354>: mov (%eax,%edx,4),%eax
0x08075901 <kill_pid+357>: andl $0xfffffffe,0x10(%eax)
0x08075905 <kill_pid+361>: mov 0xffffffc4(%ebp),%edx
0x08075908 <kill_pid+364>: mov 0x80d3d60,%eax
0x0807590d <kill_pid+369>: mov (%eax,%edx,4),%eax
0x08075910 <kill_pid+372>: orl $0x2,0x10(%eax)
0x08075914 <kill_pid+376>: add $0x4,%esp
0x08075917 <kill_pid+379>: jmp 0x8075930 <kill_pid+404>
0x08075919 <kill_pid+381>: lea 0x0(%esi),%esi
0x0807591c <kill_pid+384>: sub $0x8,%esp
0x0807591f <kill_pid+387>: push %edi
0x08075920 <kill_pid+388>: pushl 0x8(%ebp)
0x08075923 <kill_pid+391>: call 0x8059e88 <unlink+160>
0x08075928 <kill_pid+396>: mov %eax,0xffffffc0(%ebp)
0x0807592b <kill_pid+399>: add $0x10,%esp
0x0807592e <kill_pid+402>: mov %esi,%esi
0x08075930 <kill_pid+404>: sub $0x4,%esp
0x08075933 <kill_pid+407>: push $0x0
0x08075935 <kill_pid+409>: lea 0xffffffc8(%ebp),%eax
0x08075938 <kill_pid+412>: push %eax
0x08075939 <kill_pid+413>: push $0x3
0x0807593b <kill_pid+415>: call 0x805a2b8 <close+112>
0x08075940 <kill_pid+420>: add $0x10,%esp
0x08075943 <kill_pid+423>: jmp 0x807595a <kill_pid+446>
0x08075945 <kill_pid+425>: lea 0x0(%esi),%esi
0x08075948 <kill_pid+428>: sub $0x8,%esp
0x0807594b <kill_pid+431>: push %edi
0x0807594c <kill_pid+432>: pushl 0x8(%ebp)
0x0807594f <kill_pid+435>: call 0x8059ca8 <_init+1244>
0x08075954 <kill_pid+440>: mov %eax,0xffffffc0(%ebp)
0x08075957 <kill_pid+443>: add $0x10,%esp
0x0807595a <kill_pid+446>: mov 0xffffffc0(%ebp),%eax
0x0807595d <kill_pid+449>: lea 0xfffffff4(%ebp),%esp
0x08075960 <kill_pid+452>: pop %ebx
0x08075961 <kill_pid+453>: pop %esi
0x08075962 <kill_pid+454>: pop %edi
0x08075963 <kill_pid+455>: leave
0x08075964 <kill_pid+456>: ret
0x08075965 <kill_pid+457>: lea 0x0(%esi),%esi
0x08075968 <kill_pid+460>: push %ebp
0x08075969 <kill_pid+461>: mov %esp,%ebp
0x0807596b <kill_pid+463>: push %ebx
0x0807596c <kill_pid+464>: sub $0x4,%esp
0x0807596f <kill_pid+467>: call 0x8059d58 <_init+1420>
0x08075974 <kill_pid+472>: mov (%eax),%ebx
0x08075976 <kill_pid+474>: incl 0x80d41d4
0x0807597c <kill_pid+480>: cmpl $0x0,0x80d41d8
0x08075983 <kill_pid+487>: jne 0x8075994 <kill_pid+504>
0x08075985 <kill_pid+489>: sub $0x8,%esp
0x08075988 <kill_pid+492>: push $0x0
0x0807598a <kill_pid+494>: push $0xffffffff
0x0807598c <kill_pid+496>: call 0x80759a0 <kill_pid+516>
0x08075991 <kill_pid+501>: add $0x10,%esp
0x08075994 <kill_pid+504>: call 0x8059d58 <_init+1420>
0x08075999 <kill_pid+509>: mov %ebx,(%eax)
0x0807599b <kill_pid+511>: mov 0xfffffffc(%ebp),%ebx
0x0807599e <kill_pid+514>: leave
0x0807599f <kill_pid+515>: ret
0x080759a0 <kill_pid+516>: push %ebp
0x080759a1 <kill_pid+517>: mov %esp,%ebp
0x080759a3 <kill_pid+519>: push %edi
0x080759a4 <kill_pid+520>: push %esi
0x080759a5 <kill_pid+521>: push %ebx
0x080759a6 <kill_pid+522>: sub $0x1c,%esp
0x080759a9 <kill_pid+525>: mov $0x0,%edi
0x080759ae <kill_pid+530>: movl $0x0,0xffffffe8(%ebp)
0x080759b5 <kill_pid+537>: movl $0xffffffff,0xffffffe4(%ebp)
0x080759bc <kill_pid+544>: cmpl $0x0,0x80cd634
0x080759c3 <kill_pid+551>: je 0x80759d3 <kill_pid+567>
0x080759c5 <kill_pid+553>: mov $0x6,%esi
0x080759ca <kill_pid+558>: cmpl $0x0,0x80d5454
0x080759d1 <kill_pid+565>: je 0x80759d8 <kill_pid+572>
0x080759d3 <kill_pid+567>: mov $0x0,%esi
0x080759d8 <kill_pid+572>: cmpl $0x0,0x80d41d4
0x080759df <kill_pid+579>: jne 0x80759e7 <kill_pid+587>
0x080759e1 <kill_pid+581>: cmpl $0x0,0xc(%ebp)
0x080759e5 <kill_pid+585>: jne 0x80759ea <kill_pid+590>
0x080759e7 <kill_pid+587>: or $0x1,%esi
0x080759ea <kill_pid+590>: sub $0x4,%esp
0x080759ed <kill_pid+593>: push %esi
0x080759ee <kill_pid+594>: lea 0xfffffff0(%ebp),%eax
0x080759f1 <kill_pid+597>: push %eax
0x080759f2 <kill_pid+598>: push $0xffffffff
0x080759f4 <kill_pid+600>: call 0x8059838 <_init+108>
0x080759f9 <kill_pid+605>: mov %eax,%ebx
0x080759fb <kill_pid+607>: add $0x10,%esp
0x080759fe <kill_pid+610>: cmpl $0x0,0x80d41d4
0x08075a05 <kill_pid+617>: jle 0x8075a15 <kill_pid+633>
(gdb) info registers
eax 0x369 873
ecx 0x0 0
edx 0x0 0
ebx 0x369 873
esp 0xffffca10 0xffffca10
ebp 0xffffca38 0xffffca38
esi 0x0 0
edi 0x0 0
eip 0x80759fe 0x80759fe
eflags 0x10282 66178
cs 0x1b 27
ss 0x23 35
ds 0x23 35
es 0x23 35
fs 0x23 35
gs 0x23 35
===============================================================================
The following is a gdb session that is not a core drop. Here ds != ss
===============================================================================
alt-4100-2[1036] # gdb /usr/local/bin/bash
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
(gdb) break execute_command
Breakpoint 1 at 0x8067755
(gdb) run
Starting program: /usr/local/bin/bash
bash-3.00# ls
Breakpoint 1, 0x08067755 in execute_command ()
(gdb) bt
#0 0x08067755 in execute_command ()
#1 0x0805c749 in reader_loop ()
#2 0x0805aba0 in main ()
(gdb) info frame
Stack level 0, frame at 0xffffdc00:
eip = 0x8067755 in execute_command; saved eip 0x805c749
called by frame at 0xffffdc30
Arglist at 0xffffdbf8, args:
Locals at 0xffffdbf8, Previous frame's sp is 0xffffdc00
Saved registers:
ebx at 0xffffdbf0, ebp at 0xffffdbf8, esi at 0xffffdbf4, eip at 0xffffdbfc
(gdb) info registers
eax 0x80dd5a0 135124384
ecx 0x0 0
edx 0xffffd400 -11264
ebx 0x0 0
esp 0xffffdbf0 0xffffdbf0
ebp 0xffffdbf8 0xffffdbf8
esi 0x0 0
edi 0xffffdcd4 -9004
eip 0x8067755 0x8067755
eflags 0x292 658
cs 0x1b 27
ss 0x23 35
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
>Release-Note:
>Audit-Trail:
>Unformatted: