cvs commit: src/include _ctype.h
Bruce Evans
brde at optusnet.com.au
Thu Nov 1 07:21:21 PDT 2007
On Thu, 1 Nov 2007, Christoph Mallon wrote:
> Andrey Chernov wrote:
>> On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
>>> * "Andrey A. Chernov" <ache at FreeBSD.org> [ 2007-10-27 ]
>>> [ cvs commit: src/include _ctype.h ]
>>>> ache 2007-10-27 22:32:28 UTC
>>>>
>>>> FreeBSD src repository
>>>>
>>>> Modified files:
>>>> include _ctype.h
>>>> Log:
>>>> Micro-optimization of prev. commit, change
>>>> (_c < 0 || _c >= 128) to (_c & ~0x7F)
>>> Isn't that a non-optimization in code and a minor pessimization of
>>> readability?
>> ...
>> For those who doubt, here are two tests compiled with -O2. As you may see, the
>> result is almost identical (andl vs cmpl):
We never doubted that it was a small negative or non-optimization :-).
Look closer and you will see that the andl version takes 2 extra
instructions. Both versions are smart enough to avoid a branch, and
for this they need the result of the test in the condition codes. The
cmpl generates the desired condition code directly, while the andl
version needs 2 more instructions to get there.
>> -------------------- a.c --------------------
>> main () {
>>
>> int c;
>>
>> return (c & ~0x7f) ? 0 : c * 2;
>> }
This example has many flaws, as pointed out by Christoph:
- c is uninitialized
- the result depends on c in a way that is quite different from the table
  lookup for ctype. The above expression happens to be more optimizable.
>> -------------------- a.s --------------------
>> .file "a.c"
>> .text
>> .p2align 4,,15
>> .globl main
>> .type main, @function
>> main:
>> leal 4(%esp), %ecx
>> andl $-16, %esp
>> pushl -4(%ecx)
>> movl %eax, %edx         <--- extra instruction since andl
                                clobbers a register. Normally,
                                testl should be used to avoid
                                this clobber.
>> andl $-128, %edx        <--- this sets %edx to something
                                and also sets the condition
                                codes, but not like we want
>> addl %eax, %eax         <--- c * 2
>> cmpl $1, %edx           <--- this sets the condition codes
                                like we want
>> sbbl %edx, %edx         <--- turn condition codes into a
                                mask in %edx: mask = 0xffffffff
                                if the result should be c * 2
                                and mask = 0 if the result should
                                be 0
>> pushl %ebp
>> andl %edx, %eax         <--- result = (c * 2) & mask
>> movl %esp, %ebp         <--- why is it bothering to set up
                                a frame this late?
>> pushl %ecx
>> popl %ecx
>> popl %ebp
>> leal -4(%ecx), %esp
>> ret
>> .size main, .-main
>> .ident "GCC: (GNU) 4.2.1 20070719 [FreeBSD]"
>> -------------------- a1.c --------------------
>> main () {
>>
>> int c;
>>
>> return (c < 0 || c >= 128) ? 0 : c * 2;
>>
>>
>> }
>> -------------------- a1.s --------------------
>> .file "a1.c"
>> .text
>> .p2align 4,,15
>> .globl main
>> .type main, @function
>> main:
>> leal 4(%esp), %ecx
>> andl $-16, %esp
>> pushl -4(%ecx)
>> addl %eax, %eax
>> cmpl $128, %eax         <--- cmpl puts result in condition
                                codes directly where we want it
>> sbbl %edx, %edx         <--- same masking stuff ...
>> andl %edx, %eax
>> pushl %ebp
>> movl %esp, %ebp
>> pushl %ecx
>> popl %ecx
>> popl %ebp
>> leal -4(%ecx), %esp
>> ret
>> .size main, .-main
>> .ident "GCC: (GNU) 4.2.1 20070719 [FreeBSD]"
>
> Your example is invalid. The value of c is undefined in this function and you
> see random garbage as result (for example in the code snippet you see the c *
> 2 (addl %eax, %eax) and after that is the cmpl, which uses %eax, too). In
> fact it would be perfectly legal for the compiler to always return 0, call
> abort(), or let demons fly out of your nose.
However, the uninitialized c = %eax seems to be transformed correctly in
both cases. The first case even preserves %eax from the andl.
>
> Also the example is still unrealistic: You usually don't multiply chars by
> two. Let's try something more realistic: an ASCII filter
Indeed.
Bruce