cvs commit: src/include _ctype.h
Christoph Mallon
christoph.mallon at gmx.de
Wed Oct 31 19:45:11 PDT 2007
Andrey Chernov wrote:
> On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
>> * "Andrey A. Chernov" <ache at FreeBSD.org> [ 2007-10-27 ]
>> [ cvs commit: src/include _ctype.h ]
>>> ache 2007-10-27 22:32:28 UTC
>>>
>>> FreeBSD src repository
>>>
>>> Modified files:
>>> include _ctype.h
>>> Log:
>>> Micro-optimization of prev. commit, change
>>> (_c < 0 || _c >= 128) to (_c & ~0x7F)
>> Isn't that a non-optimization in code and a minor pessimization of readability?
>> Maybe I'm getting rusty, but those seem to result in nearly identical code on
>> i386 with a relatively modern GCC. Did you look at the compiler output for this
>> optimization? Is there a specific expensive instruction you're trying to avoid?
>> For such thoroughly bit-aligned range checks, you shouldn't even get a branch
>> for the former case. Is there a platform other than i386 I should look at where
>> the previous expression is more clearly pessimized? Or a different compiler
>> than GCC?
>
> For those who doubt, here are two tests compiled with -O2. As you can see,
> the result is almost identical (andl vs cmpl):
> -------------------- a.c --------------------
> main () {
>
> int c;
>
> return (c & ~0x7f) ? 0 : c * 2;
> }
> -------------------- a.s --------------------
> .file "a.c"
> .text
> .p2align 4,,15
> .globl main
> .type main, @function
> main:
> leal 4(%esp), %ecx
> andl $-16, %esp
> pushl -4(%ecx)
> movl %eax, %edx
> andl $-128, %edx
> addl %eax, %eax
> cmpl $1, %edx
> sbbl %edx, %edx
> pushl %ebp
> andl %edx, %eax
> movl %esp, %ebp
> pushl %ecx
> popl %ecx
> popl %ebp
> leal -4(%ecx), %esp
> ret
> .size main, .-main
> .ident "GCC: (GNU) 4.2.1 20070719 [FreeBSD]"
> -------------------- a1.c --------------------
> main () {
>
> int c;
>
> return (c < 0 || c >= 128) ? 0 : c * 2;
>
>
> }
> -------------------- a1.s --------------------
> .file "a1.c"
> .text
> .p2align 4,,15
> .globl main
> .type main, @function
> main:
> leal 4(%esp), %ecx
> andl $-16, %esp
> pushl -4(%ecx)
> addl %eax, %eax
> cmpl $128, %eax
> sbbl %edx, %edx
> andl %edx, %eax
> pushl %ebp
> movl %esp, %ebp
> pushl %ecx
> popl %ecx
> popl %ebp
> leal -4(%ecx), %esp
> ret
> .size main, .-main
> .ident "GCC: (GNU) 4.2.1 20070719 [FreeBSD]"
Your example is invalid. The value of c is undefined in this function,
so what you see as the result is random garbage (for example, in a1.s
the c * 2 (addl %eax, %eax) comes first and the cmpl, which also uses
%eax, comes after it). In fact, it would be perfectly legal for the
compiler to always return 0, call abort(), or let demons fly out of your nose.
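A minimal sketch of how the two tests could be made well-defined, assuming the value is passed in as a parameter instead of being read from an uninitialized local. The names test_mask and test_range are hypothetical and do not appear in the original test files:

```c
/* Hypothetical rewrite of a.c and a1.c: c is a parameter, so its value
 * is defined and the compiler has to honor the condition. */
int test_mask(int c)
{
    return (c & ~0x7f) ? 0 : c * 2;
}

int test_range(int c)
{
    return (c < 0 || c >= 128) ? 0 : c * 2;
}
```

Compiling these with -O2 -S should give output in which the comparison really operates on the incoming argument.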
Also, the example is still unrealistic: you usually don't multiply chars
by two. Let's try something more realistic, an ASCII filter:
int filter_ascii0(int c)
{
    return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
    return c & ~0x7F ? '?' : c;
}
Note especially that c is not dead after the condition. Even if your
example did not use an undefined value, the value of c would be dead after
the test, which is unlikely in typical string handling code.
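To illustrate what "typical string handling code" could look like here, a sketch of a loop in which each character is tested and then used again. The helper name sanitize_ascii is hypothetical and only serves as an illustration:

```c
#include <stddef.h>

/* Hypothetical helper: replace every non-ASCII byte in buf with '?'.
 * Each value is tested and, if it passes, stored back, so it stays
 * live after the condition. */
void sanitize_ascii(unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        int c = buf[i];

        buf[i] = (unsigned char)((c & ~0x7F) ? '?' : c);
    }
}
```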
And now the compiled code (GCC 3.4.6 with -O2 -march=athlon-xp
-fomit-frame-pointer; I used these switches to get more compact code.
They have no influence on the condition test.):
00000000 <filter_ascii0>:
0: 8b 54 24 04 mov 0x4(%esp),%edx
4: b8 3f 00 00 00 mov $0x3f,%eax
9: 83 fa 7f cmp $0x7f,%edx
c: 0f 46 c2 cmovbe %edx,%eax
f: c3 ret
00000010 <filter_ascii1>:
10: 8b 54 24 04 mov 0x4(%esp),%edx
14: b8 3f 00 00 00 mov $0x3f,%eax
19: f7 c2 80 ff ff ff test $0xffffff80,%edx
1f: 0f 44 c2 cmove %edx,%eax
22: c3 ret
As you can see, a test instruction is used in filter_ascii1, because the
value in %edx does not die at the test but is used again in the cmove.
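The single cmp/cmovbe pair in filter_ascii0 corresponds to an unsigned comparison: converted to unsigned, every negative value ends up above 0x7F, so both halves of the range check collapse into one test. A sketch of the three spellings in C, plus a brute-force check that they agree. The mask variant assumes a two's-complement int, and all function names here are hypothetical:

```c
/* Three ways to spell "0 <= c < 128". Names are hypothetical. */
int in_ascii_range(int c)    { return !(c < 0 || c >= 128); }
int in_ascii_unsigned(int c) { return (unsigned)c <= 0x7Fu; }
int in_ascii_mask(int c)     { return !(c & ~0x7F); }

/* Count the values in [lo, hi] on which the three forms disagree.
 * Requires lo <= hi; the loop exits before c could overflow. */
long count_mismatches(int lo, int hi)
{
    long bad = 0;
    int c;

    for (c = lo; ; c++) {
        int r = in_ascii_range(c);

        if (r != in_ascii_unsigned(c) || r != in_ascii_mask(c))
            bad++;
        if (c == hi)
            break;
    }
    return bad;
}
```

On a two's-complement machine, count_mismatches(-100000, 100000) should return 0.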
Christoph