cvs commit: src/include _ctype.h
Christoph Mallon
christoph.mallon at gmx.de
Wed Oct 31 18:52:31 PDT 2007
Andrey Chernov wrote:
> On Mon, Oct 29, 2007 at 09:48:16PM +0100, Christoph Mallon wrote:
>> Andrey A. Chernov wrote:
>>> ache 2007-10-27 22:32:28 UTC
>>> FreeBSD src repository
>>> Modified files:
>>> include _ctype.h
>>> Log:
>>> Micro-optimization of prev. commit, change
>>> (_c < 0 || _c >= 128) to (_c & ~0x7F)
>>> Revision Changes Path
>>> 1.33 +1 -1 src/include/_ctype.h
>> Actually this is rather a micro-pessimisation. Every compiler worth its
>> money transforms the range check into a single unsigned comparison. The
>> latter test, on the other hand, probably gets transformed into a test
>> instruction on x86. This instruction has no form with a sign-extended
>> 8-bit immediate, only one with a 32-bit immediate. This results in a
>> significantly longer opcode (three bytes more) than the single
>> (unsigned)_c > 127 that a sane compiler produces. I suspect some RISC
>> machines need one more instruction for the "micro-optimised" code, too.
>> In theory GCC could transform the _c & ~0x7F back into a (unsigned)_c >
>> 127, but it does not do this (the only compiler I found that does this
>> transformation is LLVM).
>> Furthermore, IMO it is hard to decipher what _c & ~0x7F is supposed to do.
>
> 1. My variant is independent of the compiler optimization level. E.g.
> with optimization turned off completely, the range check transform you
> talk about does not happen at all and very long asm code is generated. I
> also mean cases where a gcc optimization bug was avoided by disabling
> optimization (as with compiling a large part of the Xorg server
> recently), cases where non-gcc compilers are used, etc.
Compiling without any optimisations makes the code slow for a zillion
other reasons (no load/store optimisation, constant folding, common
subexpression elimination, if-conversion, partial redundancy
elimination, strength reduction, reassociation, code placement, and many
more), so an untransformed range check is really not of any concern.
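For reference, here is a rough sketch of the three forms of the test under
discussion. The function names are made up for illustration and do not
appear in _ctype.h, and the mask form assumes a two's-complement int. All
three should give the same answer for every input; only the generated code
may differ.

```c
/* Hypothetical helpers; none of these names come from _ctype.h. */

/* The explicit range check from the previous commit. */
static int by_range(int c)
{
    return c < 0 || c >= 128;
}

/* The committed mask test; relies on two's-complement int, so that
 * every negative value has at least one bit set above bit 6. */
static int by_mask(int c)
{
    return (c & ~0x7F) != 0;
}

/* The single unsigned comparison a compiler may reduce the range
 * check to: negative values wrap to large unsigned values. */
static int by_unsigned(int c)
{
    return (unsigned)c > 127;
}

/* Counts the values in [lo, hi] on which the three forms disagree.
 * hi must be smaller than INT_MAX, otherwise the loop counter
 * overflows. */
static int count_disagreements(int lo, int hi)
{
    int bad = 0;

    for (int c = lo; c <= hi; c++) {
        int r = by_range(c);

        if (by_mask(c) != r || by_unsigned(c) != r)
            bad++;
    }
    return bad;
}
```

What a particular compiler emits for each form at a given optimisation
level is best checked in its assembly output (e.g. with cc -S).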
> 2. _c & ~0x7F comes right from is{w}ascii(), so it is not such an
> enormously big problem to decipher. I just want to keep all of ctype in
> one style.
Repeating cryptic code does not make it better, IMO.
> 3. I see no "longer opcode (three bytes more)" you talk about in my tests
> (andl vs cmpl was there, no testl).
See the reply to the mail with your code example.
Christoph