svn commit: r250215 - stable/9/lib/libc/locale

Sergey Kandaurov pluknet at freebsd.org
Sat May 4 12:03:17 UTC 2013


On 4 May 2013 15:14, Andrey Chernov <ache at freebsd.org> wrote:
> On 04.05.2013 0:48, Sergey Kandaurov wrote:
>> On 3 May 2013 23:55, Jilles Tjoelker <jilles at stack.nl> wrote:
>>> Some sort of perfect hashing can also be an option, although it makes it
>>> harder to add new properties or adds a build dependency on gperf(1) that
>>> we would like to get rid of.
>> I hacked a bit on wctype. In terms of speed, it shows about a 1-3.5x
>> improvement over the previous fast version (before r250215).
>>
>> Time spent on 2097152 wctype() calls for each wctype property
>>                 current         previous        mine
>> alnum           0.090554676     0.035821210     0.033270579
>> alpha           0.172074310     0.052461036     0.044916572
>> blank           0.261109989     0.055735281     0.036682745
>> cntrl           0.357318986     0.069249831     0.038292782
>> digit           0.436381530     0.094194364     0.039249005
>> graph           0.540954812     0.085580099     0.043331460
>> lower           0.618306476     0.095665215     0.044070399
>> print           0.707443135     0.132559305     0.048216097
>> punct           0.788922052     0.142809109     0.062871432
>> space           0.888263108     0.150516644     0.054086142
>> upper           0.966903461     0.173593592     0.054027834
>> xdigit          0.406611275     0.201614227     0.060695939
>> ideogram        0.439763499     0.239640723     0.068566486
>> special         0.523128094     0.249156298     0.099278051
>> phonogram       0.564975870     0.260972651     0.135751471
>> rune            0.637392247     0.235195497     0.064093971
>>
>> Index: locale/wctype.c
>> ===================================================================
>> --- locale/wctype.c     (revision 250217)
>> +++ locale/wctype.c     (working copy)
>> @@ -74,6 +74,9 @@
>>                 "special\0"     /* BSD extension */
>>                 "phonogram\0"   /* BSD extension */
>>                 "rune\0";       /* BSD extension */
>> +       static const size_t propnamlen[] = {
>> +               5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 8, 7, 9, 4, 0
>> +       };
>>         static const wctype_t propmasks[] = {
>>                 _CTYPE_A|_CTYPE_D,
>>                 _CTYPE_A,
>> @@ -92,16 +95,17 @@
>>                 _CTYPE_Q,
>>                 0xFFFFFF00L
>>         };
>> -       size_t len1, len2;
>> +       const size_t *len2;
>>         const char *p;
>>         const wctype_t *q;
>>
>> -       len1 = strlen(property);
>>         q = propmasks;
>> -       for (p = propnames; (len2 = strlen(p)) != 0; p += len2 + 1) {
>> -               if (len1 == len2 && memcmp(property, p, len1) == 0)
>> +       len2 = propnamlen;
>> +       for (p = propnames; *len2 != 0; ) {
>> +               if (property[0] == p[0] && strcmp(property, p) == 0)
>>                         return (*q);
>> -               q++;
>> +               p += *len2 + 1;
>> +               q++; len2++;
>>         }
>>
>>         return (0UL);
>>
[...]
>
> BTW, I haven't run tests or looked at the asm code to be sure, but it seems
> property[0] == p[0] is unneeded because almost every compiler tries to
> inline strcmp().

strcmp() doesn't seem to be inlined here; see below.

Apparently the property[0] == p[0] check is cheaper than a strcmp() call
when the names do not match. Removing this condition brings the performance
numbers back to the "previous" column. Looking at the asm, strcmp() is
emitted as a real function call:

  # property[0] == p[0]
  4d:   44 3a 75 00             cmp    0x0(%rbp),%r14b
  51:   75 dd                   jne    30 <wctype_l+0x30>
  # strcmp()
  53:   48 89 ee                mov    %rbp,%rsi
  56:   4c 89 ff                mov    %r15,%rdi
  59:   e8 00 00 00 00          callq  5e <wctype_l+0x5e>
  5e:   85 c0                   test   %eax,%eax
  60:   75 ce                   jne    30 <wctype_l+0x30>
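
For illustration, here is a minimal standalone sketch of the same lookup
loop with the first-character pre-check. It is not the committed code:
prop_index() and its 1-based return value are hypothetical stand-ins for
wctype_l() and the propmasks table, chosen so the example needs no
locale headers.

```c
#include <stddef.h>
#include <string.h>

/* Property names packed into one array, each terminated by '\0'. */
static const char prop_names[] =
    "alnum\0" "alpha\0" "blank\0" "cntrl\0" "digit\0" "graph\0"
    "lower\0" "print\0" "punct\0" "space\0" "upper\0" "xdigit\0"
    "ideogram\0" "special\0" "phonogram\0" "rune\0";

/* Precomputed name lengths, terminated by 0 (mirrors propnamlen[]). */
static const size_t prop_lens[] = {
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 8, 7, 9, 4, 0
};

/*
 * Hypothetical helper: returns the 1-based position of `property` in
 * prop_names, or 0 if it is not a known name.
 */
unsigned
prop_index(const char *property)
{
    const char *p = prop_names;
    const size_t *len = prop_lens;
    unsigned idx = 1;

    for (; *len != 0; p += *len + 1, len++, idx++) {
        /*
         * Cheap one-byte compare first; strcmp() is only called
         * when the first characters already agree.
         */
        if (property[0] == p[0] && strcmp(property, p) == 0)
            return (idx);
    }
    return (0);
}
```

With this layout a lookup of "rune" calls strcmp() only once, on the
final entry, because no other name starts with 'r'; every other
mismatch is rejected by the one-byte compare.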

-- 
wbr,
pluknet

