svn commit: r250215 - stable/9/lib/libc/locale
Sergey Kandaurov
pluknet at freebsd.org
Sat May 4 12:03:17 UTC 2013
On 4 May 2013 15:14, Andrey Chernov <ache at freebsd.org> wrote:
> On 04.05.2013 0:48, Sergey Kandaurov wrote:
>> On 3 May 2013 23:55, Jilles Tjoelker <jilles at stack.nl> wrote:
>>> Some sort of perfect hashing can also be an option, although it makes it
>>> harder to add new properties or adds a build dependency on gperf(1) that
>>> we would like to get rid of.
>> I hacked a bit on wctype. Speaking about speed, it shows about 1-3.5x
>> improvement over the previous fast version (before r250215).
>>
>> Time spent on 2097152 wctype() calls for each wctype property
>>            current      previous     mine
>> alnum      0.090554676  0.035821210  0.033270579
>> alpha      0.172074310  0.052461036  0.044916572
>> blank      0.261109989  0.055735281  0.036682745
>> cntrl      0.357318986  0.069249831  0.038292782
>> digit      0.436381530  0.094194364  0.039249005
>> graph      0.540954812  0.085580099  0.043331460
>> lower      0.618306476  0.095665215  0.044070399
>> print      0.707443135  0.132559305  0.048216097
>> punct      0.788922052  0.142809109  0.062871432
>> space      0.888263108  0.150516644  0.054086142
>> upper      0.966903461  0.173593592  0.054027834
>> xdigit     0.406611275  0.201614227  0.060695939
>> ideogram   0.439763499  0.239640723  0.068566486
>> special    0.523128094  0.249156298  0.099278051
>> phonogram  0.564975870  0.260972651  0.135751471
>> rune       0.637392247  0.235195497  0.064093971
>>
>> Index: locale/wctype.c
>> ===================================================================
>> --- locale/wctype.c (revision 250217)
>> +++ locale/wctype.c (working copy)
>> @@ -74,6 +74,9 @@
>> "special\0" /* BSD extension */
>> "phonogram\0" /* BSD extension */
>> "rune\0"; /* BSD extension */
>> + static const size_t propnamlen[] = {
>> + 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 8, 7, 9, 4, 0
>> + };
>> static const wctype_t propmasks[] = {
>> _CTYPE_A|_CTYPE_D,
>> _CTYPE_A,
>> @@ -92,16 +95,17 @@
>> _CTYPE_Q,
>> 0xFFFFFF00L
>> };
>> - size_t len1, len2;
>> + const size_t *len2;
>> const char *p;
>> const wctype_t *q;
>>
>> - len1 = strlen(property);
>> q = propmasks;
>> - for (p = propnames; (len2 = strlen(p)) != 0; p += len2 + 1) {
>> - if (len1 == len2 && memcmp(property, p, len1) == 0)
>> + len2 = propnamlen;
>> + for (p = propnames; *len2 != 0; ) {
>> + if (property[0] == p[0] && strcmp(property, p) == 0)
>> return (*q);
>> - q++;
>> + p += *len2 + 1;
>> + q++; len2++;
>> }
>>
>> return (0UL);
>>
[...]
>
> BTW, I haven't run tests or looked at the asm code to be sure, but it seems
> property[0] == p[0] is unneeded because almost every compiler tries to
> inline strcmp().
It doesn't seem to be inlined here, see below.
Apparently property[0] == p[0] is cheaper than a strcmp() call for
rejecting names that do not match.
Removing this condition brings the performance numbers back to those in
the "previous" column.
Looking at the asm:
# property[0] == p[0]
4d: 44 3a 75 00 cmp 0x0(%rbp),%r14b
51: 75 dd jne 30 <wctype_l+0x30>
# strcmp()
53: 48 89 ee mov %rbp,%rsi
56: 4c 89 ff mov %r15,%rdi
59: e8 00 00 00 00 callq 5e <wctype_l+0x5e>
5e: 85 c0 test %eax,%eax
60: 75 ce jne 30 <wctype_l+0x30>
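
For reference, here is a standalone sketch of the lookup from the patch,
cut down to three properties so it can be compiled outside libc. The
mask values and the names lookup_mask() and check_tables() are made up
for illustration; they are not the real _CTYPE_* bits or libc symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mask values for illustration; not the real _CTYPE_* bits. */
#define MASK_ALNUM  0x1UL
#define MASK_XDIGIT 0x2UL
#define MASK_RUNE   0x4UL

/* Names packed into one string, each terminated by '\0'. */
static const char names[] =
    "alnum\0"
    "xdigit\0"
    "rune\0";

/* Length of each name, in the same order; a 0 entry ends the table. */
static const size_t namelen[] = { 5, 6, 4, 0 };

static const unsigned long masks[] = { MASK_ALNUM, MASK_XDIGIT, MASK_RUNE };

unsigned long
lookup_mask(const char *property)
{
	const char *p = names;
	const size_t *len = namelen;
	const unsigned long *q = masks;

	for (; *len != 0; p += *len + 1, len++, q++) {
		/* Cheap first-byte test rejects most names before strcmp(). */
		if (property[0] == p[0] && strcmp(property, p) == 0)
			return (*q);
	}
	return (0UL);
}

/* Sanity check: the hand-maintained lengths must match the packed names. */
void
check_tables(void)
{
	const char *p = names;
	const size_t *len;

	for (len = namelen; *len != 0; p += *len + 1, len++)
		assert(strlen(p) == *len);
}
```

The length table has to be kept in sync with the names by hand, which
is why the sketch carries a check_tables() helper; whether such a check
is wanted in libc itself is a separate question.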
--
wbr,
pluknet