svn commit: r303746 - head/usr.bin/indent
Piotr Stefaniak
email at piotr-stefaniak.me
Sun Aug 7 08:20:40 UTC 2016
On 2016-08-06 22:18, Pedro Giffuni wrote:
> On 06/08/2016 15:13, Ed Schouten wrote:
>> 2016-08-04 17:27 GMT+02:00 Pedro F. Giffuni <pfg at freebsd.org>:
>>> Log:
>>> indent(1): Use bsearch() for looking up type keywords.
>> You're never doing any deletions, right? Would it make more sense to
>> use hsearch_r() in that case?
>>
> Indeed, good idea, although it may be an issue if portability to Windows
> is desired.
I think I could have done better job describing the reasons behind that
commit, because there are two of those.
The main thing is that I wanted to get rid of arbitrary limit of 16384
"specials" (keywords, including typedef names) -- and the
malloc/realloc/bsearch based implementation of looking up type keywords
solved that, while being relatively simple and easy to follow.
It was also very nice to notice a significant performance improvement
since I would re-indent whole src/ directory of the PostgreSQL project
many times while testing changes I made. But it's not that important
after all, because PostgreSQL sources get re-indented very rarely.
Having said that, I felt compelled to compare performance of bsearch()
to that of hsearch().
First thing to say here is that I did the testing on a Linux+glibc
platform, because 1) that's my main development platform, and 2) it's
probably the only one where anybody re-indents hundreds of thousands of
lines of code with indent(1) in one go, often, and multiple times a day
-- and therefore can actually notice the difference.
Secondly, while hsearch_r() being a GNU extension is an argument against
using it, in this particular case there would be no advantage to be
gained by preferring the re-entrant version, I believe -- which renders
the lower portability argument moot. So what I tested was hsearch().
And my implementation of hsearch() based typedef name lookup (diff
attached) wasn't significantly faster than bsearch(). Obviously I tested
the glibc version and perhaps FreeBSD hsearch() is faster, but I don't
expect it to be significantly faster.
The most important conclusion, though, is that at least the glibc
implementation imposes an arbitrary limit on how many items you can add
to the data structure:
The argument nel specifies the maximum number of entries in the table.
(This maximum cannot be changed later, so choose it wisely.)
Adding new items to the data structure (or searching them - I didn't
investigate) actually _stopped working_ when approximately /nel/ number
of elements was reached (I accidentally set it to 2000 while PostgreSQL
has nearly 3000 typedef names). I don't know if FreeBSD's implementation
does the same, but it's a show-stopper to me anyway, so personally I
won't be pursuing this idea of replacing bsearch() with hsearch().
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lexi.c.diff
Type: text/x-patch
Size: 2121 bytes
Desc: lexi.c.diff
URL: <http://lists.freebsd.org/pipermail/svn-src-head/attachments/20160807/f038d87c/attachment.bin>
More information about the svn-src-head
mailing list