cvs commit: src/usr.bin/tr Makefile cmap.c cmap.h cset.c cset.h extern.h str.c tr.c
Tim J. Robbins
tjr at FreeBSD.org
Thu Jul 8 19:08:07 PDT 2004
tjr 2004-07-09 02:08:07 UTC
FreeBSD src repository
Modified files:
usr.bin/tr Makefile extern.h str.c tr.c
Added files:
usr.bin/tr cmap.c cmap.h cset.c cset.h
Log:
Add support for multibyte characters. The challenge here was to use
data structures that scale better with large character sets, instead of
arrays indexed by character value:
- Sets of characters to delete/squeeze are stored in a new "cset" structure,
  which is implemented as a splay tree of extents. This structure has the
  ability to store character classes (a la wctype(3)), but this is not
  currently fully utilized.
- Mappings between characters are stored in a new "cmap" structure, which
  is also a splay tree.
- The parser no longer builds arrays containing all the characters in a
  particular class; instead, next() determines them on the fly using
  nextwctype(3).
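
To illustrate why extents scale better than an array indexed by character value, here is a minimal sketch of a set of wide characters stored as inclusive ranges. It is not the code from cset.c: the names (struct extent, struct extset, extset_add, extset_in) are hypothetical, and a sorted array with binary search stands in for the splay tree to keep the example short. The point is only that memory grows with the number of ranges, not with the size of the character set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical names throughout; not the identifiers used in cset.c. */
struct extent {
	wchar_t min;		/* first character of the range (inclusive) */
	wchar_t max;		/* last character of the range (inclusive) */
};

struct extset {
	struct extent *v;	/* sorted, non-overlapping, non-adjacent */
	size_t n;		/* extents in use */
	size_t cap;		/* extents allocated */
};

/* Add the inclusive range [lo, hi]; returns false if allocation fails. */
static bool
extset_add(struct extset *s, wchar_t lo, wchar_t hi)
{
	size_t i, j, newn;

	assert(lo <= hi);

	/* Skip extents that end too early to touch or adjoin [lo, hi]. */
	for (i = 0; i < s->n && (long long)s->v[i].max + 1 < lo; i++)
		;
	/* Absorb every extent that overlaps or adjoins the new range. */
	for (j = i; j < s->n && (long long)s->v[j].min <= (long long)hi + 1; j++) {
		if (s->v[j].min < lo)
			lo = s->v[j].min;
		if (s->v[j].max > hi)
			hi = s->v[j].max;
	}

	/* Extents i..j-1 collapse into a single merged extent. */
	newn = s->n - (j - i) + 1;
	if (newn > s->cap) {
		size_t newcap = s->cap ? s->cap * 2 : 8;
		struct extent *nv = realloc(s->v, newcap * sizeof(*nv));

		if (nv == NULL)
			return false;
		s->v = nv;
		s->cap = newcap;
	}
	memmove(&s->v[i + 1], &s->v[j], (s->n - j) * sizeof(*s->v));
	s->v[i].min = lo;
	s->v[i].max = hi;
	s->n = newn;
	return true;
}

/* Binary search for the extent containing ch. */
static bool
extset_in(const struct extset *s, wchar_t ch)
{
	size_t lo = 0, hi = s->n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ch < s->v[mid].min)
			hi = mid;
		else if (ch > s->v[mid].max)
			lo = mid + 1;
		else
			return true;
	}
	return false;
}
```

With this layout, a range such as a-z costs one extent regardless of how many characters it spans, and adding an adjacent or overlapping range merges extents instead of growing the set. A splay tree, as the log describes, additionally keeps recently looked-up extents near the root, which would suit the repetitive lookups tr performs on typical input.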
Revision  Changes      Path
1.2       +2 -1        src/usr.bin/tr/Makefile
1.1       +212 -0      src/usr.bin/tr/cmap.c (new)
1.1       +83 -0       src/usr.bin/tr/cmap.h (new)
1.1       +303 -0      src/usr.bin/tr/cset.c (new)
1.1       +75 -0       src/usr.bin/tr/cset.h (new)
1.9       +11 -10      src/usr.bin/tr/extern.h
1.23      +78 -87      src/usr.bin/tr/str.c
1.22      +116 -102    src/usr.bin/tr/tr.c