Why no non-Latin TODIGIT mappings in UTF-8.src?
Wolfgang Zenker
wolfgang at lyxys.ka.sub.org
Mon May 28 12:34:59 UTC 2007
* Andrey Chernov <ache at freebsd.org> [070528 13:52]:
> On Mon, May 28, 2007 at 10:46:59AM +0200, Wolfgang Zenker wrote:
>> Looking at our UTF-8.src, I see
>> $ grep DIGIT UTF-8.src
>> DIGIT '0' - '9'
>> XDIGIT '0' - '9' 'A' - 'F' 'a' - 'f'
>> TODIGIT < '0' - '9' : 0x0000 >
>> TODIGIT < 'A' - 'F' : 10 > < 'a' - 'f' : 10 >
>> It appears to me that isdigit() behaviour is controlled by the DIGIT
>> keyword, not TODIGIT. However, I do admit that I don't understand completely
>> how locale files are supposed to work. So where does e.g. iswdigit() get
>> its character class information from, should that not be in the locale
>> information as well somewhere?
> There is no POSIX function to extract TODIGIT info, so it is useless for
> now.
OK, so the mklocale src files that DO provide additional TODIGIT mappings
(e.g. am_ET.UTF-8.src or ja_JP.SJIS.src) do so just to be prepared
for the day we can use them?
> todigit() is an SCO extension and its manpage says:
> The macro todigit returns the digit character corresponding to its integer
> argument. The argument must be in the range 0-9, otherwise the behavior is
> undefined.
> iswdigit() has the same 0-9 restriction as isdigit(); it just accepts wint_t.
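To make the 0-9 restriction concrete, here is a minimal sketch that counts the characters of a wide string that iswdigit() accepts. The helper name count_wide_digits is made up for illustration and is not part of libc. Per the restriction described above, a non-Latin digit such as U+0661 would not be expected to count, though that is worth checking on the platform at hand.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not a libc function): count the wide characters
 * in s that iswdigit() classifies as decimal digits.
 */
static size_t
count_wide_digits(const wchar_t *s)
{
    size_t n = 0;

    for (; *s != L'\0'; s++) {
        if (iswdigit((wint_t)*s))
            n++;
    }
    return n;
}
```

For example, count_wide_digits(L"a1b22") counts the three ASCII digits in the string.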
I had imagined that TODIGIT would be used for a locale-aware version of
digittoint(3) or something like that. What would be a good place to read
up about how much can be localised with locales and how much of it we
currently (and maybe in the near future) support?
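As a rough sketch of what I mean by a locale-aware digittoint(3): the function below maps a code point to its digit value using a small hardcoded table. Both the name wdigit_value and the table are assumptions made up for illustration. A real implementation would presumably read the TODIGIT data from the locale rather than hardcode a few Unicode ranges.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical table: first code point of a few runs of ten
 * consecutive Unicode decimal digits. Illustration only; this is
 * not how libc stores TODIGIT information.
 */
static const uint32_t digit_zero[] = {
    0x0030, /* ASCII '0'..'9' */
    0x0660, /* ARABIC-INDIC DIGIT ZERO..NINE */
    0x0966, /* DEVANAGARI DIGIT ZERO..NINE */
    0xFF10, /* FULLWIDTH DIGIT ZERO..NINE */
};

/* Return 0..9 if cp falls in one of the runs above, -1 otherwise. */
static int
wdigit_value(uint32_t cp)
{
    size_t i;

    for (i = 0; i < sizeof(digit_zero) / sizeof(digit_zero[0]); i++) {
        if (cp >= digit_zero[i] && cp <= digit_zero[i] + 9)
            return (int)(cp - digit_zero[i]);
    }
    return -1;
}
```

With this table, wdigit_value(0x0663) yields 3 and wdigit_value('A') yields -1, since hexadecimal letters are deliberately left out of the sketch.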
Wolfgang