GSoC proposal: Quirinus C library (qc)
Dmitry Selyutin
ghostman.sd at gmail.com
Sun Mar 2 10:10:18 UTC 2014
Hi Edward,
there is no such thing as different UTF-8 encodings. If you are talking
about, e.g., the representation of accents and diacritics, there are
normalization forms, but they apply to UCS code points rather than to UTF-8
byte sequences. If you mean that the same UCS-4 code point could be encoded
as different byte sequences, only the shortest form is permitted.
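
To make both points concrete, here is a rough sketch in C. It is only an
illustration and not code from qc; the names utf8_is_well_formed,
E_ACUTE_NFC and E_ACUTE_NFD are made up for this example. The checker
rejects non-shortest ("overlong") forms, and the two constants show that the
precomposed and decomposed spellings of the same accented letter are both
well-formed UTF-8 but differ as byte strings, because they are different
code point sequences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* "e with acute" as one precomposed code point, U+00E9 (NFC). */
const char E_ACUTE_NFC[] = "\xC3\xA9";
/* The same letter as U+0065 followed by combining acute U+0301 (NFD). */
const char E_ACUTE_NFD[] = "e\xCC\x81";

/* Hypothetical helper: returns true if the len bytes at s are well-formed
 * UTF-8: shortest form only, no surrogates, nothing above U+10FFFF. */
bool utf8_is_well_formed(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i];
        size_t n;          /* number of continuation bytes expected */
        unsigned long cp;  /* code point decoded so far */
        unsigned long min; /* smallest code point allowed for this length */

        if (b < 0x80)                { n = 0; cp = b;        min = 0x0; }
        else if ((b & 0xE0) == 0xC0) { n = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { n = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { n = 3; cp = b & 0x07; min = 0x10000; }
        else return false; /* stray continuation or invalid lead byte */

        if (len - i - 1 < n)
            return false;  /* sequence is truncated */
        for (size_t k = 1; k <= n; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80)
                return false; /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min)
            return false;  /* overlong: a shorter encoding exists */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;  /* surrogates are not encodable in UTF-8 */
        if (cp > 0x10FFFF)
            return false;  /* beyond the Unicode range */
        i += n + 1;
    }
    return true;
}
```

With this sketch, the single byte 2F ("/") is accepted, while the two-byte
sequence C0 AF, which would decode to the same code point, is rejected as
overlong. Both E_ACUTE_NFC and E_ACUTE_NFD pass the check, yet strcmp()
reports them as different, so any code that compares raw bytes treats them
as two distinct names.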
Honestly, I think that UTF-8 is the only encoding that has a right to live.
Other encodings seem to be dying or already dead.
Best regards,
Dmitry Selyutin
On 02.03.2014 at 13:54, "Edward Tomasz Napierała" <trasz at freebsd.org>
wrote:
> Message written by John-Mark Gurney on 27 Feb 2014, at 19:26:
> > Dmitry Selyutin wrote this message on Thu, Feb 27, 2014 at 19:39 +0400:
> >> As for strings, I will not use UTF-16 since it provides more problems
> >> rather than solutions. If I provide a function which accepts char* or
> >> char const* argument, I imply that such function uses only ASCII (maybe
> >> I will change ASCII to UTF-8). Encoding is used only if a user has
> >> requested it explicitly; the only place where I have made exception is
> >> system path since path requires to be in UTF-16 on Windows. That is the
> >> reason why qc_path requires qc_codecs-related functions.
> >
> > You do realize that FreeBSD does not currently enforce any coding on
> > path names, correct? So, requiring a coding format on FreeBSD (UTF-16)
> > will mean some paths may not be accessible, since I assume you convert
> > the UTF-16 string to UTF-8 before opening on FreeBSD...
> >
> > Hmm.. maybe it's time for a sysctl you can set on your system that
> > only allows you to create UTF-8 valid names to allow people to slowly
> > migrate to UTF-8? and a tool to report/convert old non-UTF-8 paths?
>
> There's already a ZFS property ("utfmode") exactly for this purpose.
>
> Actually, it's funnier than that: because the kernel doesn't know anything
> about UTF-8, one can create several files with the same name, but with
> different UTF-8 encodings. And there is a ZFS property to fix this problem
> as well ("normalization").
>
>