[Bug 257972] collating sequence not sensible in some locales
Date: Fri, 20 Aug 2021 14:13:54 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257972

            Bug ID: 257972
           Summary: collating sequence not sensible in some locales
           Product: Base System
           Version: 13.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: standards
          Assignee: standards@FreeBSD.org
          Reporter: freebsd@oldach.net

As discussed in
https://lists.freebsd.org/archives/freebsd-stable/2021-August/000193.html

> > # uname -a
> > FreeBSD 13STABLE 13.0-STABLE FreeBSD 13.0-STABLE #49 stable/13-n246779-64085efb677-dirty: Mon Aug 16 08:42:53 CEST 2021 root@XXX amd64
> > # export LANG=en_US.ISO8859-1
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > bla
> > Bla
>
> This one is unexpected, the upper case should be a range of its own
> and should not include any lower case letters.

> > # export LANG=en_US.UTF-8
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > Bla
>
> Here I had expected the result you got with en_US.ISO8859-1 ...

> > For comparison, a Linux RHEL box delivers the expected results:
> >
> > # uname -a
> > Linux rhel.local 3.10.0-1062.9.1.el7.x86_64 #1 SMP Mon Dec 2 08:31:54 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> > # export LANG=en_US.ISO8859-1
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > Bla
> > # export LANG=en_US.UTF-8
> > # (echo bla; echo Bla) | grep '[A-Z]'
> > Bla
>
> Seems that this version uses a POSIX style collating sequence for UTF-8.
> Definitely a bug in the definition of the collating sequences.
>
> And I have just verified that de_DE.ISO8859-1 wrongly considers "ö"
> to be within [a-z], while de_DE.UTF-8 does not (but should).
>
> Seems that the correct collating sequences for ISO8859-1 and UTF-8 are
> each assigned to the other one.

Can some knowledgeable person please validate?

-- 
You are receiving this mail because:
You are the assignee for the bug.
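For anyone trying to reproduce or work around this, here is a minimal sketch of two locale-independent variants of the test above. It is not part of the original report. It assumes a POSIX sh and grep, and relies on the general rule that range expressions such as [A-Z] are only well defined in the POSIX (C) locale, whereas character classes such as [[:upper:]] are taken from LC_CTYPE rather than from the collating sequence. The variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: variable names are hypothetical, not from the bug report.

# Variant 1: force the C locale for this one command, so that [A-Z]
# is evaluated by byte value and covers only the ASCII upper-case letters.
range_c=$(printf 'bla\nBla\n' | LC_ALL=C grep '[A-Z]')

# Variant 2: use a character class instead of a range. The class comes
# from LC_CTYPE, so it should not depend on the collating sequence of
# whatever locale is currently in effect.
class_any=$(printf 'bla\nBla\n' | grep '[[:upper:]]')

printf 'range, C locale:       %s\n' "$range_c"
printf 'class, current locale: %s\n' "$class_any"
```

Both variants should print only "Bla". If the second one also prints "bla" in some locale, that would point at the character classification rather than the collating sequence, which is a different problem from the one reported here.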