Re: find(1): I18N gone wild ?
- In reply to: Ronald Klop : "Re: find(1): I18N gone wild ?"
Date: Fri, 21 Apr 2023 10:38:05 UTC
On 21 Apr 2023, at 12:01, Ronald Klop <ronald-lists@klop.ws> wrote:
> From: Poul-Henning Kamp <phk@phk.freebsd.dk>
> Date: Monday, 17 April 2023 23:06
> To: current@freebsd.org
> Subject: find(1): I18N gone wild ?
>
> This surprised me:
>
> # mkdir /tmp/P
> # cd /tmp/P
> # touch FOO
> # touch bar
> # env LANG=C.UTF-8 find . -name '[A-Z]*' -print
> ./FOO
> # env LANG=en_US.UTF-8 find . -name '[A-Z]*' -print
> ./FOO
> ./bar
>
> Really ?! ...
> My Mac and a Linux server only give ./FOO in both cases. Just a 2 cents remark.

Same here. However, I have read that with Unicode, you should *never* use [A-Z] or [0-9], but character classes instead.

That seems to give both files on macOS and Linux with [[:alpha:]]:

$ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
./BAR
./foo

and only the lowercase file with [[:lower:]]:

$ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
./foo

But on FreeBSD, these don't work at all:

$ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
<nothing>
$ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
<nothing>

This is an interesting rabbit hole... :)

-Dimitry