Message-ID: <3e473603-f384-f176-e7cb-03409e16ec9c@aetern.org>
Date: Fri, 21 Apr 2023 20:03:30 +0200
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Subject: Re: find(1): I18N gone wild ?
To: Current FreeBSD
From: Yuri

Mark Millard wrote:
> Dimitry Andric wrote on
> Date: Fri, 21 Apr 2023 10:38:05 UTC :
>
>> On 21 Apr 2023, at 12:01, Ronald Klop wrote:
>>> From: Poul-Henning Kamp
>>> Date: Monday, 17 April 2023 23:06
>>> To: current@freebsd.org
>>> Subject: find(1): I18N gone wild ?
>>> This surprised me:
>>>
>>> # mkdir /tmp/P
>>> # cd /tmp/P
>>> # touch FOO
>>> # touch bar
>>> # env LANG=C.UTF-8 find . -name '[A-Z]*' -print
>>> ./FOO
>>> # env LANG=en_US.UTF-8 find . -name '[A-Z]*' -print
>>> ./FOO
>>> ./bar
>>>
>>> Really ?!
>> ...
>>> My Mac and a Linux server only give ./FOO in both cases. Just a 2 cents remark.
>>
>> Same here. However, I have read that with unicode, you should *never*
>> use [A-Z] or [0-9], but character classes instead. That seems to give
>> both files on macOS and Linux with [[:alpha:]]:
>>
>> $ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
>> ./BAR
>> ./foo
>>
>> and only the lowercase file with [[:lower:]]:
>>
>> $ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
>> ./foo
>>
>> But on FreeBSD, these don't work at all:
>>
>> $ LANG=en_US.UTF-8 find . -name '[[:alpha:]]*' -print
>> 
>>
>> $ LANG=en_US.UTF-8 find . -name '[[:lower:]]*' -print
>> 
>>
>> This is an interesting rabbit hole... :)
>
> FreeBSD:
>
>      -name pattern
>              True if the last component of the pathname being examined matches
>              pattern. Special shell pattern matching characters (“[”, “]”,
>              “*”, and “?”) may be used as part of pattern. These characters
>              may be matched explicitly by escaping them with a backslash
>              (“\”).
>
> I conclude that [[:alpha:]] and [[:lower:]] were not
> considered "Special shell pattern"s. "man glob"
> indicates it is a shell specific builtin.
>
> macOS says similarly. Different shells, different
> pattern notations and capabilities? Well, "man bash"
> reports:

[snip]

> Seems like: pick your shell (as shown by echo $SHELL) and
> that picks the pattern match rules used. (May be controllable
> in the specific shell.)

No, the pattern is not passed to the shell, and the shell in use should
not matter (as long as the pattern is properly quoted or escaped).
The rules are here:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13

...which in turn refers to the following link for bracket expressions:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05

Why we don't support all of that is a different story.
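
To make the quoting point concrete, here is a minimal sh sketch, not a definitive test. It assumes a POSIX sh plus mktemp(1), and it pins LC_ALL=C (my assumption, so that [A-Z] is a plain ASCII range and the result does not depend on the collation issue discussed in this thread). The file names FOO and bar are borrowed from the original report; the variable names are only illustrative.

```sh
#!/bin/sh
# Assumption: force the C locale so that [A-Z] is a plain ASCII range.
LC_ALL=C
export LC_ALL

# Scratch directory holding the two files from the original report.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
touch FOO bar

# Quoted: the shell hands the literal string [A-Z]* to the command.
quoted_arg=$(printf '%s\n' '[A-Z]*')

# Unquoted: the shell expands the pattern against the current
# directory first, so the command never sees the pattern itself.
unquoted_arg=$(printf '%s\n' [A-Z]*)

# With the pattern quoted, find(1) does the matching on its own.
matches=$(find . -name '[A-Z]*' -print)

printf 'quoted argument:   %s\n' "$quoted_arg"
printf 'unquoted argument: %s\n' "$unquoted_arg"
printf 'find output:       %s\n' "$matches"
```

With the pattern quoted, find receives [A-Z]* verbatim and does the matching itself, so which shell launched it plays no part in the result. Unquoted, the shell has already replaced the pattern with FOO before find starts.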