Date: Fri, 22 Nov 2024 04:53:09 GMT
Message-Id: <202411220453.4AM4r9t3034471@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Kyle Evans
Subject: git: 8b30c13fd56d - stable/14 - localedata: add some exceptions to utf8proc widths
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
X-BeenThere: dev-commits-src-all@freebsd.org
Sender: owner-dev-commits-src-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: kevans
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/14
X-Git-Reftype: branch
X-Git-Commit: 8b30c13fd56decc3a416e217349de3ecc95fe3db
Auto-Submitted: auto-generated

The branch stable/14 has been updated by kevans:

URL: https://cgit.FreeBSD.org/src/commit/?id=8b30c13fd56decc3a416e217349de3ecc95fe3db

commit 8b30c13fd56decc3a416e217349de3ecc95fe3db
Author:     Kyle Evans
AuthorDate: 2024-11-13 22:12:42 +0000
Commit:     Kyle Evans
CommitDate: 2024-11-22 04:52:02 +0000

    localedata: add some exceptions to utf8proc widths

    Hangul Jamo medial vowels and final consonants are reportedly combining
    characters that won't take up any columns on their own and should be
    reported as zero-width, so add an exception for these as well to reflect
    how they work in practice.

    This conforms to how other implementations (e.g., glibc) treat these
    characters.

    Reviewed by:    bapt (earlier version), jkim
    Sponsored by:   Klara, Inc.

    (cherry picked from commit 160c36eae41afa3c4944ed44778c2b48db8fbb77)
---
 tools/tools/locale/tools/getwidths.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/tools/locale/tools/getwidths.c b/tools/tools/locale/tools/getwidths.c
index 2790b8031912..63c62791253f 100644
--- a/tools/tools/locale/tools/getwidths.c
+++ b/tools/tools/locale/tools/getwidths.c
@@ -28,6 +28,21 @@
 
 #include
 
+static int
+width_of(int32_t wc)
+{
+
+	/*
+	 * Hangul Jamo medial vowels and final consonants are more of
+	 * a combining character, and should be considered zero-width.
+	 */
+	if (wc >= 0x1160 && wc <= 0x11ff)
+		return (0);
+
+	/* No override by default, trust utf8proc's width. */
+	return (utf8proc_charwidth(wc));
+}
+
 int
 main(void)
 {
@@ -43,9 +58,10 @@ main(void)
 		wcc = utf8proc_category(wc);
 		if (wcc == UTF8PROC_CATEGORY_CC)
 			continue;
-		wcw = utf8proc_charwidth(wc);
+		wcw = width_of(wc);
 		if (wcw == 1)
 			continue;
+
 		printf("%04X %d\n", wc, wcw);
 	}
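
The override in width_of() can be exercised without linking against utf8proc. The following is a minimal sketch, not part of the commit: stub_charwidth() and width_with_override() are hypothetical names introduced here for illustration, and the stub simply reports a width of 1 for every code point, which is not what utf8proc would return for wide characters. Only the range check is taken from the diff. In the Hangul Jamo block, U+1160 through U+11A7 are the medial vowels and U+11A8 through U+11FF are the final consonants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for utf8proc_charwidth(): always reports 1.
 * Assumption for illustration only; real widths come from utf8proc.
 */
static int
stub_charwidth(int32_t wc)
{

	(void)wc;
	return (1);
}

/*
 * Same range check as width_of() in the diff, with the fallback width
 * function passed in so the exception can be tested in isolation.
 */
static int
width_with_override(int32_t wc, int (*fallback)(int32_t))
{

	assert(fallback != NULL);

	/* Hangul Jamo medial vowels and final consonants: zero-width. */
	if (wc >= 0x1160 && wc <= 0x11ff)
		return (0);

	/* Everything else defers to the fallback. */
	return (fallback(wc));
}
```

Calling width_with_override(0x1161, stub_charwidth) yields 0, while a leading consonant such as 0x1100 falls outside the range and is passed to the fallback unchanged. Passing the fallback as a function pointer is a choice made for this sketch; the committed code calls utf8proc_charwidth() directly.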