[Bug 243229] awk length() function in base system produces incorrect results for UTF-8 strings
Date: Wed, 19 May 2021 14:59:46 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243229

Frédéric Fauberteau <triaxx@NetBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |triaxx@NetBSD.org

--- Comment #2 from Frédéric Fauberteau <triaxx@NetBSD.org> ---
I don't know if this issue is related to that bug report, but the following
command prints 'bin':

    % echo "bin" | LANG=en_US awk '$1 ~ /^[\t -~]/ {print $0}'

while this one prints nothing:

    % echo "bin" | LANG=en_US.UTF-8 awk '$1 ~ /^[\t -~]/ {print $0}'

The range from ' ' to '~' includes alphabetical characters when the locale is
not UTF-8, but does not when the locale is UTF-8. Note, however, that
'/^[\t -~]/' does match "bin" with C.UTF-8.
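
The locale dependence described above can be avoided by pinning the locale for the awk invocation. The following is a minimal sketch, not something proposed in the bug report: it assumes that forcing LC_ALL=C is acceptable for this filter, since in the C locale a bracket range such as ' ' to '~' follows ASCII byte order. The helper name match_printable_start is hypothetical.

```shell
#!/bin/sh
# Sketch only: run the same pattern with the locale forced to C, so the
# range ' '-'~' is evaluated in ASCII order regardless of the caller's LANG.
# Assumption: the filter only cares about ASCII input.
match_printable_start() {
    # Print each input line whose first field begins with a tab or a
    # printable ASCII character (space through tilde).
    LC_ALL=C awk '$1 ~ /^[\t -~]/ {print $0}'
}

echo "bin" | match_printable_start
```

LC_ALL takes precedence over LANG, so the caller's LANG setting (en_US or en_US.UTF-8) no longer affects how the range is interpreted.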