tr(1) and LANG=de_DE.UTF-8
Matthias Apitz
guru at unixarea.de
Thu Oct 29 10:53:40 UTC 2015
Hello,
I was wondering why I could not patch a byte \357 in a file with tr(1):
[guru at kant-r269739 ~]$ od -c /tmp/x
0000000 n o n U T F - 8 \n n o n U T
0000020 F - 8 \n v a l i d U T F - 8 \n
0000040 H e l l o W o r l d ! \n v a
0000060 l i d U T F - 8 \n H e l l o
0000100 357 277 277 W o r l d ! \n
0000113
[guru at kant-r269739 ~]$ LANG=de_DE.UTF-8 tr '\357' '\000' < /tmp/x | od -c
0000000 n o n U T F - 8 \n n o n U T
0000020 F - 8 \n v a l i d U T F - 8 \n
0000040 H e l l o W o r l d ! \n v a
0000060 l i d U T F - 8 \n H e l l o
0000100 357 277 277 W o r l d ! \n
0000113
until I changed the LANG to C:
[guru at kant-r269739 ~]$ LANG=C tr '\357' '\000' < /tmp/x | od -c
0000000 n o n U T F - 8 \n n o n U T
0000020 F - 8 \n v a l i d U T F - 8 \n
0000040 H e l l o W o r l d ! \n v a
0000060 l i d U T F - 8 \n H e l l o
0000100 \0 277 277 W o r l d ! \n
0000113
I know that the tr(1) man page has a hint about LANG and environ(7),
but I would not expect this to mean that I can't change a single byte,
given as an octal value, only because \357 on its own is not a valid
UTF-8 sequence.
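
For reference, a minimal sketch of the byte-wise workaround. It assumes
that LC_ALL, which takes precedence over LC_CTYPE and LANG, is the safer
variable to set in case one of the LC_* variables is also present in the
environment. The sample bytes and the variable name "patched" are made
up for illustration and are not the real /tmp/x:

```shell
# Hypothetical sample: bytes 357 277 277, a space, then "World!\n".
# LC_ALL=C forces tr to treat its input as single bytes, regardless of
# what LANG or LC_CTYPE are set to in the calling shell.
# od also runs under LC_ALL=C so that it dumps raw bytes; the final
# tr -s only squeezes od's column padding to single spaces.
patched=$(printf '\357\277\277 World!\n' \
    | LC_ALL=C tr '\357' '\000' \
    | LC_ALL=C od -An -c \
    | tr -s ' ')
echo "$patched"
```

If this behaves like the LANG=C run above, the first byte should show up
as \0, followed by the two untouched 277 bytes.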
Any ideas/comments on this?
Thanks
matthias
--
Matthias Apitz | /"\ ASCII Ribbon Campaign:
E-mail: guru at unixarea.de | \ / - No HTML/RTF in E-mail
WWW: http://www.unixarea.de/ | X - No proprietary attachments
phone: +49-176-38902045 | / \ - Respect for open standards
| en.wikipedia.org/wiki/ASCII_Ribbon_Campaign