[Bug 278229] iconv mapping tables for ISO 8859-2 and 8859-3 contain garbage
Date: Sun, 07 Apr 2024 11:03:31 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278229

            Bug ID: 278229
           Summary: iconv mapping tables for ISO 8859-2 and 8859-3 contain garbage
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: bin
          Assignee: bugs@FreeBSD.org
          Reporter: eichelberg@offis.de

Created attachment 249799
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=249799&action=edit
Fixed mapping tables for ISO-8859-2, -3 and -5

The iconv mapping tables from Unicode to ISO-8859-2 and ISO-8859-3 contain
many incorrect mappings: Unicode characters that are not available in the ISO
character set are mapped to four-character sequences that essentially contain
garbage. In the FreeBSD source tree, the source files for these mapping tables
are located in /share/i18n/csmapper/ISO-8859/.

Note that the mapping tables for ISO 8859-4 to 8859-16 are between 16 and
24 kBytes in size, whereas the mapping tables for ISO 8859-2 and ISO 8859-3
are over a megabyte each. Apparently the majority of the mappings that map one
Unicode code position to a sequence of four ISO characters contain garbage.
This issue was already present in the initial commit of these files in
February 2011 and has apparently never been noticed.

Attached to this bug report are corrected mapping tables for ISO 8859-2 and
ISO 8859-3. These retain all mappings to four-byte character sequences that
are also present in the other ISO 8859 mapping tables and remove all others.

Furthermore, a corrected table for ISO 8859-5 is attached. The current table
contains many duplicate lines, which do not cause problems when processed but
are unnecessary and should be removed.
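
For reference, a minimal Python sketch of how duplicate lines like those in the ISO 8859-5 table could be found and dropped. This is not the script used to produce the attachment. It assumes the mapping source is plain text with one entry per line and that lines starting with "#" are comments; the function names and the sample entries are hypothetical and only illustrate the idea.

```python
from collections import Counter


def _normalise(line):
    # Collapse runs of whitespace so that "a  =  b" and "a = b" compare equal.
    return " ".join(line.split())


def find_duplicate_lines(text):
    """Return {normalised line: count} for entries that occur more than once.

    Blank lines and lines starting with '#' (assumed to be comments) are ignored.
    """
    counts = Counter()
    for raw in text.splitlines():
        line = _normalise(raw)
        if not line or line.startswith("#"):
            continue
        counts[line] += 1
    return {line: n for line, n in counts.items() if n > 1}


def drop_duplicate_lines(text):
    """Keep the first occurrence of every entry, preserving the original order.

    Blank lines and assumed comment lines are passed through unchanged.
    """
    seen = set()
    kept = []
    for raw in text.splitlines():
        key = _normalise(raw)
        if key and not key.startswith("#"):
            if key in seen:
                continue  # later repeat of an entry already emitted
            seen.add(key)
        kept.append(raw)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical sample entries, not taken from the real table.
    sample = (
        "# illustrative entries only\n"
        "0x0401 = 0xA1\n"
        "0x0402 = 0xA2\n"
        "0x0401  =  0xA1\n"
    )
    print(find_duplicate_lines(sample))
    print(drop_duplicate_lines(sample), end="")
```

This only catches lines that are identical after whitespace normalisation. It does not detect two entries that map the same source code position to different targets, and it does not judge whether a mapping is garbage.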