[Bug 272334] Misleading 'iconv -l' output
Date: Tue, 04 Jul 2023 20:34:35 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272334

--- Comment #1 from bruno@clisp.org ---
The description contains just the first among 20 issues with the
'iconv -l' output. Here are the further ones:

2) The line
=====================================================================
ARMSCII-8 AST166-8 AST_34.002 ARMSCII-8A AST166-A AST_34.002_A
=====================================================================
should be split into two lines, because ARMSCII-8 and ARMSCII-8A are different encodings:
=====================================================================
ARMSCII-8 AST166-8 AST_34.002
ARMSCII-8A AST166-A AST_34.002_A
=====================================================================

3) The line
=====================================================================
BIG5-E BIG5E BIG-5 BIG-FIVE BIG5 BIG5-ETEN BIG5ETEN BIGFIVE CN-BIG5 CSBIG5
=====================================================================
should be split into two lines, because BIG5-E and BIG-5 are different encodings:
=====================================================================
BIG5-E BIG5E
BIG-5 BIG-FIVE BIG5 BIG5-ETEN BIG5ETEN BIGFIVE CN-BIG5 CSBIG5
=====================================================================

4) The line
=====================================================================
CP942 942 IBM942 942C CP942C IBM942C
=====================================================================
should be split into two lines, because CP942 and CP942C are different encodings:
=====================================================================
CP942 942 IBM942
CP942C 942C IBM942C
=====================================================================

5) The line
=====================================================================
CP943 943 IBM943 943C CP943C IBM943C
=====================================================================
should be split into two lines, because CP943 and CP943C are different encodings:
=====================================================================
CP943 943 IBM943
CP943C 943C IBM943C
=====================================================================

6) The line
=====================================================================
ISO646-CA CA CSA7-1 CSA_Z243.4-1985-1 ISO-IR-121 CSA7-2 CSA_Z243.4-1985-2 ISO-IR-122 ISO646-CA2
=====================================================================
should be split into two lines, because ISO646-CA and ISO646-CA2 are different encodings:
=====================================================================
ISO646-CA CA CSA7-1 CSA_Z243.4-1985-1 ISO-IR-121
ISO646-CA2 CSA7-2 CSA_Z243.4-1985-2 ISO-IR-122
=====================================================================

7) The line
=====================================================================
ISO646-ES ES ISO-IR-17 ES2 ISO-IR-85 ISO646-ES2
=====================================================================
should be split into two lines, because ISO646-ES and ISO646-ES2 are different encodings:
=====================================================================
ISO646-ES ES ISO-IR-17
ISO646-ES2 ES2 ISO-IR-85
=====================================================================

8) The line
=====================================================================
ISO646-FR FR ISO-IR-69 NF_Z_62-010 ISO-IR-25 ISO646-FR1 NF_Z_62-010_(1973)
=====================================================================
should be split into two lines, because ISO646-FR and ISO646-FR1 are different encodings:
=====================================================================
ISO646-FR FR ISO-IR-69 NF_Z_62-010
ISO646-FR1 ISO-IR-25 NF_Z_62-010_(1973)
=====================================================================

9) The line
=====================================================================
ISO646-NO ISO-IR-60 NO NS_4551-1 ISO-IR-61 ISO646-NO2 NO2 NS_4551-2
=====================================================================
should be split
into two lines, because ISO646-NO and ISO646-NO2 are different encodings:
=====================================================================
ISO646-NO ISO-IR-60 NO NS_4551-1
ISO646-NO2 ISO-IR-61 NO2 NS_4551-2
=====================================================================

10) The line
=====================================================================
ISO646-PT ISO-IR-16 PT ISO-IR-84 ISO646-PT2 PT2
=====================================================================
should be split into two lines, because ISO646-PT and ISO646-PT2 are different encodings:
=====================================================================
ISO646-PT ISO-IR-16 PT
ISO646-PT2 ISO-IR-84 PT2
=====================================================================

11) The line
=====================================================================
ISO646-SE FI ISO-IR-10 ISO646-FI SE SEN_850200_B ISO-IR-11 ISO646-SE2 SE2 SEN_850200_C
=====================================================================
should be split into two lines, because ISO646-SE and ISO646-SE2 are different encodings:
=====================================================================
ISO646-SE FI ISO-IR-10 ISO646-FI SE SEN_850200_B
ISO646-SE2 ISO-IR-11 SE2 SEN_850200_C
=====================================================================

12) The line
=====================================================================
KOI8-R KOI8-RU
=====================================================================
should be split into two lines, because KOI8-R and KOI8-RU are different encodings:
=====================================================================
KOI8-R
KOI8-RU
=====================================================================

13) The line
=====================================================================
MACROMAN CSMACINTOSH MAC MACINTOSH MACROMANIA MACROMANIAN
=====================================================================
should be split into two lines, because MACROMAN and MACROMANIA
are different encodings:
=====================================================================
MACROMAN CSMACINTOSH MAC MACINTOSH
MACROMANIA MACROMANIAN
=====================================================================

14) The line
=====================================================================
UTF-16 UNICODE UTF16 CSUNICODE CSUNICODE11 ISO-10646-UCS-2 UCS-2 UCS-2BE UNICODE-1-1 UNICODEBIG UTF-16BE UTF16BE UCS-2LE UNICODELITTLE UTF-16LE UTF16LE
=====================================================================
should be split into two lines, because UTF-16BE and UTF-16LE are different encodings:
=====================================================================
UTF-16 UNICODE UTF16 CSUNICODE CSUNICODE11 ISO-10646-UCS-2 UCS-2 UCS-2BE UNICODE-1-1 UNICODEBIG UTF-16BE UTF16BE
UCS-2LE UNICODELITTLE UTF-16LE UTF16LE
=====================================================================

15) The line
=====================================================================
UTF-32 CSUCS4 ISO-10646-UCS-4 UCS-4 UCS-4BE UTF-32BE UTF32BE UCS-4LE UTF-32LE UTF32LE
=====================================================================
should be split into two lines, because UTF-32BE and UTF-32LE are different encodings:
=====================================================================
UTF-32 CSUCS4 ISO-10646-UCS-4 UCS-4 UCS-4BE UTF-32BE UTF32BE
UCS-4LE UTF-32LE UTF32LE
=====================================================================

16) The lines
=====================================================================
CP10029 10029 CP10029_MACLATIN2
MACCENTEURO MACCENTRALEUROPE
=====================================================================
should be joined into a single line, because these encodings are identical:
=====================================================================
CP10029 10029 CP10029_MACLATIN2 MACCENTEURO MACCENTRALEUROPE
=====================================================================

17) The entry ISO646-BASIC@1983 should be
removed, since iconv_open returns EINVAL for it. Then, among the lines
=====================================================================
ISO646-BASIC:1983 ISO_646.BASIC:1983 REF REF
ISO646-BASIC:1983
=====================================================================
the second one should be removed, since it is part of the first line:
=====================================================================
ISO646-BASIC:1983 ISO_646.BASIC:1983 REF REF
=====================================================================

18) The entry ISO646-IRV@1983 should be removed, since iconv_open returns EINVAL for it. Then, among the lines
=====================================================================
ISO646-IRV:1983 IRV ISO-IR-2
ISO646-IRV:1983
=====================================================================
the second one should be removed, since it is part of the first line:
=====================================================================
ISO646-IRV:1983 IRV ISO-IR-2
=====================================================================

19) The entry JISX0208@1990 should be removed, since iconv_open returns EINVAL for it.
Then, among the lines
=====================================================================
JISX0208:1990 CSISO87JISX0208 ISO-IR-87 JIS0208 JISX0208-1990 JIS_C6226-1983 JIS_X0208 JIS_X0208-1983 JIS_X0208-1990 JIS_X0208:1990 X0208
JISX0208:1990
=====================================================================
the second one should be removed, since it is part of the first line:
=====================================================================
JISX0208:1990 CSISO87JISX0208 ISO-IR-87 JIS0208 JISX0208-1990 JIS_C6226-1983 JIS_X0208 JIS_X0208-1983 JIS_X0208-1990 JIS_X0208:1990 X0208
=====================================================================

20) The entry WINDOWS-874 occurs in two different lines:
=====================================================================
CP1162 1162 CSIBM1162 IBM-1162 IBM1162 MSCP874 WINDOWS-874
CP874 874 IBM874 WINDOWS-874
=====================================================================
It should be removed from the first line, since the WINDOWS-874 encoding is identical to CP874 and different from CP1162:
=====================================================================
CP1162 1162 CSIBM1162 IBM-1162 IBM1162 MSCP874
CP874 874 IBM874 WINDOWS-874
=====================================================================

As proof, I'm attaching the encoding tables, which I got by running e.g.
  ./test-from WINDOWS-874 > WINDOWS-874.TXT
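
Issues 17) to 20) all involve a name that shows up on more than one line of the listing. A rough way to spot such cases mechanically is sketched below. It is only an illustration and not part of the attached test programs: the function name find_repeated_names is made up here, and it assumes that every line of the 'iconv -l' output is a whitespace-separated list of names for one encoding and that names compare case-insensitively.

```python
from collections import defaultdict


def find_repeated_names(listing: str) -> dict[str, list[int]]:
    """Return the names that occur on more than one line of the listing.

    Keys are upper-cased names; values are the 1-based line numbers
    on which the name appears.
    """
    seen = defaultdict(list)
    for lineno, line in enumerate(listing.splitlines(), start=1):
        # Assumption: one line holds all names of one encoding,
        # separated by whitespace. A name repeated within the same
        # line is counted once for that line.
        for name in set(line.split()):
            seen[name.upper()].append(lineno)
    return {name: lines for name, lines in seen.items() if len(lines) > 1}


if __name__ == "__main__":
    # The two lines quoted in issue 20).
    sample = (
        "CP1162 1162 CSIBM1162 IBM-1162 IBM1162 MSCP874 WINDOWS-874\n"
        "CP874 874 IBM874 WINDOWS-874\n"
    )
    print(find_repeated_names(sample))  # {'WINDOWS-874': [1, 2]}
```

This only finds names listed twice. It cannot tell whether two names on the same line really denote the same encoding (issues 2 to 16); that still needs a comparison of the conversion tables.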