kern/114578: [libc] wide character printing using swprintf(dst, n, "%ls", txt) fails depending on LC_CTYPE

Mon Sep 29 08:10:09 UTC 2008

The following reply was made to PR kern/114578; it has been noted by GNATS.

From: Christoph Mallon <christoph.mallon at gmx.de>
To: bug-followup at FreeBSD.org, das at FreeBSD.org
Cc:  
Subject: Re: kern/114578: [libc] wide character printing using swprintf(dst,
 n, "%ls", txt) fails depending on LC_CTYPE
Date: Mon, 29 Sep 2008 10:01:20 +0200

 > fputwc(3) has similar language about copying the character to the
 > output stream, but POSIX still says it can fail with EILSEQ if the
 > wide character doesn't exist in the current locale.

 fputwc() is entirely different from swprintf(): fputwc() writes to a 
 stream, swprintf() writes to an array of wchar_t.

 > This isn't my area of expertise, but the present behavior seems
 > correct.

 No, it isn't.

 > If the current locale doesn't support a given wide
 > character, we should not invent a multibyte character sequence for
 > it, because the other end of the stream may not even be able to
 > interpret it.

 The format string of swprintf() is a wide string (const wchar_t *) and 
 the destination buffer of swprintf() is an array of wchar_t. So there 
 are absolutely no locale conversions involved and no multibyte sequences 
 have to be invented, as you suggested. All that should happen is that 
 the wchar_ts are copied from the source to the destination unchanged. 
 The standard, which I already quoted, is quite clear in this respect. 
 The current implementation, which internally converts from wchar_t to 
 the multibyte encoding of the current locale and back to wchar_t, is 
 just an implementation hack, and it breaks whenever the current locale 
 cannot represent all of Unicode.
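
 To make the expected behaviour concrete, here is a minimal sketch of the 
 check I have in mind. It is not the libc code; copy_wide() and 
 roundtrip_ok() are hypothetical helper names used only for illustration. 
 It formats a wide string through L"%ls" and tests whether the 
 destination holds exactly the same wchar_ts as the source.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: format src into dst through swprintf(L"%ls").
 * Returns the number of wide characters written (excluding the
 * terminating L'\0'), or a negative value if swprintf() fails. */
static int copy_wide(wchar_t *dst, size_t n, const wchar_t *src)
{
    assert(dst != NULL && src != NULL && n > 0);
    return swprintf(dst, n, L"%ls", src);
}

/* Hypothetical check: returns 1 if src came through unchanged,
 * 0 if swprintf() failed or altered the text. The 64-element
 * buffer is an arbitrary size chosen for this sketch. */
static int roundtrip_ok(const wchar_t *src)
{
    wchar_t dst[64];
    int len = copy_wide(dst, sizeof dst / sizeof dst[0], src);

    return len >= 0
        && (size_t)len == wcslen(src)
        && wcscmp(dst, src) == 0;
}
```

 Under the reading argued above, roundtrip_ok(L"\u20ac") should return 1 
 no matter what setlocale(LC_CTYPE, ...) was called with beforehand. With 
 an implementation that detours through the multibyte encoding, it can 
 return 0 in a locale that has no encoding for that character.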