svn commit: r265361 - stable/10/lib/libc/locale
Pedro F. Giffuni
pfg at FreeBSD.org
Mon May 5 14:50:53 UTC 2014
Author: pfg
Date: Mon May 5 14:50:53 2014
New Revision: 265361
URL: http://svnweb.freebsd.org/changeset/base/265361
Log:
MFC r265095, r265167;
citrus: Avoid invalid code points.
The UTF-8 decoder should not accept byte sequences which decode to
Unicode code positions U+D800 to U+DFFF (UTF-16 surrogates).[1]
Contrary to the original OpenBSD patch, we do pass U+FFFE and U+FFFF;
both values are valid "non-characters" [2] and must be mapped through
UTFs.
[1] http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
[2] http://www.unicode.org/faq/private_use.html
Reported by: Stefan Sperling [1]
Thanks to: jilles [2]
Obtained from: OpenBSD
Modified:
stable/10/lib/libc/locale/utf8.c
Modified: stable/10/lib/libc/locale/utf8.c
==============================================================================
--- stable/10/lib/libc/locale/utf8.c Mon May 5 14:50:44 2014 (r265360)
+++ stable/10/lib/libc/locale/utf8.c Mon May 5 14:50:53 2014 (r265361)
@@ -203,6 +203,13 @@ _UTF8_mbrtowc(wchar_t * __restrict pwc,
 		errno = EILSEQ;
 		return ((size_t)-1);
 	}
+	if (wch >= 0xd800 && wch <= 0xdfff) {
+		/*
+		 * Malformed input; invalid code points.
+		 */
+		errno = EILSEQ;
+		return ((size_t)-1);
+	}
 	if (pwc != NULL)
 		*pwc = wch;
 	us->want = 0;
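
The effect of the new check can be illustrated outside libc with a small
standalone decoder. This is only a sketch under stated assumptions, not the
code in utf8.c: the name utf8_decode_one is hypothetical, the sketch handles
sequences of at most four bytes, and the upper bound of U+10FFFF is the
sketch's own choice rather than part of this commit. It applies the same
order of checks as the hunk above: redundant (overlong) encodings first,
then the surrogate range, while U+FFFE and U+FFFF pass through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the libc implementation): decode one UTF-8
 * sequence from s[0..n). Returns the number of bytes consumed, or -1 if
 * the input is malformed or truncated.
 */
int
utf8_decode_one(const unsigned char *s, size_t n, uint32_t *out)
{
	uint32_t ch, lowerbound;
	int i, want;

	assert(s != NULL && out != NULL);
	if (n == 0)
		return (-1);

	/* The lead byte determines the length and the smallest legal value. */
	if (s[0] < 0x80) {
		ch = s[0]; want = 1; lowerbound = 0;
	} else if ((s[0] & 0xe0) == 0xc0) {
		ch = s[0] & 0x1f; want = 2; lowerbound = 0x80;
	} else if ((s[0] & 0xf0) == 0xe0) {
		ch = s[0] & 0x0f; want = 3; lowerbound = 0x800;
	} else if ((s[0] & 0xf8) == 0xf0) {
		ch = s[0] & 0x07; want = 4; lowerbound = 0x10000;
	} else
		return (-1);
	if (n < (size_t)want)
		return (-1);

	/* Each continuation byte must look like 10xxxxxx. */
	for (i = 1; i < want; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return (-1);
		ch = (ch << 6) | (s[i] & 0x3f);
	}

	/* Malformed input; redundant encoding. */
	if (ch < lowerbound)
		return (-1);
	/* Malformed input; UTF-16 surrogates are not valid in UTF-8. */
	if (ch >= 0xd800 && ch <= 0xdfff)
		return (-1);
	/* Assumption of this sketch: nothing above U+10FFFF is accepted. */
	if (ch > 0x10ffff)
		return (-1);

	*out = ch;
	return (want);
}
```

With this sketch, the bytes ED A0 80 (which would decode to U+D800) and
ED BF BF (U+DFFF) are rejected, while the neighbouring ED 9F BF (U+D7FF)
and EE 80 80 (U+E000) are accepted, as are EF BF BE (U+FFFE) and
EF BF BF (U+FFFF).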