From nobody Thu Aug 11 16:43:49 2022
Date: Thu, 11 Aug 2022 16:43:49 GMT
Message-Id: <202208111643.27BGhnvS082714@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Kyle Evans
Subject: git: 693f88c9da8d - main - iconv_std: complete the //IGNORE support
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: kevans
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 693f88c9da8dccf173b40fd57d1d15504a54e9b4
Auto-Submitted: auto-generated

The branch main has been updated by kevans:

URL: https://cgit.FreeBSD.org/src/commit/?id=693f88c9da8dccf173b40fd57d1d15504a54e9b4

commit 693f88c9da8dccf173b40fd57d1d15504a54e9b4
Author:     Kyle Evans
AuthorDate: 2022-02-22 07:15:04 +0000
Commit:     Kyle Evans
CommitDate: 2022-08-11 16:42:20 +0000

    iconv_std: complete the //IGNORE support

    Previously, it would only ignore failures due to csmapper conversion
    failure.  It may be the case that the input string contains invalid
    sequences that also need to be ignored.

    A good example of //IGNORE application is sanitizing user- or
    remotely-specified strings that are expected to be UTF-8; perhaps as
    part of a pipeline that will feed the result into a system less tested
    against or tolerant of illegal UTF-8 sequences.

    Sponsored by:   Klara, Inc.
    Differential Revision:  https://reviews.freebsd.org/D34345
---
 lib/libiconv_modules/iconv_std/citrus_iconv_std.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/lib/libiconv_modules/iconv_std/citrus_iconv_std.c b/lib/libiconv_modules/iconv_std/citrus_iconv_std.c
index ec9f21de541e..73dc75abacbb 100644
--- a/lib/libiconv_modules/iconv_std/citrus_iconv_std.c
+++ b/lib/libiconv_modules/iconv_std/citrus_iconv_std.c
@@ -472,7 +472,7 @@ _citrus_iconv_std_iconv_convert(struct _citrus_iconv * __restrict cv,
 	_csid_t csid;
 	_index_t idx;
 	char *tmpin;
-	size_t inval, szrin, szrout;
+	size_t inval, in_mb_cur_min, szrin, szrout;
 	int ret, state = 0;
 
 	inval = 0;
@@ -504,6 +504,8 @@ _citrus_iconv_std_iconv_convert(struct _citrus_iconv * __restrict cv,
 		return (0);
 	}
 
+	in_mb_cur_min = _stdenc_get_mb_cur_min(is->is_src_encoding);
+
 	/* normal case */
 	for (;;) {
 		if (*inbytes == 0) {
@@ -522,8 +524,20 @@ _citrus_iconv_std_iconv_convert(struct _citrus_iconv * __restrict cv,
 		szrin = szrout = 0;
 		ret = mbtocsx(&sc->sc_src_encoding, &csid, &idx, &tmpin,
 		    *inbytes, &szrin, cv->cv_shared->ci_hooks);
-		if (ret)
+		if (ret != 0 && (ret != EILSEQ ||
+		    !cv->cv_shared->ci_discard_ilseq)) {
 			goto err;
+		} else if (ret == EILSEQ) {
+			/*
+			 * If //IGNORE was specified, we'll just keep crunching
+			 * through invalid characters.
+			 */
+			*in += in_mb_cur_min;
+			*inbytes -= in_mb_cur_min;
+			restore_encoding_state(&sc->sc_src_encoding);
+			restore_encoding_state(&sc->sc_dst_encoding);
+			continue;
+		}
 		if (szrin == (size_t)-2) {
 			/* incompleted character */