git: 41b768ae1970 - stable/13 - contrib/expat: import expat 2.7.1
Date: Sat, 05 Apr 2025 03:19:42 UTC
The branch stable/13 has been updated by philip:

URL: https://cgit.FreeBSD.org/src/commit/?id=41b768ae1970ed484abaaea401453c3902df93c2

commit 41b768ae1970ed484abaaea401453c3902df93c2
Author:     Philip Paeps <philip@FreeBSD.org>
AuthorDate: 2025-04-02 08:56:02 +0000
Commit:     Philip Paeps <philip@FreeBSD.org>
CommitDate: 2025-04-05 03:19:08 +0000

    contrib/expat: import expat 2.7.1

    Changes:        https://github.com/libexpat/libexpat/blob/R_2_7_1/expat/Changes
                    https://github.com/libexpat/libexpat/blob/R_2_7_0/expat/Changes

    Security:       CVE-2024-8176

    (cherry picked from commit fe9278888fd4414abe2d922e469cf608005f4c65)
---
 contrib/expat/COPYING                       |    2 +-
 contrib/expat/Changes                       |  123 +++++-
 contrib/expat/Makefile.am                   |    4 +-
 contrib/expat/Makefile.in                   |    4 +-
 contrib/expat/README.md                     |   18 +-
 contrib/expat/configure.ac                  |    4 +-
 contrib/expat/doc/reference.html            |    9 +-
 contrib/expat/doc/xmlwf.1                   |    2 +-
 contrib/expat/doc/xmlwf.xml                 |    4 +-
 contrib/expat/fuzz/xml_lpm_fuzzer.cpp       |  464 ++++++++++++++++++++++
 contrib/expat/fuzz/xml_lpm_fuzzer.proto     |   58 +++
 contrib/expat/fuzz/xml_parse_fuzzer.c       |    2 +-
 contrib/expat/fuzz/xml_parsebuffer_fuzzer.c |    2 +-
 contrib/expat/lib/expat.h                   |    6 +-
 contrib/expat/lib/internal.h                |    5 +-
 contrib/expat/lib/xmlparse.c                |  586 ++++++++++++++++++++--------
 contrib/expat/tests/acc_tests.c             |    5 +-
 contrib/expat/tests/alloc_tests.c           |   27 ++
 contrib/expat/tests/basic_tests.c           |  331 +++++++++++++++-
 contrib/expat/tests/benchmark/benchmark.c   |   57 ++-
 contrib/expat/tests/common.c                |   33 +-
 contrib/expat/tests/common.h                |    4 +-
 contrib/expat/tests/handlers.c              |   23 ++
 contrib/expat/tests/handlers.h              |    9 +
 contrib/expat/tests/minicheck.h             |    6 +-
 contrib/expat/tests/misc_tests.c            |  247 ++++++++++--
 contrib/expat/tests/xmltest.sh              |    5 +-
 contrib/expat/xmlwf/readfilemap.c           |    3 +-
 28 files changed, 1779 insertions(+), 264 deletions(-)

diff --git a/contrib/expat/COPYING b/contrib/expat/COPYING
index ce9e5939291e..c6d184a8aae8 100644
--- a/contrib/expat/COPYING
+++ b/contrib/expat/COPYING
@@ -1,5 +1,5 @@
Copyright (c) 1998-2000 Thai Open Source Software Center Ltd and Clark Cooper -Copyright (c) 2001-2022 Expat maintainers +Copyright (c) 2001-2025 Expat maintainers Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the diff --git a/contrib/expat/Changes b/contrib/expat/Changes index aa19f70ae219..9d6c64b6a460 100644 --- a/contrib/expat/Changes +++ b/contrib/expat/Changes @@ -11,16 +11,23 @@ !! The following topics need *additional skilled C developers* to progress !! !! in a timely manner or at all (loosely ordered by descending priority): !! !! !! -!! - <blink>fixing a complex non-public security issue</blink>, !! !! - teaming up on researching and fixing future security reports and !! !! ClusterFuzz findings with few-days-max response times in communication !! !! in order to (1) have a sound fix ready before the end of a 90 days !! !! grace period and (2) in a sustainable manner, !! +!! - helping CPython Expat bindings with supporting Expat's billion laughs !! +!! attack protection API (https://github.com/python/cpython/issues/90949): !! +!! - XML_SetBillionLaughsAttackProtectionActivationThreshold !! +!! - XML_SetBillionLaughsAttackProtectionMaximumAmplification !! +!! - helping Perl's XML::Parser Expat bindings with supporting Expat's !! +!! security API (https://github.com/cpan-authors/XML-Parser/issues/102): !! +!! - XML_SetBillionLaughsAttackProtectionActivationThreshold !! +!! - XML_SetBillionLaughsAttackProtectionMaximumAmplification !! +!! - XML_SetReparseDeferralEnabled !! !! - implementing and auto-testing XML 1.0r5 support !! !! (needs discussion before pull requests), !! !! - smart ideas on fixing the Autotools CMake files generation issue !! !! without breaking CI (needs discussion before pull requests), !! -!! - the Windows binaries topic (needs requirements engineering first), !! !! - pushing migration from `int` to `size_t` further !! !! 
including edge-cases test coverage (needs discussion before anything). !! !! !! @@ -30,6 +37,116 @@ !! THANK YOU! Sebastian Pipping -- Berlin, 2024-03-09 !! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +Release 2.7.1 Thu March 27 2025 + Bug fixes: + #980 #989 Restore event pointer behavior from Expat 2.6.4 + (that the fix to CVE-2024-8176 changed in 2.7.0); + affected API functions are: + - XML_GetCurrentByteCount + - XML_GetCurrentByteIndex + - XML_GetCurrentColumnNumber + - XML_GetCurrentLineNumber + - XML_GetInputContext + + Other changes: + #976 #977 Autotools: Integrate files "fuzz/xml_lpm_fuzzer.{cpp,proto}" + with Automake that were missing from 2.7.0 release tarballs + #983 #984 Fix printf format specifiers for 32bit Emscripten + #992 docs: Promote OpenSSF Best Practices self-certification + #978 tests/benchmark: Resolve mistaken double close + #986 Address compiler warnings + #990 #993 Version info bumped from 11:1:10 (libexpat*.so.1.10.1) + to 11:2:10 (libexpat*.so.1.10.2); see https://verbump.de/ + for what these numbers do + + Infrastructure: + #982 CI: Start running Perl XML::Parser integration tests + #987 CI: Enforce Clang Static Analyzer clean code + #991 CI: Re-enable warning clang-analyzer-valist.Uninitialized + for clang-tidy + #981 CI: Cover compilation with musl + #983 #984 CI: Cover compilation with 32bit Emscripten + #976 #977 CI: Protect against fuzzer files missing from future + release archives + + Special thanks to: + Berkay Eren Ürün + Matthew Fernandez + and + Perl XML::Parser + +Release 2.7.0 Thu March 13 2025 + Security fixes: + #893 #973 CVE-2024-8176 -- Fix crash from chaining a large number + of entities caused by stack overflow by resolving use of + recursion, for all three uses of entities: + - general entities in character data ("<e>&g1;</e>") + - general entities in attribute values ("<e k1='&g1;'/>") + - parameter entities ("%p1;") + Known impact is (reliable and easy) denial of service: + 
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H/E:H/RL:O/RC:C + (Base Score: 7.5, Temporal Score: 7.2) + Please note that a layer of compression around XML can + significantly reduce the minimum attack payload size. + + Other changes: + #935 #937 Autotools: Make generated CMake files look for + libexpat.@SO_MAJOR@.dylib on macOS + #925 Autotools: Sync CMake templates with CMake 3.29 + #945 #962 #966 CMake: Drop support for CMake <3.13 + #942 CMake: Small fuzzing related improvements + #921 docs: Add missing documentation of error code + XML_ERROR_NOT_STARTED that was introduced with 2.6.4 + #941 docs: Document need for C++11 compiler for use from C++ + #959 tests/benchmark: Fix a (harmless) TOCTTOU + #944 Windows: Fix installer target location of file xmlwf.xml + for CMake + #953 Windows: Address warning -Wunknown-warning-option + about -Wno-pedantic-ms-format from LLVM MinGW + #971 Address Cppcheck warnings + #969 #970 Mass-migrate links from http:// to https:// + #947 #958 .. + #974 #975 Document changes since the previous release + #974 #975 Version info bumped from 11:0:10 (libexpat*.so.1.10.0) + to 11:1:10 (libexpat*.so.1.10.1); see https://verbump.de/ + for what these numbers do + + Infrastructure: + #926 tests: Increase robustness + #927 #932 .. + #930 #933 tests: Increase test coverage + #617 #950 .. + #951 #952 .. + #954 #955 .. Fuzzing: Add new fuzzer "xml_lpm_fuzzer" based on + #961 Google's libprotobuf-mutator ("LPM") + #957 Fuzzing|CI: Start producing fuzzing code coverage reports + #936 CI: Pass -q -q for LCOV >=2.1 in coverage.sh + #942 CI: Small fuzzing related improvements + #139 #203 .. 
+ #791 #946 CI: Make GitHub Actions build using MSVC on Windows and + produce 32bit and 64bit Windows binaries + #956 CI: Get off of about-to-be-removed Ubuntu 20.04 + #960 #964 CI: Start uploading to Coverity Scan for static analysis + #972 CI: Stop loading DTD from the internet to address flaky CI + #971 CI: Adapt to breaking changes in Cppcheck + + Special thanks to: + Alexander Gieringer + Berkay Eren Ürün + Hanno Böck + Jann Horn + Mark Brand + Sebastian Andrzej Siewior + Snild Dolkow + Thomas Pröll + Tomas Korbar + valord577 + and + Google Project Zero + Linutronix + Red Hat + Siemens + Release 2.6.4 Wed November 6 2024 Security fixes: #915 CVE-2024-50602 -- Fix crash within function XML_ResumeParser @@ -46,6 +163,8 @@ Release 2.6.4 Wed November 6 2024 #904 tests: Resolve duplicate handler #317 #918 tests: Improve tests on doctype closing (ex CVE-2019-15903) #914 Fix signedness of format strings + #915 For use from C++, expat.h started requiring C++11 due to + use of C99 features #919 #920 Version info bumped from 10:3:9 (libexpat*.so.1.9.3) to 11:0:10 (libexpat*.so.1.10.0); see https://verbump.de/ for what these numbers do diff --git a/contrib/expat/Makefile.am b/contrib/expat/Makefile.am index 7d8e17c2cf86..c20531a8d6c6 100644 --- a/contrib/expat/Makefile.am +++ b/contrib/expat/Makefile.am @@ -6,7 +6,7 @@ # \___/_/\_\ .__/ \__,_|\__| # |_| XML parser # -# Copyright (c) 2017-2023 Sebastian Pipping <sebastian@pipping.org> +# Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org> # Copyright (c) 2018 KangLin <kl222@126.com> # Copyright (c) 2022 Johnny Jazeix <jazeix@gmail.com> # Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com> @@ -96,6 +96,8 @@ EXTRA_DIST = \ conftools/expat.m4 \ conftools/get-version.sh \ \ + fuzz/xml_lpm_fuzzer.cpp \ + fuzz/xml_lpm_fuzzer.proto \ fuzz/xml_parsebuffer_fuzzer.c \ fuzz/xml_parse_fuzzer.c \ \ diff --git a/contrib/expat/Makefile.in b/contrib/expat/Makefile.in index c0fcb5dd05d1..069ec4047eea 100644 
--- a/contrib/expat/Makefile.in +++ b/contrib/expat/Makefile.in @@ -22,7 +22,7 @@ # \___/_/\_\ .__/ \__,_|\__| # |_| XML parser # -# Copyright (c) 2017-2023 Sebastian Pipping <sebastian@pipping.org> +# Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org> # Copyright (c) 2018 KangLin <kl222@126.com> # Copyright (c) 2022 Johnny Jazeix <jazeix@gmail.com> # Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com> @@ -494,6 +494,8 @@ EXTRA_DIST = \ conftools/expat.m4 \ conftools/get-version.sh \ \ + fuzz/xml_lpm_fuzzer.cpp \ + fuzz/xml_lpm_fuzzer.proto \ fuzz/xml_parsebuffer_fuzzer.c \ fuzz/xml_parse_fuzzer.c \ \ diff --git a/contrib/expat/README.md b/contrib/expat/README.md index 23d26dad2b92..77c6bf27d307 100644 --- a/contrib/expat/README.md +++ b/contrib/expat/README.md @@ -3,6 +3,7 @@ [](https://repology.org/metapackage/expat/versions) [](https://sourceforge.net/projects/expat/files/) [](https://github.com/libexpat/libexpat/releases) +[](https://www.bestpractices.dev/projects/10205) > [!CAUTION] > @@ -11,7 +12,7 @@ > at the top of the `Changes` file. -# Expat, Release 2.6.4 +# Expat, Release 2.7.1 This is Expat, a C99 library for parsing [XML 1.0 Fourth Edition](https://www.w3.org/TR/2006/REC-xml-20060816/), started by @@ -22,9 +23,9 @@ are called when the parser discovers the associated structures in the document being parsed. A start tag is an example of the kind of structures for which you may register handlers. 
-Expat supports the following compilers: +Expat supports the following C99 compilers: -- GNU GCC >=4.5 +- GNU GCC >=4.5 (for use from C) or GNU GCC >=4.8.1 (for use from C++) - LLVM Clang >=3.5 - Microsoft Visual Studio >=16.0/2019 (rolling `${today} minus 5 years`) @@ -52,7 +53,7 @@ This approach leverages CMake's own [module `FindEXPAT`](https://cmake.org/cmake Notice the *uppercase* `EXPAT` in the following example: ```cmake -cmake_minimum_required(VERSION 3.0) # or 3.10, see below +cmake_minimum_required(VERSION 3.10) project(hello VERSION 1.0.0) @@ -62,12 +63,7 @@ add_executable(hello hello.c ) -# a) for CMake >=3.10 (see CMake's FindEXPAT docs) target_link_libraries(hello PUBLIC EXPAT::EXPAT) - -# b) for CMake >=3.0 -target_include_directories(hello PRIVATE ${EXPAT_INCLUDE_DIRS}) -target_link_libraries(hello PUBLIC ${EXPAT_LIBRARIES}) ``` ### b) `find_package` with Config Mode @@ -85,7 +81,7 @@ or Notice the *lowercase* `expat` in the following example: ```cmake -cmake_minimum_required(VERSION 3.0) +cmake_minimum_required(VERSION 3.10) project(hello VERSION 1.0.0) @@ -295,7 +291,7 @@ EXPAT_ENABLE_INSTALL:BOOL=ON // Use /MT flag (static CRT) when compiling in MSVC EXPAT_MSVC_STATIC_CRT:BOOL=OFF -// Build fuzzers via ossfuzz for the expat library +// Build fuzzers via OSS-Fuzz for the expat library EXPAT_OSSFUZZ_BUILD:BOOL=OFF // Build a shared expat library diff --git a/contrib/expat/configure.ac b/contrib/expat/configure.ac index fffcd125e9c4..0c88b8867019 100644 --- a/contrib/expat/configure.ac +++ b/contrib/expat/configure.ac @@ -11,7 +11,7 @@ dnl Copyright (c) 2000 Clark Cooper <coopercc@users.sourceforge.net> dnl Copyright (c) 2000-2005 Fred L. Drake, Jr. 
<fdrake@users.sourceforge.net> dnl Copyright (c) 2001-2003 Greg Stein <gstein@users.sourceforge.net> dnl Copyright (c) 2006-2012 Karl Waclawek <karl@waclawek.net> -dnl Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org> +dnl Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org> dnl Copyright (c) 2017 S. P. Zeidler <spz@netbsd.org> dnl Copyright (c) 2017 Stephen Groat <stephen@groat.us> dnl Copyright (c) 2017-2020 Joe Orton <jorton@redhat.com> @@ -85,7 +85,7 @@ dnl If the API changes incompatibly set LIBAGE back to 0 dnl LIBCURRENT=11 # sync -LIBREVISION=0 # with +LIBREVISION=2 # with LIBAGE=10 # CMakeLists.txt! AC_CONFIG_HEADERS([expat_config.h]) diff --git a/contrib/expat/doc/reference.html b/contrib/expat/doc/reference.html index c2ae9bb71431..2b3bd39580a9 100644 --- a/contrib/expat/doc/reference.html +++ b/contrib/expat/doc/reference.html @@ -14,7 +14,7 @@ Copyright (c) 2000 Clark Cooper <coopercc@users.sourceforge.net> Copyright (c) 2000-2004 Fred L. Drake, Jr. <fdrake@users.sourceforge.net> Copyright (c) 2002-2012 Karl Waclawek <karl@waclawek.net> - Copyright (c) 2017-2024 Sebastian Pipping <sebastian@pipping.org> + Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org> Copyright (c) 2017 Jakub Wilk <jwilk@jwilk.net> Copyright (c) 2021 Tomas Korbar <tkorbar@redhat.com> Copyright (c) 2021 Nicolas Cavallari <nicolas.cavallari@green-communications.fr> @@ -52,7 +52,7 @@ <div> <h1> The Expat XML Parser - <small>Release 2.6.4</small> + <small>Release 2.7.1</small> </h1> </div> <div class="content"> @@ -1267,6 +1267,11 @@ call-backs, except when parsing an external parameter entity and <code>XML_STATUS_ERROR</code> otherwise. The possible error codes are:</p> <dl> + <dt><code>XML_ERROR_NOT_STARTED</code></dt> + <dd> + when stopping or suspending a parser before it has started, + added in Expat 2.6.4. 
+ </dd> <dt><code>XML_ERROR_SUSPENDED</code></dt> <dd>when suspending an already suspended parser.</dd> <dt><code>XML_ERROR_FINISHED</code></dt> diff --git a/contrib/expat/doc/xmlwf.1 b/contrib/expat/doc/xmlwf.1 index 61b302581ce9..76aa7e30d074 100644 --- a/contrib/expat/doc/xmlwf.1 +++ b/contrib/expat/doc/xmlwf.1 @@ -5,7 +5,7 @@ \\$2 \(la\\$1\(ra\\$3 .. .if \n(.g .mso www.tmac -.TH XMLWF 1 "November 6, 2024" "" "" +.TH XMLWF 1 "March 27, 2025" "" "" .SH NAME xmlwf \- Determines if an XML document is well-formed .SH SYNOPSIS diff --git a/contrib/expat/doc/xmlwf.xml b/contrib/expat/doc/xmlwf.xml index cf6d984af463..17e9cf51c191 100644 --- a/contrib/expat/doc/xmlwf.xml +++ b/contrib/expat/doc/xmlwf.xml @@ -9,7 +9,7 @@ Copyright (c) 2001 Scott Bronson <bronson@rinspin.com> Copyright (c) 2002-2003 Fred L. Drake, Jr. <fdrake@users.sourceforge.net> Copyright (c) 2009 Karl Waclawek <karl@waclawek.net> - Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org> + Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org> Copyright (c) 2016 Ardo van Rangelrooij <ardo@debian.org> Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk> Copyright (c) 2020 Joe Orton <jorton@redhat.com> @@ -21,7 +21,7 @@ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ <!ENTITY dhfirstname "<firstname>Scott</firstname>"> <!ENTITY dhsurname "<surname>Bronson</surname>"> - <!ENTITY dhdate "<date>November 6, 2024</date>"> + <!ENTITY dhdate "<date>March 27, 2025</date>"> <!-- Please adjust this^^ date whenever cutting a new release. 
--> <!ENTITY dhsection "<manvolnum>1</manvolnum>"> <!ENTITY dhemail "<email>bronson@rinspin.com</email>"> diff --git a/contrib/expat/fuzz/xml_lpm_fuzzer.cpp b/contrib/expat/fuzz/xml_lpm_fuzzer.cpp new file mode 100644 index 000000000000..f52ea7b21e40 --- /dev/null +++ b/contrib/expat/fuzz/xml_lpm_fuzzer.cpp @@ -0,0 +1,464 @@ +/* + __ __ _ + ___\ \/ /_ __ __ _| |_ + / _ \\ /| '_ \ / _` | __| + | __// \| |_) | (_| | |_ + \___/_/\_\ .__/ \__,_|\__| + |_| XML parser + + Copyright (c) 2022 Mark Brand <markbrand@google.com> + Copyright (c) 2025 Sebastian Pipping <sebastian@pipping.org> + Licensed under the MIT license: + + Permission is hereby granted, free of charge, to any person obtaining + a copy of this software and associated documentation files (the + "Software"), to deal in the Software without restriction, including + without limitation the rights to use, copy, modify, merge, publish, + distribute, sublicense, and/or sell copies of the Software, and to permit + persons to whom the Software is furnished to do so, subject to the + following conditions: + + The above copyright notice and this permission notice shall be included + in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN + NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, + DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE + USE OR OTHER DEALINGS IN THE SOFTWARE. +*/ + +#if defined(NDEBUG) +# undef NDEBUG // because checks below rely on assert(...) 
+#endif + +#include <assert.h> +#include <stdint.h> +#include <vector> + +#include "expat.h" +#include "xml_lpm_fuzzer.pb.h" +#include "src/libfuzzer/libfuzzer_macro.h" + +static const char *g_encoding = nullptr; +static const char *g_external_entity = nullptr; +static size_t g_external_entity_size = 0; + +void +SetEncoding(const xml_lpm_fuzzer::Encoding &e) { + switch (e) { + case xml_lpm_fuzzer::Encoding::UTF8: + g_encoding = "UTF-8"; + break; + + case xml_lpm_fuzzer::Encoding::UTF16: + g_encoding = "UTF-16"; + break; + + case xml_lpm_fuzzer::Encoding::ISO88591: + g_encoding = "ISO-8859-1"; + break; + + case xml_lpm_fuzzer::Encoding::ASCII: + g_encoding = "US-ASCII"; + break; + + case xml_lpm_fuzzer::Encoding::NONE: + g_encoding = NULL; + break; + + default: + g_encoding = "UNKNOWN"; + break; + } +} + +static int g_allocation_count = 0; +static std::vector<int> g_fail_allocations = {}; + +void * +MallocHook(size_t size) { + g_allocation_count += 1; + for (auto index : g_fail_allocations) { + if (index == g_allocation_count) { + return NULL; + } + } + return malloc(size); +} + +void * +ReallocHook(void *ptr, size_t size) { + g_allocation_count += 1; + for (auto index : g_fail_allocations) { + if (index == g_allocation_count) { + return NULL; + } + } + return realloc(ptr, size); +} + +void +FreeHook(void *ptr) { + free(ptr); +} + +XML_Memory_Handling_Suite memory_handling_suite + = {MallocHook, ReallocHook, FreeHook}; + +void InitializeParser(XML_Parser parser); + +// We want a parse function that supports resumption, so that we can cover the +// suspend/resume code. 
+enum XML_Status +Parse(XML_Parser parser, const char *input, int input_len, int is_final) { + enum XML_Status status = XML_Parse(parser, input, input_len, is_final); + while (status == XML_STATUS_SUSPENDED) { + status = XML_ResumeParser(parser); + } + return status; +} + +// When the fuzzer is compiled with instrumentation such as ASan, then the +// accesses in TouchString will fault if they access invalid memory (ie. detect +// either a use-after-free or buffer-overflow). By calling TouchString in each +// of the callbacks, we can check that the arguments meet the API specifications +// in terms of length/null-termination. no_optimize is used to ensure that the +// compiler has to emit actual memory reads, instead of removing them. +static volatile size_t no_optimize = 0; +static void +TouchString(const XML_Char *ptr, int len = -1) { + if (! ptr) { + return; + } + + if (len == -1) { + for (XML_Char value = *ptr++; value; value = *ptr++) { + no_optimize += value; + } + } else { + for (int i = 0; i < len; ++i) { + no_optimize += ptr[i]; + } + } +} + +static void +TouchNodeAndRecurse(XML_Content *content) { + switch (content->type) { + case XML_CTYPE_EMPTY: + case XML_CTYPE_ANY: + assert(content->quant == XML_CQUANT_NONE); + assert(content->name == NULL); + assert(content->numchildren == 0); + assert(content->children == NULL); + break; + + case XML_CTYPE_MIXED: + assert(content->quant == XML_CQUANT_NONE + || content->quant == XML_CQUANT_REP); + assert(content->name == NULL); + for (unsigned int i = 0; i < content->numchildren; ++i) { + assert(content->children[i].type == XML_CTYPE_NAME); + assert(content->children[i].quant == XML_CQUANT_NONE); + assert(content->children[i].numchildren == 0); + assert(content->children[i].children == NULL); + TouchString(content->children[i].name); + } + break; + + case XML_CTYPE_NAME: + assert((content->quant == XML_CQUANT_NONE) + || (content->quant == XML_CQUANT_OPT) + || (content->quant == XML_CQUANT_REP) + || (content->quant == 
XML_CQUANT_PLUS)); + assert(content->numchildren == 0); + assert(content->children == NULL); + TouchString(content->name); + break; + + case XML_CTYPE_CHOICE: + case XML_CTYPE_SEQ: + assert((content->quant == XML_CQUANT_NONE) + || (content->quant == XML_CQUANT_OPT) + || (content->quant == XML_CQUANT_REP) + || (content->quant == XML_CQUANT_PLUS)); + assert(content->name == NULL); + for (unsigned int i = 0; i < content->numchildren; ++i) { + TouchNodeAndRecurse(&content->children[i]); + } + break; + + default: + assert(false); + } +} + +static void XMLCALL +ElementDeclHandler(void *userData, const XML_Char *name, XML_Content *model) { + TouchString(name); + TouchNodeAndRecurse(model); + XML_FreeContentModel((XML_Parser)userData, model); +} + +static void XMLCALL +AttlistDeclHandler(void *userData, const XML_Char *elname, + const XML_Char *attname, const XML_Char *atttype, + const XML_Char *dflt, int isrequired) { + (void)userData; + TouchString(elname); + TouchString(attname); + TouchString(atttype); + TouchString(dflt); + (void)isrequired; +} + +static void XMLCALL +XmlDeclHandler(void *userData, const XML_Char *version, + const XML_Char *encoding, int standalone) { + (void)userData; + TouchString(version); + TouchString(encoding); + (void)standalone; +} + +static void XMLCALL +StartElementHandler(void *userData, const XML_Char *name, + const XML_Char **atts) { + (void)userData; + TouchString(name); + for (size_t i = 0; atts[i] != NULL; ++i) { + TouchString(atts[i]); + } +} + +static void XMLCALL +EndElementHandler(void *userData, const XML_Char *name) { + (void)userData; + TouchString(name); +} + +static void XMLCALL +CharacterDataHandler(void *userData, const XML_Char *s, int len) { + (void)userData; + TouchString(s, len); +} + +static void XMLCALL +ProcessingInstructionHandler(void *userData, const XML_Char *target, + const XML_Char *data) { + (void)userData; + TouchString(target); + TouchString(data); +} + +static void XMLCALL +CommentHandler(void *userData, 
const XML_Char *data) { + TouchString(data); + // Use the comment handler to trigger parser suspend, so that we can get + // coverage of that code. + XML_StopParser((XML_Parser)userData, XML_TRUE); +} + +static void XMLCALL +StartCdataSectionHandler(void *userData) { + (void)userData; +} + +static void XMLCALL +EndCdataSectionHandler(void *userData) { + (void)userData; +} + +static void XMLCALL +DefaultHandler(void *userData, const XML_Char *s, int len) { + (void)userData; + TouchString(s, len); +} + +static void XMLCALL +StartDoctypeDeclHandler(void *userData, const XML_Char *doctypeName, + const XML_Char *sysid, const XML_Char *pubid, + int has_internal_subset) { + (void)userData; + TouchString(doctypeName); + TouchString(sysid); + TouchString(pubid); + (void)has_internal_subset; +} + +static void XMLCALL +EndDoctypeDeclHandler(void *userData) { + (void)userData; +} + +static void XMLCALL +EntityDeclHandler(void *userData, const XML_Char *entityName, + int is_parameter_entity, const XML_Char *value, + int value_length, const XML_Char *base, + const XML_Char *systemId, const XML_Char *publicId, + const XML_Char *notationName) { + (void)userData; + TouchString(entityName); + (void)is_parameter_entity; + TouchString(value, value_length); + TouchString(base); + TouchString(systemId); + TouchString(publicId); + TouchString(notationName); +} + +static void XMLCALL +NotationDeclHandler(void *userData, const XML_Char *notationName, + const XML_Char *base, const XML_Char *systemId, + const XML_Char *publicId) { + (void)userData; + TouchString(notationName); + TouchString(base); + TouchString(systemId); + TouchString(publicId); +} + +static void XMLCALL +StartNamespaceDeclHandler(void *userData, const XML_Char *prefix, + const XML_Char *uri) { + (void)userData; + TouchString(prefix); + TouchString(uri); +} + +static void XMLCALL +EndNamespaceDeclHandler(void *userData, const XML_Char *prefix) { + (void)userData; + TouchString(prefix); +} + +static int XMLCALL 
+NotStandaloneHandler(void *userData) { + (void)userData; + return XML_STATUS_OK; +} + +static int XMLCALL +ExternalEntityRefHandler(XML_Parser parser, const XML_Char *context, + const XML_Char *base, const XML_Char *systemId, + const XML_Char *publicId) { + int rc = XML_STATUS_ERROR; + TouchString(context); + TouchString(base); + TouchString(systemId); + TouchString(publicId); + + if (g_external_entity) { + XML_Parser ext_parser + = XML_ExternalEntityParserCreate(parser, context, g_encoding); + rc = Parse(ext_parser, g_external_entity, g_external_entity_size, 1); + XML_ParserFree(ext_parser); + } + + return rc; +} + +static void XMLCALL +SkippedEntityHandler(void *userData, const XML_Char *entityName, + int is_parameter_entity) { + (void)userData; + TouchString(entityName); + (void)is_parameter_entity; +} + +static int XMLCALL +UnknownEncodingHandler(void *encodingHandlerData, const XML_Char *name, + XML_Encoding *info) { + (void)encodingHandlerData; + TouchString(name); + (void)info; + return XML_STATUS_ERROR; +} + +void +InitializeParser(XML_Parser parser) { + XML_SetUserData(parser, (void *)parser); + XML_SetHashSalt(parser, 0x41414141); + XML_SetParamEntityParsing(parser, XML_PARAM_ENTITY_PARSING_ALWAYS); + + XML_SetElementDeclHandler(parser, ElementDeclHandler); + XML_SetAttlistDeclHandler(parser, AttlistDeclHandler); + XML_SetXmlDeclHandler(parser, XmlDeclHandler); + XML_SetElementHandler(parser, StartElementHandler, EndElementHandler); + XML_SetCharacterDataHandler(parser, CharacterDataHandler); + XML_SetProcessingInstructionHandler(parser, ProcessingInstructionHandler); + XML_SetCommentHandler(parser, CommentHandler); + XML_SetCdataSectionHandler(parser, StartCdataSectionHandler, + EndCdataSectionHandler); + // XML_SetDefaultHandler disables entity expansion + XML_SetDefaultHandlerExpand(parser, DefaultHandler); + XML_SetDoctypeDeclHandler(parser, StartDoctypeDeclHandler, + EndDoctypeDeclHandler); + // Note: This is mutually exclusive with 
XML_SetUnparsedEntityDeclHandler, + // and there isn't any significant code change between the two. + XML_SetEntityDeclHandler(parser, EntityDeclHandler); + XML_SetNotationDeclHandler(parser, NotationDeclHandler); + XML_SetNamespaceDeclHandler(parser, StartNamespaceDeclHandler, + EndNamespaceDeclHandler); + XML_SetNotStandaloneHandler(parser, NotStandaloneHandler); + XML_SetExternalEntityRefHandler(parser, ExternalEntityRefHandler); + XML_SetSkippedEntityHandler(parser, SkippedEntityHandler); + XML_SetUnknownEncodingHandler(parser, UnknownEncodingHandler, (void *)parser); +} + +DEFINE_TEXT_PROTO_FUZZER(const xml_lpm_fuzzer::Testcase &testcase) { + g_external_entity = nullptr; + + if (! testcase.actions_size()) { + return; + } + + g_allocation_count = 0; + g_fail_allocations.clear(); + for (int i = 0; i < testcase.fail_allocations_size(); ++i) { + g_fail_allocations.push_back(testcase.fail_allocations(i)); + } + + SetEncoding(testcase.encoding()); + XML_Parser parser + = XML_ParserCreate_MM(g_encoding, &memory_handling_suite, "|"); + InitializeParser(parser); + + for (int i = 0; i < testcase.actions_size(); ++i) { + const auto &action = testcase.actions(i); + switch (action.action_case()) { + case xml_lpm_fuzzer::Action::kChunk: + if (XML_STATUS_ERROR + == Parse(parser, action.chunk().data(), action.chunk().size(), 0)) { + // Force a reset after parse error. 
+ XML_ParserReset(parser, g_encoding); + InitializeParser(parser); + } + break; + + case xml_lpm_fuzzer::Action::kLastChunk: + Parse(parser, action.last_chunk().data(), action.last_chunk().size(), 1); + XML_ParserReset(parser, g_encoding); + InitializeParser(parser); + break; + + case xml_lpm_fuzzer::Action::kReset: + XML_ParserReset(parser, g_encoding); + InitializeParser(parser); + break; + + case xml_lpm_fuzzer::Action::kExternalEntity: + g_external_entity = action.external_entity().data(); + g_external_entity_size = action.external_entity().size(); + break; + + default: + break; + } + } + + XML_ParserFree(parser); +} diff --git a/contrib/expat/fuzz/xml_lpm_fuzzer.proto b/contrib/expat/fuzz/xml_lpm_fuzzer.proto new file mode 100644 index 000000000000..ddc4e958b919 --- /dev/null +++ b/contrib/expat/fuzz/xml_lpm_fuzzer.proto @@ -0,0 +1,58 @@ +/* + __ __ _ + ___\ \/ /_ __ __ _| |_ + / _ \\ /| '_ \ / _` | __| + | __// \| |_) | (_| | |_ + \___/_/\_\ .__/ \__,_|\__| + |_| XML parser + + Copyright (c) 2022 Mark Brand <markbrand@google.com> + Copyright (c) 2025 Sebastian Pipping <sebastian@pipping.org> + Licensed under the MIT license: + + Permission is hereby granted, free of charge, to any person obtaining + a copy of this software and associated documentation files (the + "Software"), to deal in the Software without restriction, including + without limitation the rights to use, copy, modify, merge, publish, + distribute, sublicense, and/or sell copies of the Software, and to permit + persons to whom the Software is furnished to do so, subject to the + following conditions: + + The above copyright notice and this permission notice shall be included + in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN + NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, + DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE + USE OR OTHER DEALINGS IN THE SOFTWARE. +*/ + +syntax = "proto2"; +package xml_lpm_fuzzer; + +enum Encoding { + UTF8 = 0; + UTF16 = 1; + ISO88591 = 2; + ASCII = 3; + UNKNOWN = 4; + NONE = 5; +} + +message Action { + oneof action { + string chunk = 1; + string last_chunk = 2; + bool reset = 3; + string external_entity = 4; + } +} + +message Testcase { + required Encoding encoding = 1; + repeated Action actions = 2; + repeated int32 fail_allocations = 3; +} diff --git a/contrib/expat/fuzz/xml_parse_fuzzer.c b/contrib/expat/fuzz/xml_parse_fuzzer.c index a7e8414ce355..6a1affe2b1f6 100644 --- a/contrib/expat/fuzz/xml_parse_fuzzer.c +++ b/contrib/expat/fuzz/xml_parse_fuzzer.c @@ -5,7 +5,7 @@ * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, diff --git a/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c b/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c index 0327aa9f952e..cfc4af202851 100644 --- a/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c +++ b/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c @@ -5,7 +5,7 @@ * you may not use this file except in compliance with the License. 
* You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, diff --git a/contrib/expat/lib/expat.h b/contrib/expat/lib/expat.h index 523b37d8d578..610e1ddc0e94 100644 *** 2251 LINES SKIPPED ***
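
Note on the version info bumps quoted in the Changes file above (10:3:9 to 11:0:10, then 11:1:10, then 11:2:10): the three numbers are a libtool CURRENT:REVISION:AGE triple, and the file names given next to them are consistent with the Linux-style mapping (CURRENT - AGE).AGE.REVISION. The following is a minimal sketch of that arithmetic only. The helper name format_so_version is hypothetical, it is not part of Expat or libtool, and other platforms may derive the suffix differently.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not an Expat or libtool API): format the
 * Linux-style shared-object version suffix for a libtool
 * CURRENT:REVISION:AGE triple, i.e. "(current - age).age.revision".
 *
 * Returns the number of characters written, or -1 if the triple is
 * invalid (age > current) or the buffer is too small.
 */
static int
format_so_version(char *buf, size_t size, unsigned current,
                  unsigned revision, unsigned age) {
  assert(buf != NULL && size > 0);
  if (age > current) {
    return -1; /* libtool requires AGE <= CURRENT */
  }
  int n = snprintf(buf, size, "%u.%u.%u", current - age, age, revision);
  if (n < 0 || (size_t)n >= size) {
    return -1; /* encoding error or truncated output */
  }
  return n;
}
```

Under this assumption, the triple 11:2:10 from configure.ac (LIBCURRENT=11, LIBREVISION=2, LIBAGE=10) formats as 1.10.2, matching the libexpat*.so.1.10.2 name quoted for release 2.7.1.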
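
Note on the allocation hooks in the new fuzz/xml_lpm_fuzzer.cpp: MallocHook and ReallocHook count allocations and return NULL for the indices listed in the test case, which exercises the parser's out-of-memory paths. The following is a rough standalone sketch of the same idea in plain C. All names here (fail_plan_reset, failing_malloc, MAX_FAILS) are hypothetical and the fixed-size plan array is a simplification of this sketch, not something taken from the fuzzer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: this mirrors the idea behind the
 * fuzzer's MallocHook, not its actual code. */
#define MAX_FAILS 8

static int alloc_count = 0;    /* allocations attempted so far */
static int fail_at[MAX_FAILS]; /* 1-based allocation indices that must fail */
static size_t fail_at_len = 0;

/* Start a new run: reset the counter and install the list of
 * allocation indices that should report failure. Entries beyond
 * MAX_FAILS are ignored. */
static void
fail_plan_reset(const int *indices, size_t n) {
  assert(indices != NULL || n == 0);
  alloc_count = 0;
  fail_at_len = n < MAX_FAILS ? n : MAX_FAILS;
  for (size_t i = 0; i < fail_at_len; ++i) {
    fail_at[i] = indices[i];
  }
}

/* malloc replacement: returns NULL when the current allocation index
 * is in the plan, otherwise forwards to malloc. */
static void *
failing_malloc(size_t size) {
  alloc_count += 1;
  for (size_t i = 0; i < fail_at_len; ++i) {
    if (fail_at[i] == alloc_count) {
      return NULL; /* simulated out-of-memory */
    }
  }
  return malloc(size);
}
```

In the fuzzer, hooks of this kind are placed in an XML_Memory_Handling_Suite and handed to XML_ParserCreate_MM, so every allocation the parser makes goes through the counter.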