svn commit: r339606 - in head: lib/libzstd sys/conf sys/contrib/zstd sys/contrib/zstd/contrib/gen_html sys/contrib/zstd/contrib/meson sys/contrib/zstd/contrib/pzstd sys/contrib/zstd/contrib/seekabl...
Rodney W. Grimes
freebsd at pdx.rh.CN85.dnsmgr.net
Mon Oct 22 18:56:48 UTC 2018
> Author: cem
> Date: Mon Oct 22 18:29:12 2018
> New Revision: 339606
> URL: https://svnweb.freebsd.org/changeset/base/339606
>
> Log:
> Update to Zstandard 1.3.7
>
> Relnotes: yes
> Sponsored by: Dell EMC Isilon
MFC after: 1+month?
> Added:
> head/sys/contrib/zstd/doc/images/cdict_v136.png (contents, props changed)
> head/sys/contrib/zstd/doc/images/zstd_cdict_v1_3_5.png (contents, props changed)
> head/sys/contrib/zstd/lib/common/debug.c (contents, props changed)
> head/sys/contrib/zstd/lib/common/debug.h (contents, props changed)
> head/sys/contrib/zstd/lib/compress/hist.c (contents, props changed)
> head/sys/contrib/zstd/lib/compress/hist.h (contents, props changed)
> head/sys/contrib/zstd/lib/dictBuilder/cover.h (contents, props changed)
> head/sys/contrib/zstd/lib/dictBuilder/fastcover.c (contents, props changed)
> head/sys/contrib/zstd/programs/zstdgrep.1 (contents, props changed)
> head/sys/contrib/zstd/programs/zstdgrep.1.md
> head/sys/contrib/zstd/programs/zstdless.1 (contents, props changed)
> head/sys/contrib/zstd/programs/zstdless.1.md
> head/sys/contrib/zstd/tests/libzstd_partial_builds.sh (contents, props changed)
> head/sys/contrib/zstd/tests/rateLimiter.py (contents, props changed)
> Deleted:
> head/sys/contrib/zstd/circle.yml
> head/sys/contrib/zstd/tests/namespaceTest.c
> Modified:
> head/lib/libzstd/Makefile
> head/sys/conf/files
> head/sys/conf/files.sparc64
> head/sys/contrib/zstd/.gitattributes
> head/sys/contrib/zstd/Makefile
> head/sys/contrib/zstd/NEWS
> head/sys/contrib/zstd/README.md
> head/sys/contrib/zstd/TESTING.md
> head/sys/contrib/zstd/appveyor.yml
> head/sys/contrib/zstd/contrib/gen_html/Makefile
> head/sys/contrib/zstd/contrib/meson/meson.build
> head/sys/contrib/zstd/contrib/pzstd/Makefile
> head/sys/contrib/zstd/contrib/pzstd/Options.cpp
> head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp
> head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile
> head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c
> head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c
> head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h
> head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c
> head/sys/contrib/zstd/doc/zstd_compression_format.md
> head/sys/contrib/zstd/doc/zstd_manual.html
> head/sys/contrib/zstd/lib/BUCK
> head/sys/contrib/zstd/lib/Makefile
> head/sys/contrib/zstd/lib/README.md
> head/sys/contrib/zstd/lib/common/bitstream.h
> head/sys/contrib/zstd/lib/common/compiler.h
> head/sys/contrib/zstd/lib/common/cpu.h
> head/sys/contrib/zstd/lib/common/entropy_common.c
> head/sys/contrib/zstd/lib/common/fse.h
> head/sys/contrib/zstd/lib/common/fse_decompress.c
> head/sys/contrib/zstd/lib/common/huf.h
> head/sys/contrib/zstd/lib/common/mem.h
> head/sys/contrib/zstd/lib/common/pool.c
> head/sys/contrib/zstd/lib/common/pool.h
> head/sys/contrib/zstd/lib/common/xxhash.c
> head/sys/contrib/zstd/lib/common/zstd_common.c
> head/sys/contrib/zstd/lib/common/zstd_internal.h
> head/sys/contrib/zstd/lib/compress/fse_compress.c
> head/sys/contrib/zstd/lib/compress/huf_compress.c
> head/sys/contrib/zstd/lib/compress/zstd_compress.c
> head/sys/contrib/zstd/lib/compress/zstd_compress_internal.h
> head/sys/contrib/zstd/lib/compress/zstd_double_fast.c
> head/sys/contrib/zstd/lib/compress/zstd_double_fast.h
> head/sys/contrib/zstd/lib/compress/zstd_fast.c
> head/sys/contrib/zstd/lib/compress/zstd_fast.h
> head/sys/contrib/zstd/lib/compress/zstd_lazy.c
> head/sys/contrib/zstd/lib/compress/zstd_lazy.h
> head/sys/contrib/zstd/lib/compress/zstd_ldm.c
> head/sys/contrib/zstd/lib/compress/zstd_ldm.h
> head/sys/contrib/zstd/lib/compress/zstd_opt.c
> head/sys/contrib/zstd/lib/compress/zstd_opt.h
> head/sys/contrib/zstd/lib/compress/zstdmt_compress.c
> head/sys/contrib/zstd/lib/compress/zstdmt_compress.h
> head/sys/contrib/zstd/lib/decompress/huf_decompress.c
> head/sys/contrib/zstd/lib/decompress/zstd_decompress.c
> head/sys/contrib/zstd/lib/dictBuilder/cover.c
> head/sys/contrib/zstd/lib/dictBuilder/divsufsort.c
> head/sys/contrib/zstd/lib/dictBuilder/zdict.c
> head/sys/contrib/zstd/lib/dictBuilder/zdict.h
> head/sys/contrib/zstd/lib/freebsd/zstd_kmalloc.c
> head/sys/contrib/zstd/lib/legacy/zstd_v01.c
> head/sys/contrib/zstd/lib/legacy/zstd_v02.c
> head/sys/contrib/zstd/lib/legacy/zstd_v03.c
> head/sys/contrib/zstd/lib/legacy/zstd_v04.c
> head/sys/contrib/zstd/lib/legacy/zstd_v05.c
> head/sys/contrib/zstd/lib/legacy/zstd_v06.c
> head/sys/contrib/zstd/lib/legacy/zstd_v07.c
> head/sys/contrib/zstd/lib/zstd.h
> head/sys/contrib/zstd/programs/Makefile
> head/sys/contrib/zstd/programs/README.md
> head/sys/contrib/zstd/programs/bench.c
> head/sys/contrib/zstd/programs/bench.h
> head/sys/contrib/zstd/programs/datagen.c
> head/sys/contrib/zstd/programs/dibio.c
> head/sys/contrib/zstd/programs/dibio.h
> head/sys/contrib/zstd/programs/fileio.c
> head/sys/contrib/zstd/programs/fileio.h
> head/sys/contrib/zstd/programs/platform.h
> head/sys/contrib/zstd/programs/util.h
> head/sys/contrib/zstd/programs/zstd.1
> head/sys/contrib/zstd/programs/zstd.1.md
> head/sys/contrib/zstd/programs/zstdcli.c
> head/sys/contrib/zstd/tests/.gitignore
> head/sys/contrib/zstd/tests/Makefile
> head/sys/contrib/zstd/tests/README.md
> head/sys/contrib/zstd/tests/decodecorpus.c
> head/sys/contrib/zstd/tests/fullbench.c
> head/sys/contrib/zstd/tests/fuzz/fuzz.h
> head/sys/contrib/zstd/tests/fuzz/fuzz.py
> head/sys/contrib/zstd/tests/fuzz/regression_driver.c
> head/sys/contrib/zstd/tests/fuzz/zstd_helpers.c
> head/sys/contrib/zstd/tests/fuzzer.c
> head/sys/contrib/zstd/tests/gzip/Makefile
> head/sys/contrib/zstd/tests/legacy.c
> head/sys/contrib/zstd/tests/longmatch.c
> head/sys/contrib/zstd/tests/paramgrill.c
> head/sys/contrib/zstd/tests/playTests.sh
> head/sys/contrib/zstd/tests/poolTests.c
> head/sys/contrib/zstd/tests/roundTripCrash.c
> head/sys/contrib/zstd/tests/symbols.c
> head/sys/contrib/zstd/tests/test-zstd-versions.py
> head/sys/contrib/zstd/tests/zstreamtest.c
> head/sys/contrib/zstd/zlibWrapper/examples/minigzip.c
> head/sys/contrib/zstd/zlibWrapper/examples/zwrapbench.c
> head/sys/contrib/zstd/zlibWrapper/gzguts.h
> head/sys/contrib/zstd/zlibWrapper/gzlib.c
> head/sys/contrib/zstd/zlibWrapper/gzwrite.c
>
> Modified: head/lib/libzstd/Makefile
> ==============================================================================
> --- head/lib/libzstd/Makefile Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/lib/libzstd/Makefile Mon Oct 22 18:29:12 2018 (r339606)
> @@ -24,7 +24,10 @@ SRCS= entropy_common.c \
> zstd_lazy.c \
> zstd_ldm.c \
> zstd_opt.c \
> - zstd_double_fast.c
> + zstd_double_fast.c \
> + debug.c \
> + hist.c \
> + fastcover.c
> WARNS= 2
> INCS= zstd.h
> CFLAGS+= -I${ZSTDDIR}/lib -I${ZSTDDIR}/lib/common -DXXH_NAMESPACE=ZSTD_ \
>
> Modified: head/sys/conf/files
> ==============================================================================
> --- head/sys/conf/files Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/conf/files Mon Oct 22 18:29:12 2018 (r339606)
> @@ -645,6 +645,7 @@ contrib/zstd/lib/common/error_private.c optional zstd
> contrib/zstd/lib/common/xxhash.c optional zstdio compile-with ${ZSTD_C}
> contrib/zstd/lib/compress/zstd_compress.c optional zstdio compile-with ${ZSTD_C}
> contrib/zstd/lib/compress/fse_compress.c optional zstdio compile-with ${ZSTD_C}
> +contrib/zstd/lib/compress/hist.c optional zstdio compile-with ${ZSTD_C}
> contrib/zstd/lib/compress/huf_compress.c optional zstdio compile-with ${ZSTD_C}
> contrib/zstd/lib/compress/zstd_double_fast.c optional zstdio compile-with ${ZSTD_C}
> contrib/zstd/lib/compress/zstd_fast.c optional zstdio compile-with ${ZSTD_C}
>
> Modified: head/sys/conf/files.sparc64
> ==============================================================================
> --- head/sys/conf/files.sparc64 Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/conf/files.sparc64 Mon Oct 22 18:29:12 2018 (r339606)
> @@ -149,3 +149,6 @@ sparc64/sparc64/uio_machdep.c standard
> sparc64/sparc64/upa.c optional creator
> sparc64/sparc64/vm_machdep.c standard
> sparc64/sparc64/zeus.c standard
> +
> +# Zstd
> +contrib/zstd/lib/freebsd/zstd_kfreebsd.c optional zstdio compile-with ${ZSTD_C}
>
> Modified: head/sys/contrib/zstd/.gitattributes
> ==============================================================================
> --- head/sys/contrib/zstd/.gitattributes Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/.gitattributes Mon Oct 22 18:29:12 2018 (r339606)
> @@ -19,6 +19,3 @@
> # Windows
> *.bat text eol=crlf
> *.cmd text eol=crlf
> -
> -# .travis.yml merging
> -.travis.yml merge=ours
>
> Modified: head/sys/contrib/zstd/Makefile
> ==============================================================================
> --- head/sys/contrib/zstd/Makefile Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/Makefile Mon Oct 22 18:29:12 2018 (r339606)
> @@ -23,20 +23,19 @@ else
> EXT =
> endif
>
> +## default: Build lib-release and zstd-release
> .PHONY: default
> default: lib-release zstd-release
>
> .PHONY: all
> -all: | allmost examples manual contrib
> +all: allmost examples manual contrib
>
> .PHONY: allmost
> -allmost: allzstd
> - $(MAKE) -C $(ZWRAPDIR) all
> +allmost: allzstd zlibwrapper
>
> -#skip zwrapper, can't build that on alternate architectures without the proper zlib installed
> +# skip zwrapper, can't build that on alternate architectures without the proper zlib installed
> .PHONY: allzstd
> -allzstd:
> - $(MAKE) -C $(ZSTDDIR) all
> +allzstd: lib
> $(MAKE) -C $(PRGDIR) all
> $(MAKE) -C $(TESTDIR) all
>
> @@ -45,58 +44,62 @@ all32:
> $(MAKE) -C $(PRGDIR) zstd32
> $(MAKE) -C $(TESTDIR) all32
>
> -.PHONY: lib
> -lib:
> +.PHONY: lib lib-release libzstd.a
> +lib lib-release :
> @$(MAKE) -C $(ZSTDDIR) $@
>
> -.PHONY: lib-release
> -lib-release:
> - @$(MAKE) -C $(ZSTDDIR)
> -
> -.PHONY: zstd
> -zstd:
> +.PHONY: zstd zstd-release
> +zstd zstd-release:
> @$(MAKE) -C $(PRGDIR) $@
> cp $(PRGDIR)/zstd$(EXT) .
>
> -.PHONY: zstd-release
> -zstd-release:
> - @$(MAKE) -C $(PRGDIR)
> - cp $(PRGDIR)/zstd$(EXT) .
> -
> .PHONY: zstdmt
> zstdmt:
> @$(MAKE) -C $(PRGDIR) $@
> cp $(PRGDIR)/zstd$(EXT) ./zstdmt$(EXT)
>
> .PHONY: zlibwrapper
> -zlibwrapper:
> - $(MAKE) -C $(ZWRAPDIR) test
> +zlibwrapper: lib
> + $(MAKE) -C $(ZWRAPDIR) all
>
> +## test: run long-duration tests
> .PHONY: test
> +test: MOREFLAGS += -g -DDEBUGLEVEL=1 -Werror
> test:
> - $(MAKE) -C $(PRGDIR) allVariants MOREFLAGS+="-g -DZSTD_DEBUG=1"
> + MOREFLAGS="$(MOREFLAGS)" $(MAKE) -j -C $(PRGDIR) allVariants
> $(MAKE) -C $(TESTDIR) $@
>
> +## shortest: same as `make check`
> .PHONY: shortest
> shortest:
> $(MAKE) -C $(TESTDIR) $@
>
> +## check: run basic tests for `zstd` cli
> .PHONY: check
> check: shortest
>
> +## examples: build all examples in `/examples` directory
> .PHONY: examples
> -examples:
> +examples: lib
> CPPFLAGS=-I../lib LDFLAGS=-L../lib $(MAKE) -C examples/ all
>
> +## manual: generate API documentation in html format
> .PHONY: manual
> manual:
> $(MAKE) -C contrib/gen_html $@
>
> +## man: generate man page
> +.PHONY: man
> +man:
> + $(MAKE) -C programs $@
> +
> +## contrib: build all supported projects in `/contrib` directory
> .PHONY: contrib
> contrib: lib
> $(MAKE) -C contrib/pzstd all
> $(MAKE) -C contrib/seekable_format/examples all
> $(MAKE) -C contrib/adaptive-compression all
> + $(MAKE) -C contrib/largeNbDicts all
>
> .PHONY: cleanTabs
> cleanTabs:
> @@ -113,21 +116,39 @@ clean:
> @$(MAKE) -C contrib/pzstd $@ > $(VOID)
> @$(MAKE) -C contrib/seekable_format/examples $@ > $(VOID)
> @$(MAKE) -C contrib/adaptive-compression $@ > $(VOID)
> + @$(MAKE) -C contrib/largeNbDicts $@ > $(VOID)
> @$(RM) zstd$(EXT) zstdmt$(EXT) tmp*
> @$(RM) -r lz4
> @echo Cleaning completed
>
> #------------------------------------------------------------------------------
> -# make install is validated only for Linux, OSX, Hurd and some BSD targets
> +# make install is validated only for Linux, macOS, Hurd and some BSD targets
> #------------------------------------------------------------------------------
> -ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU FreeBSD DragonFly NetBSD MSYS_NT))
> +ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU OpenBSD FreeBSD DragonFly NetBSD MSYS_NT Haiku))
>
> HOST_OS = POSIX
> -CMAKE_PARAMS = -DZSTD_BUILD_CONTRIB:BOOL=ON -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON -DZSTD_ZLIB_SUPPORT:BOOL=ON -DZSTD_LZMA_SUPPORT:BOOL=ON
> +CMAKE_PARAMS = -DZSTD_BUILD_CONTRIB:BOOL=ON -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON -DZSTD_ZLIB_SUPPORT:BOOL=ON -DZSTD_LZMA_SUPPORT:BOOL=ON -DCMAKE_BUILD_TYPE=Release
>
> +EGREP = egrep --color=never
> +
> +# Print a two column output of targets and their description. To add a target description, put a
> +# comment in the Makefile with the format "## <TARGET>: <DESCRIPTION>". For example:
> +#
> +## list: Print all targets and their descriptions (if provided)
> .PHONY: list
> list:
> - @$(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' | sort | egrep -v -e '^[^[:alnum:]]' -e '^$@$$' | xargs
> + @TARGETS=$$($(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null \
> + | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' \
> + | $(EGREP) -v -e '^[^[:alnum:]]' | sort); \
> + { \
> + printf "Target Name\tDescription\n"; \
> + printf "%0.s-" {1..16}; printf "\t"; printf "%0.s-" {1..40}; printf "\n"; \
> + for target in $$TARGETS; do \
> + line=$$($(EGREP) "^##[[:space:]]+$$target:" $(lastword $(MAKEFILE_LIST))); \
> + description=$$(echo $$line | awk '{i=index($$0,":"); print substr($$0,i+1)}' | xargs); \
> + printf "$$target\t$$description\n"; \
> + done \
> + } | column -t -s $$'\t'
>
> .PHONY: install clangtest armtest usan asan uasan
> install:
> @@ -183,6 +204,7 @@ armfuzz: clean
> CC=arm-linux-gnueabi-gcc QEMU_SYS=qemu-arm-static MOREFLAGS="-static" FUZZER_FLAGS=--no-big-tests $(MAKE) -C $(TESTDIR) fuzztest
>
> aarch64fuzz: clean
> + ld -v
> CC=aarch64-linux-gnu-gcc QEMU_SYS=qemu-aarch64-static MOREFLAGS="-static" FUZZER_FLAGS=--no-big-tests $(MAKE) -C $(TESTDIR) fuzztest
>
> ppcfuzz: clean
> @@ -206,7 +228,7 @@ gcc6test: clean
>
> clangtest: clean
> clang -v
> - $(MAKE) all CXX=clang-++ CC=clang MOREFLAGS="-Werror -Wconversion -Wno-sign-conversion -Wdocumentation"
> + $(MAKE) all CXX=clang++ CC=clang MOREFLAGS="-Werror -Wconversion -Wno-sign-conversion -Wdocumentation"
>
> armtest: clean
> $(MAKE) -C $(TESTDIR) datagen # use native, faster
> @@ -295,6 +317,9 @@ gcc6install: apt-add-repo
> gcc7install: apt-add-repo
> APT_PACKAGES="libc6-dev-i386 gcc-multilib gcc-7 gcc-7-multilib" $(MAKE) apt-install
>
> +gcc8install: apt-add-repo
> + APT_PACKAGES="libc6-dev-i386 gcc-multilib gcc-8 gcc-8-multilib" $(MAKE) apt-install
> +
> gpp6install: apt-add-repo
> APT_PACKAGES="libc6-dev-i386 g++-multilib gcc-6 g++-6 g++-6-multilib" $(MAKE) apt-install
>
> @@ -326,23 +351,23 @@ cmakebuild:
>
> c90build: clean
> $(CC) -v
> - CFLAGS="-std=c90" $(MAKE) allmost # will fail, due to missing support for `long long`
> + CFLAGS="-std=c90 -Werror" $(MAKE) allmost # will fail, due to missing support for `long long`
>
> gnu90build: clean
> $(CC) -v
> - CFLAGS="-std=gnu90" $(MAKE) allmost
> + CFLAGS="-std=gnu90 -Werror" $(MAKE) allmost
>
> c99build: clean
> $(CC) -v
> - CFLAGS="-std=c99" $(MAKE) allmost
> + CFLAGS="-std=c99 -Werror" $(MAKE) allmost
>
> gnu99build: clean
> $(CC) -v
> - CFLAGS="-std=gnu99" $(MAKE) allmost
> + CFLAGS="-std=gnu99 -Werror" $(MAKE) allmost
>
> c11build: clean
> $(CC) -v
> - CFLAGS="-std=c11" $(MAKE) allmost
> + CFLAGS="-std=c11 -Werror" $(MAKE) allmost
>
> bmix64build: clean
> $(CC) -v
> @@ -356,7 +381,10 @@ bmi32build: clean
> $(CC) -v
> CFLAGS="-O3 -mbmi -m32 -Werror" $(MAKE) -C $(TESTDIR) test
>
> -staticAnalyze: clean
> +# static analyzer test uses clang's scan-build
> +# does not analyze zlibWrapper, due to detected issues in zlib source code
> +staticAnalyze: SCANBUILD ?= scan-build
> +staticAnalyze:
> $(CC) -v
> - CPPFLAGS=-g scan-build --status-bugs -v $(MAKE) all
> + CC=$(CC) CPPFLAGS=-g $(SCANBUILD) --status-bugs -v $(MAKE) allzstd examples contrib
> endif
>
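The reworked `list` target depends on comments of the form "## <TARGET>: <DESCRIPTION>", which it pulls apart with egrep and awk. To make the expected format concrete, here is a minimal sketch of the same split in C. The function name `parse_target_doc` and its buffer-based interface are made up for illustration and are not part of zstd.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of zstd): split a Makefile comment of the
 * form "## <TARGET>: <DESCRIPTION>" into its two fields.
 * Returns 1 on success, 0 if the line does not match or a buffer is too
 * small. */
static int parse_target_doc(const char* line,
                            char* target, size_t targetCap,
                            char* desc, size_t descCap)
{
    const char* p = line;
    const char* colon;
    size_t len;

    if (strncmp(p, "##", 2) != 0) return 0;
    p += 2;
    /* The Makefile's pattern requires at least one blank after "##". */
    if (!isspace((unsigned char)*p)) return 0;
    while (isspace((unsigned char)*p)) p++;

    /* The target name runs up to the first ':'. */
    colon = strchr(p, ':');
    if (colon == NULL || colon == p) return 0;
    len = (size_t)(colon - p);
    if (len >= targetCap) return 0;
    memcpy(target, p, len);
    target[len] = '\0';

    /* The description is the rest of the line, trimmed on both sides. */
    p = colon + 1;
    while (isspace((unsigned char)*p)) p++;
    len = strlen(p);
    while (len > 0 && isspace((unsigned char)p[len - 1])) len--;
    if (len >= descCap) return 0;
    memcpy(desc, p, len);
    desc[len] = '\0';
    return 1;
}
```

A line with a single "#", such as the "# skip zwrapper" comment above, is rejected, which matches the intent that only "##" comments are treated as target descriptions.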
> Modified: head/sys/contrib/zstd/NEWS
> ==============================================================================
> --- head/sys/contrib/zstd/NEWS Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/NEWS Mon Oct 22 18:29:12 2018 (r339606)
> @@ -1,3 +1,39 @@
> +v1.3.7
> +perf: slightly better decompression speed on clang (depending on hardware target)
> +fix : performance of dictionary compression for small input < 4 KB at levels 9 and 10
> +build: no longer build backtrace by default in release mode; restrict further automatic mode
> +build: control backtrace support through build macro BACKTRACE
> +misc: added man pages for zstdless and zstdgrep, by @samrussell
> +
> +v1.3.6
> +perf: much faster dictionary builder, by @jenniferliu
> +perf: faster dictionary compression on small data when using multiple contexts, by @felixhandte
> +perf: faster dictionary decompression when using a very large number of dictionaries simultaneously
> +cli : fix : does no longer overwrite destination when source does not exist (#1082)
> +cli : new command --adapt, for automatic compression level adaptation
> +api : fix : block api can be streamed with > 4 GB, reported by @catid
> +api : reduced ZSTD_DDict size by 2 KB
> +api : minimum negative compression level is defined, and can be queried using ZSTD_minCLevel().
> +build: support Haiku target, by @korli
> +build: Read Legacy format is limited to v0.5+ by default. Can be changed at compile time with macro ZSTD_LEGACY_SUPPORT.
> +doc : zstd_compression_format.md updated to match wording in IETF RFC 8478
> +misc: tests/paramgrill, a parameter optimizer, by @GeorgeLu97
> +
> +v1.3.5
> +perf: much faster dictionary compression, by @felixhandte
> +perf: small quality improvement for dictionary generation, by @terrelln
> +perf: slightly improved high compression levels (notably level 19)
> +mem : automatic memory release for long duration contexts
> +cli : fix : overlapLog can be manually set
> +cli : fix : decoding invalid lz4 frames
> +api : fix : performance degradation for dictionary compression when using advanced API, by @terrelln
> +api : change : clarify ZSTD_CCtx_reset() vs ZSTD_CCtx_resetParameters(), by @terrelln
> +build: select custom libzstd scope through control macros, by @GeorgeLu97
> +build: OpenBSD patch, by @bket
> +build: make and make all are compatible with -j
> +doc : clarify zstd_compression_format.md, updated for IETF RFC process
> +misc: pzstd compatible with reproducible compilation, by @lamby
> +
> v1.3.4
> perf: faster speed (especially decoding speed) on recent cpus (haswell+)
> perf: much better performance associating --long with multi-threading, by @terrelln
>
> Modified: head/sys/contrib/zstd/README.md
> ==============================================================================
> --- head/sys/contrib/zstd/README.md Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/README.md Mon Oct 22 18:29:12 2018 (r339606)
> @@ -4,7 +4,7 @@ __Zstandard__, or `zstd` as short version, is a fast l
> targeting real-time compression scenarios at zlib-level and better compression ratios.
> It's backed by a very fast entropy stage, provided by [Huff0 and FSE library](https://github.com/Cyan4973/FiniteStateEntropy).
>
> -The project is provided as an open-source BSD-licensed **C** library,
> +The project is provided as an open-source dual [BSD](LICENSE) and [GPLv2](COPYING) licensed **C** library,
> and a command line utility producing and decoding `.zst`, `.gz`, `.xz` and `.lz4` files.
> Should your project require another programming language,
> a list of known ports and bindings is provided on [Zstandard homepage](http://www.zstd.net/#other-languages).
> @@ -120,6 +120,8 @@ Other available options include:
> A `cmake` project generator is provided within `build/cmake`.
> It can generate Makefiles or other build scripts
> to create `zstd` binary, and `libzstd` dynamic and static libraries.
> +
> +By default, `CMAKE_BUILD_TYPE` is set to `Release`.
>
> #### Meson
>
>
> Modified: head/sys/contrib/zstd/TESTING.md
> ==============================================================================
> --- head/sys/contrib/zstd/TESTING.md Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/TESTING.md Mon Oct 22 18:29:12 2018 (r339606)
> @@ -41,4 +41,4 @@ They consist of the following tests:
> - `pzstd` with asan and tsan, as well as in 32-bits mode
> - Testing `zstd` with legacy mode off
> - Testing `zbuff` (old streaming API)
> -- Entire test suite and make install on OS X
> +- Entire test suite and make install on macOS
>
> Modified: head/sys/contrib/zstd/appveyor.yml
> ==============================================================================
> --- head/sys/contrib/zstd/appveyor.yml Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/appveyor.yml Mon Oct 22 18:29:12 2018 (r339606)
> @@ -181,15 +181,15 @@
> - COMPILER: "gcc"
> HOST: "mingw"
> PLATFORM: "x64"
> - SCRIPT: "make allzstd"
> + SCRIPT: "CPPFLAGS=-DDEBUGLEVEL=2 CFLAGS=-Werror make -j allzstd DEBUGLEVEL=2"
> - COMPILER: "gcc"
> HOST: "mingw"
> PLATFORM: "x86"
> - SCRIPT: "make allzstd"
> + SCRIPT: "CFLAGS=-Werror make -j allzstd"
> - COMPILER: "clang"
> HOST: "mingw"
> PLATFORM: "x64"
> - SCRIPT: "MOREFLAGS='--target=x86_64-w64-mingw32 -Werror -Wconversion -Wno-sign-conversion' make allzstd"
> + SCRIPT: "CFLAGS='--target=x86_64-w64-mingw32 -Werror -Wconversion -Wno-sign-conversion' make -j allzstd"
>
> - COMPILER: "visual"
> HOST: "visual"
>
> Modified: head/sys/contrib/zstd/contrib/gen_html/Makefile
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/gen_html/Makefile Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/gen_html/Makefile Mon Oct 22 18:29:12 2018 (r339606)
> @@ -10,7 +10,7 @@
> CXXFLAGS ?= -O3
> CXXFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wno-comment
> CXXFLAGS += $(MOREFLAGS)
> -FLAGS = $(CPPFLAGS) $(CXXFLAGS) $(CXXFLAGS) $(LDFLAGS)
> +FLAGS = $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS)
>
> ZSTDAPI = ../../lib/zstd.h
> ZSTDMANUAL = ../../doc/zstd_manual.html
>
> Modified: head/sys/contrib/zstd/contrib/meson/meson.build
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/meson/meson.build Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/meson/meson.build Mon Oct 22 18:29:12 2018 (r339606)
> @@ -18,6 +18,7 @@ libzstd_srcs = [
> join_paths(common_dir, 'error_private.c'),
> join_paths(common_dir, 'xxhash.c'),
> join_paths(compress_dir, 'fse_compress.c'),
> + join_paths(compress_dir, 'hist.c'),
> join_paths(compress_dir, 'huf_compress.c'),
> join_paths(compress_dir, 'zstd_compress.c'),
> join_paths(compress_dir, 'zstd_fast.c'),
> @@ -130,6 +131,7 @@ test('fuzzer', fuzzer)
> if target_machine.system() != 'windows'
> paramgrill = executable('paramgrill',
> datagen_c, join_paths(tests_dir, 'paramgrill.c'),
> + join_paths(programs_dir, 'bench.c'),
> include_directories: test_includes,
> link_with: libzstd,
> dependencies: libm)
>
> Modified: head/sys/contrib/zstd/contrib/pzstd/Makefile
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/pzstd/Makefile Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/pzstd/Makefile Mon Oct 22 18:29:12 2018 (r339606)
> @@ -42,7 +42,7 @@ PZSTD_LDFLAGS =
> EXTRA_FLAGS =
> ALL_CFLAGS = $(EXTRA_FLAGS) $(CPPFLAGS) $(PZSTD_CPPFLAGS) $(CFLAGS) $(PZSTD_CFLAGS)
> ALL_CXXFLAGS = $(EXTRA_FLAGS) $(CPPFLAGS) $(PZSTD_CPPFLAGS) $(CXXFLAGS) $(PZSTD_CXXFLAGS)
> -ALL_LDFLAGS = $(EXTRA_FLAGS) $(LDFLAGS) $(PZSTD_LDFLAGS)
> +ALL_LDFLAGS = $(EXTRA_FLAGS) $(CXXFLAGS) $(LDFLAGS) $(PZSTD_LDFLAGS)
>
>
> # gtest libraries need to go before "-lpthread" because they depend on it.
> @@ -50,7 +50,7 @@ GTEST_LIB = -L googletest/build/googlemock/gtest
> LIBS =
>
> # Compilation commands
> -LD_COMMAND = $(CXX) $^ $(ALL_LDFLAGS) $(LIBS) -lpthread -o $@
> +LD_COMMAND = $(CXX) $^ $(ALL_LDFLAGS) $(LIBS) -pthread -o $@
> CC_COMMAND = $(CC) $(DEPFLAGS) $(ALL_CFLAGS) -c $< -o $@
> CXX_COMMAND = $(CXX) $(DEPFLAGS) $(ALL_CXXFLAGS) -c $< -o $@
>
>
> Modified: head/sys/contrib/zstd/contrib/pzstd/Options.cpp
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/pzstd/Options.cpp Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/pzstd/Options.cpp Mon Oct 22 18:29:12 2018 (r339606)
> @@ -18,17 +18,6 @@
> #include <thread>
> #include <vector>
>
> -#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(_WIN32) || \
> - defined(__CYGWIN__)
> -#include <io.h> /* _isatty */
> -#define IS_CONSOLE(stdStream) _isatty(_fileno(stdStream))
> -#elif defined(_POSIX_C_SOURCE) || defined(_XOPEN_SOURCE) || defined(_POSIX_SOURCE) || (defined(__APPLE__) && defined(__MACH__)) || \
> - defined(__DragonFly__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) /* https://sourceforge.net/p/predef/wiki/OperatingSystems/ */
> -#include <unistd.h> /* isatty */
> -#define IS_CONSOLE(stdStream) isatty(fileno(stdStream))
> -#else
> -#define IS_CONSOLE(stdStream) 0
> -#endif
>
> namespace pzstd {
>
> @@ -85,7 +74,7 @@ void usage() {
> std::fprintf(stderr, "Usage:\n");
> std::fprintf(stderr, " pzstd [args] [FILE(s)]\n");
> std::fprintf(stderr, "Parallel ZSTD options:\n");
> - std::fprintf(stderr, " -p, --processes # : number of threads to use for (de)compression (default:%d)\n", defaultNumThreads());
> + std::fprintf(stderr, " -p, --processes # : number of threads to use for (de)compression (default:<numcpus>)\n");
>
> std::fprintf(stderr, "ZSTD options:\n");
> std::fprintf(stderr, " -# : # compression level (1-%d, default:%d)\n", kMaxNonUltraCompressionLevel, kDefaultCompressionLevel);
>
> Modified: head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp Mon Oct 22 18:29:12 2018 (r339606)
> @@ -6,6 +6,7 @@
> * LICENSE file in the root directory of this source tree) and the GPLv2 (found
> * in the COPYING file in the root directory of this source tree).
> */
> +#include "platform.h" /* Large Files support, SET_BINARY_MODE */
> #include "Pzstd.h"
> #include "SkippableFrame.h"
> #include "utils/FileSystem.h"
> @@ -21,14 +22,6 @@
> #include <memory>
> #include <string>
>
> -#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(_WIN32) || defined(__CYGWIN__)
> -# include <fcntl.h> /* _O_BINARY */
> -# include <io.h> /* _setmode, _isatty */
> -# define SET_BINARY_MODE(file) { if (_setmode(_fileno(file), _O_BINARY) == -1) perror("Cannot set _O_BINARY"); }
> -#else
> -# include <unistd.h> /* isatty */
> -# define SET_BINARY_MODE(file)
> -#endif
>
> namespace pzstd {
>
>
> Modified: head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile Mon Oct 22 18:29:12 2018 (r339606)
> @@ -9,19 +9,25 @@
>
> # This Makefile presumes libzstd is built, using `make` in / or /lib/
>
> -LDFLAGS += ../../../lib/libzstd.a
> +ZSTDLIB_PATH = ../../../lib
> +ZSTDLIB_NAME = libzstd.a
> +ZSTDLIB = $(ZSTDLIB_PATH)/$(ZSTDLIB_NAME)
> +
> CPPFLAGS += -I../ -I../../../lib -I../../../lib/common
>
> CFLAGS ?= -O3
> CFLAGS += -g
>
> -SEEKABLE_OBJS = ../zstdseek_compress.c ../zstdseek_decompress.c
> +SEEKABLE_OBJS = ../zstdseek_compress.c ../zstdseek_decompress.c $(ZSTDLIB)
>
> .PHONY: default all clean test
>
> default: all
>
> all: seekable_compression seekable_decompression parallel_processing
> +
> +$(ZSTDLIB):
> + make -C $(ZSTDLIB_PATH) $(ZSTDLIB_NAME)
>
> seekable_compression : seekable_compression.c $(SEEKABLE_OBJS)
> $(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
>
> Modified: head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c Mon Oct 22 18:29:12 2018 (r339606)
> @@ -101,7 +101,7 @@ static void compressFile_orDie(const char* fname, cons
> free(buffOut);
> }
>
> -static const char* createOutFilename_orDie(const char* filename)
> +static char* createOutFilename_orDie(const char* filename)
> {
> size_t const inL = strlen(filename);
> size_t const outL = inL + 5;
> @@ -109,7 +109,7 @@ static const char* createOutFilename_orDie(const char*
> memset(outSpace, 0, outL);
> strcat(outSpace, filename);
> strcat(outSpace, ".zst");
> - return (const char*)outSpace;
> + return (char*)outSpace;
> }
>
> int main(int argc, const char** argv) {
> @@ -124,8 +124,9 @@ int main(int argc, const char** argv) {
> { const char* const inFileName = argv[1];
> unsigned const frameSize = (unsigned)atoi(argv[2]);
>
> - const char* const outFileName = createOutFilename_orDie(inFileName);
> + char* const outFileName = createOutFilename_orDie(inFileName);
> compressFile_orDie(inFileName, outFileName, 5, frameSize);
> + free(outFileName);
> }
>
> return 0;
>
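The change above makes ownership of the output filename explicit: the helper now returns a plain `char*`, so the caller can pass it to free() without casting away const. A minimal sketch of that pattern follows. The name `make_out_filename` is hypothetical, and this is an illustrative variant rather than the code from the example program.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative variant of the example's helper (name is hypothetical):
 * returns a heap-allocated "<filename>.zst" that the caller must free(). */
static char* make_out_filename(const char* filename)
{
    size_t const inL = strlen(filename);
    size_t const outL = inL + 5;            /* ".zst" plus terminating NUL */
    char* const out = (char*)malloc(outL);
    if (out == NULL) { perror("malloc"); exit(1); }
    memcpy(out, filename, inL);
    memcpy(out + inL, ".zst", 5);           /* copies the NUL as well */
    return out;
}
```

The caller keeps the result in a `char*`, uses it, and then calls free() on it, as main() now does in the diff.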
> Modified: head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c Mon Oct 22 18:29:12 2018 (r339606)
> @@ -84,7 +84,7 @@ static void fseek_orDie(FILE* file, long int offset, i
> }
>
>
> -static void decompressFile_orDie(const char* fname, unsigned startOffset, unsigned endOffset)
> +static void decompressFile_orDie(const char* fname, off_t startOffset, off_t endOffset)
> {
> FILE* const fin = fopen_orDie(fname, "rb");
> FILE* const fout = stdout;
> @@ -129,8 +129,8 @@ int main(int argc, const char** argv)
>
> {
> const char* const inFilename = argv[1];
> - unsigned const startOffset = (unsigned) atoi(argv[2]);
> - unsigned const endOffset = (unsigned) atoi(argv[3]);
> + off_t const startOffset = atoll(argv[2]);
> + off_t const endOffset = atoll(argv[3]);
> decompressFile_orDie(inFilename, startOffset, endOffset);
> }
>
>
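Switching from `unsigned` and atoi() to `off_t` and atoll() lets the example accept offsets past 4 GiB, but atoll() still has no way to report malformed input. As a possible follow-up (not something this commit does), the offsets could be parsed with strtoll(). The sketch below uses a hypothetical helper, `parse_offset`, and `long long` rather than `off_t` so that it makes no assumption about the width of `off_t`.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a non-negative decimal byte offset.
 * Returns 0 on success and stores the value in *out; returns -1 on
 * malformed input, a negative value, or overflow. */
static int parse_offset(const char* s, long long* out)
{
    char* end = NULL;
    long long v;

    errno = 0;
    v = strtoll(s, &end, 10);
    if (end == s || *end != '\0') return -1;   /* no digits, or trailing junk */
    if (errno == ERANGE || v < 0) return -1;   /* out of range, or negative */
    *out = v;
    return 0;
}
```

With this, an argument such as "5000000000" is accepted, while "12abc" or an empty string is rejected instead of being silently read as 12 or 0.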
> Modified: head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h Mon Oct 22 18:29:12 2018 (r339606)
> @@ -6,8 +6,10 @@ extern "C" {
> #endif
>
> #include <stdio.h>
> +#include "zstd.h" /* ZSTDLIB_API */
>
> -static const unsigned ZSTD_seekTableFooterSize = 9;
> +
> +#define ZSTD_seekTableFooterSize 9
>
> #define ZSTD_SEEKABLE_MAGICNUMBER 0x8F92EAB1
>
>
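The log does not say why ZSTD_seekTableFooterSize became a macro. One plausible reason, offered here as a guess, is that a `static const unsigned` object in a C header is not an integer constant expression, and every file that includes the header gets its own copy of it. The sketch below uses hypothetical names (`FOOTER_SIZE`, `footer_t`, `footer_bytes`) to show what the macro form allows.

```c
#include <stddef.h>

#define FOOTER_SIZE 9   /* hypothetical stand-in for ZSTD_seekTableFooterSize */

/* A macro expands to an integer constant expression, so it can size an
 * array member. With `static const unsigned footerSize = 9;` the same
 * declaration is not valid standard C, because a const-qualified object
 * is not a constant expression in C. */
typedef struct {
    unsigned char bytes[FOOTER_SIZE];
} footer_t;

/* Size in bytes of the footer buffer; sizeof does not evaluate its operand. */
static size_t footer_bytes(void)
{
    return sizeof(((footer_t*)0)->bytes);
}
```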
> Modified: head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c
> ==============================================================================
> --- head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c Mon Oct 22 18:29:12 2018 (r339606)
> @@ -24,7 +24,7 @@
> #endif
>
> /* ************************************************************
> -* Avoid fseek()'s 2GiB barrier with MSVC, MacOS, *BSD, MinGW
> +* Avoid fseek()'s 2GiB barrier with MSVC, macOS, *BSD, MinGW
> ***************************************************************/
> #if defined(_MSC_VER) && _MSC_VER >= 1400
> # define LONG_SEEK _fseeki64
> @@ -56,6 +56,7 @@
>
> #include <stdlib.h> /* malloc, free */
> #include <stdio.h> /* FILE* */
> +#include <assert.h>
>
> #define XXH_STATIC_LINKING_ONLY
> #define XXH_NAMESPACE ZSTD_
> @@ -88,7 +89,7 @@ static int ZSTD_seekable_read_FILE(void* opaque, void*
> return 0;
> }
>
> -static int ZSTD_seekable_seek_FILE(void* opaque, S64 offset, int origin)
> +static int ZSTD_seekable_seek_FILE(void* opaque, long long offset, int origin)
> {
> int const ret = LONG_SEEK((FILE*)opaque, offset, origin);
> if (ret) return ret;
> @@ -110,9 +111,9 @@ static int ZSTD_seekable_read_buff(void* opaque, void*
> return 0;
> }
>
> -static int ZSTD_seekable_seek_buff(void* opaque, S64 offset, int origin)
> +static int ZSTD_seekable_seek_buff(void* opaque, long long offset, int origin)
> {
> - buffWrapper_t* buff = (buffWrapper_t*) opaque;
> + buffWrapper_t* const buff = (buffWrapper_t*) opaque;
> unsigned long long newOffset;
> switch (origin) {
> case SEEK_SET:
> @@ -124,6 +125,8 @@ static int ZSTD_seekable_seek_buff(void* opaque, S64 o
> case SEEK_END:
> newOffset = (unsigned long long)buff->size - offset;
> break;
> + default:
> + assert(0); /* not possible */
> }
> if (newOffset > buff->size) {
> return -1;
> @@ -197,7 +200,7 @@ size_t ZSTD_seekable_free(ZSTD_seekable* zs)
> * Performs a binary search to find the last frame with a decompressed offset
> * <= pos
> * @return : the frame's index */
> -U32 ZSTD_seekable_offsetToFrameIndex(ZSTD_seekable* const zs, U64 pos)
> +U32 ZSTD_seekable_offsetToFrameIndex(ZSTD_seekable* const zs, unsigned long long pos)
> {
> U32 lo = 0;
> U32 hi = zs->seekTable.tableLen;
> @@ -222,13 +225,13 @@ U32 ZSTD_seekable_getNumFrames(ZSTD_seekable* const zs
> return zs->seekTable.tableLen;
> }
>
> -U64 ZSTD_seekable_getFrameCompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
> +unsigned long long ZSTD_seekable_getFrameCompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
> {
> if (frameIndex >= zs->seekTable.tableLen) return ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE;
> return zs->seekTable.entries[frameIndex].cOffset;
> }
>
> -U64 ZSTD_seekable_getFrameDecompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
> +unsigned long long ZSTD_seekable_getFrameDecompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
> {
> if (frameIndex >= zs->seekTable.tableLen) return ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE;
> return zs->seekTable.entries[frameIndex].dOffset;
> @@ -294,7 +297,6 @@ static size_t ZSTD_seekable_loadSeekTable(ZSTD_seekabl
> { /* Allocate an extra entry at the end so that we can do size
> * computations on the last element without special case */
> seekEntry_t* entries = (seekEntry_t*)malloc(sizeof(seekEntry_t) * (numFrames + 1));
> - const BYTE* tableBase = zs->inBuff + ZSTD_skippableHeaderSize;
>
> U32 idx = 0;
> U32 pos = 8;
> @@ -311,8 +313,8 @@ static size_t ZSTD_seekable_loadSeekTable(ZSTD_seekabl
> /* compute cumulative positions */
> for (; idx < numFrames; idx++) {
> if (pos + sizePerEntry > SEEKABLE_BUFF_SIZE) {
> - U32 const toRead = MIN(remaining, SEEKABLE_BUFF_SIZE);
> U32 const offset = SEEKABLE_BUFF_SIZE - pos;
> + U32 const toRead = MIN(remaining, SEEKABLE_BUFF_SIZE - offset);
> memmove(zs->inBuff, zs->inBuff + pos, offset); /* move any data we haven't read yet */
> CHECK_IO(src.read(src.opaque, zs->inBuff+offset, toRead));
> remaining -= toRead;
> @@ -372,7 +374,7 @@ size_t ZSTD_seekable_initAdvanced(ZSTD_seekable* zs, Z
> return 0;
> }
>
> -size_t ZSTD_seekable_decompress(ZSTD_seekable* zs, void* dst, size_t len, U64 offset)
> +size_t ZSTD_seekable_decompress(ZSTD_seekable* zs, void* dst, size_t len, unsigned long long offset)
> {
> U32 targetFrame = ZSTD_seekable_offsetToFrameIndex(zs, offset);
> do {
>
> Added: head/sys/contrib/zstd/doc/images/cdict_v136.png
> ==============================================================================
> Binary file. No diff available.
>
> Added: head/sys/contrib/zstd/doc/images/zstd_cdict_v1_3_5.png
> ==============================================================================
> Binary file. No diff available.
>
> Modified: head/sys/contrib/zstd/doc/zstd_compression_format.md
> ==============================================================================
> --- head/sys/contrib/zstd/doc/zstd_compression_format.md Mon Oct 22 17:42:57 2018 (r339605)
> +++ head/sys/contrib/zstd/doc/zstd_compression_format.md Mon Oct 22 18:29:12 2018 (r339606)
> @@ -16,7 +16,7 @@ Distribution of this document is unlimited.
>
> ### Version
>
> -0.2.6 (19/08/17)
> +0.3.0 (25/09/18)
>
>
> Introduction
> @@ -27,6 +27,8 @@ that is independent of CPU type, operating system,
> file system and character set, suitable for
> file compression, pipe and streaming compression,
> using the [Zstandard algorithm](http://www.zstandard.org).
> +The text of the specification assumes a basic background in programming
> +at the level of bits and other primitive data representations.
>
> The data can be produced or consumed,
> even for an arbitrarily long sequentially presented input data stream,
> @@ -39,11 +41,6 @@ for detection of data corruption.
> The data format defined by this specification
> does not attempt to allow random access to compressed data.
>
> -This specification is intended for use by implementers of software
> -to compress data into Zstandard format and/or decompress data from Zstandard format.
> -The text of the specification assumes a basic background in programming
> -at the level of bits and other primitive data representations.
> -
> Unless otherwise indicated below,
> a compliant compressor must produce data sets
> that conform to the specifications presented here.
> @@ -57,6 +54,12 @@ Whenever it does not support a parameter defined in th
> it must produce a non-ambiguous error code and associated error message
> explaining which parameter is unsupported.
>
> +This specification is intended for use by implementers of software
> +to compress data into Zstandard format and/or decompress data from Zstandard format.
> +The Zstandard format is supported by an open source reference implementation,
> +written in portable C, and available at : https://github.com/facebook/zstd .
> +
> +
> ### Overall conventions
> In this document:
> - square brackets i.e. `[` and `]` are used to indicate optional fields or parameters.
> @@ -69,7 +72,7 @@ A frame is completely independent, has a defined begin
> and a set of parameters which tells the decoder how to decompress it.
>
> A frame encapsulates one or multiple __blocks__.
> -Each block can be compressed or not,
> +Each block contains arbitrary content, which is described by its header,
> and has a guaranteed maximum content size, which depends on frame parameters.
> Unlike frames, each block depends on previous blocks for proper decoding.
> However, each block can be decompressed without waiting for its successor,
> @@ -92,14 +95,14 @@ Overview
> Frames
> ------
> Zstandard compressed data is made of one or more __frames__.
> -Each frame is independent and can be decompressed indepedently of other frames.
> +Each frame is independent and can be decompressed independently of other frames.
> The decompressed content of multiple concatenated frames is the concatenation of
> each frame decompressed content.
>
> There are two frame formats defined by Zstandard:
> Zstandard frames and Skippable frames.
> Zstandard frames contain compressed data, while
> -skippable frames contain no data and can be used for metadata.
> +skippable frames contain custom user metadata.
>
> ## Zstandard frames
> The structure of a single Zstandard frame is following:
> @@ -112,6 +115,11 @@ __`Magic_Number`__
>
> 4 Bytes, __little-endian__ format.
> Value : 0xFD2FB528
> +Note: This value was selected to be less probable to find at the beginning of some random file.
> +It avoids trivial patterns (0x00, 0xFF, repeated bytes, increasing bytes, etc.),
> +contains byte values outside of ASCII range,
> +and doesn't map into UTF8 space.
> +It reduces the chances that a text file represent this value by accident.
>
> __`Frame_Header`__
>
> @@ -171,8 +179,8 @@ according to the following table:
> |`FCS_Field_Size`| 0 or 1 | 2 | 4 | 8 |
>
> When `Flag_Value` is `0`, `FCS_Field_Size` depends on `Single_Segment_flag` :
> -if `Single_Segment_flag` is set, `Field_Size` is 1.
> -Otherwise, `Field_Size` is 0 : `Frame_Content_Size` is not provided.
> +if `Single_Segment_flag` is set, `FCS_Field_Size` is 1.
> +Otherwise, `FCS_Field_Size` is 0 : `Frame_Content_Size` is not provided.
>
> __`Single_Segment_flag`__
>
> @@ -196,10 +204,10 @@ depending on local limitations.
>
> __`Unused_bit`__
>
> -The value of this bit should be set to zero.
> -A decoder compliant with this specification version shall not interpret it.
> -It might be used in a future version,
> -to signal a property which is not mandatory to properly decode the frame.
> +A decoder compliant with this specification version shall not interpret this bit.
> +It might be used in any future version,
> +to signal a property which is transparent to properly decode the frame.
> +An encoder compliant with this specification version must set this bit to zero.
>
> __`Reserved_bit`__
>
> @@ -218,11 +226,11 @@ __`Dictionary_ID_flag`__
>
> This is a 2-bits flag (`= FHD & 3`),
> telling if a dictionary ID is provided within the header.
> -It also specifies the size of this field as `Field_Size`.
> +It also specifies the size of this field as `DID_Field_Size`.
>
> -|`Flag_Value`| 0 | 1 | 2 | 3 |
> -| ---------- | --- | --- | --- | --- |
> -|`Field_Size`| 0 | 1 | 2 | 4 |
> +|`Flag_Value` | 0 | 1 | 2 | 3 |
> +| -------------- | --- | --- | --- | --- |
> +|`DID_Field_Size`| 0 | 1 | 2 | 4 |
>
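The two renamed field sizes can be read straight off the Frame_Header_Descriptor byte. A minimal sketch, with hypothetical function names; `FHD & 3` is taken from the text above, while the positions of Frame_Content_Size_flag (bits 7-6) and Single_Segment_flag (bit 5) are assumed from parts of the spec not shown in this diff:

```c
#include <assert.h>   /* assert(), used in the usage lines below */

/* DID_Field_Size in bytes, from Dictionary_ID_flag = FHD & 3. */
static unsigned did_field_size(unsigned char fhd)
{
    static const unsigned sizes[4] = { 0, 1, 2, 4 };
    return sizes[fhd & 3];
}

/* FCS_Field_Size in bytes.
 * Assumption: Frame_Content_Size_flag is bits 7-6 of FHD and
 * Single_Segment_flag is bit 5. */
static unsigned fcs_field_size(unsigned char fhd)
{
    static const unsigned sizes[4] = { 0, 2, 4, 8 };
    unsigned const flag = fhd >> 6;
    unsigned const singleSegment = (fhd >> 5) & 1;

    if (flag == 0) return singleSegment ? 1 : 0;  /* the "0 or 1" column */
    return sizes[flag];
}
```

Usage: `assert(did_field_size(0x03) == 4);` and `assert(fcs_field_size(0x20) == 1);`.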
> #### `Window_Descriptor`
>
> @@ -249,6 +257,9 @@ Window_Size = windowBase + windowAdd;
> The minimum `Window_Size` is 1 KB.
> The maximum `Window_Size` is `(1<<41) + 7*(1<<38)` bytes, which is 3.75 TB.
>
> +In general, larger `Window_Size` tend to improve compression ratio,
> +but at the cost of memory usage.
> +
> To properly decode compressed data,
> a decoder will need to allocate a buffer of at least `Window_Size` bytes.
>
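The 1 KB minimum and the `(1<<41) + 7*(1<<38)` maximum both fall out of the Window_Descriptor arithmetic. A sketch, assuming the layout given elsewhere in the spec (Exponent in the upper 5 bits, Mantissa in the lower 3, windowLog = 10 + Exponent); window_size() is a hypothetical name:

```c
#include <assert.h>   /* assert(), used in the usage line below */

/* Window_Size in bytes from the one-byte Window_Descriptor.
 * Assumption: Exponent = descriptor >> 3, Mantissa = descriptor & 7. */
static unsigned long long window_size(unsigned char windowDescriptor)
{
    unsigned const exponent = windowDescriptor >> 3;   /* 0..31 */
    unsigned const mantissa = windowDescriptor & 7;    /* 0..7  */
    unsigned long long const windowBase = 1ULL << (10 + exponent);
    unsigned long long const windowAdd  = (windowBase / 8) * mantissa;

    return windowBase + windowAdd;
}
```

Usage: `assert(window_size(0x68) == 8u << 20);` gives the 8 MB recommended interoperability limit (Exponent 13, Mantissa 0).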
> @@ -257,8 +268,8 @@ a decoder is allowed to reject a compressed frame
> which requests a memory size beyond decoder's authorized range.
>
> For improved interoperability,
> -decoders are recommended to be compatible with `Window_Size <= 8 MB`,
> -and encoders are recommended to not request more than 8 MB.
> +it's recommended for decoders to support `Window_Size` of up to 8 MB,
> +and it's recommended for encoders to not generate frame requiring `Window_Size` larger than 8 MB.
> It's merely a recommendation though,
> decoders are free to support larger or lower limits,
> depending on local limitations.
> @@ -268,9 +279,10 @@ depending on local limitations.
> This is a variable size field, which contains
> the ID of the dictionary required to properly decode the frame.
> `Dictionary_ID` field is optional. When it's not present,
> -it's up to the decoder to make sure it uses the correct dictionary.
> +it's up to the decoder to know which dictionary to use.
>
> -Field size depends on `Dictionary_ID_flag`.
> +`Dictionary_ID` field size is provided by `DID_Field_Size`.
> +`DID_Field_Size` is directly derived from value of `Dictionary_ID_flag`.
> 1 byte can represent an ID 0-255.
> 2 bytes can represent an ID 0-65535.
> 4 bytes can represent an ID 0-4294967295.
> @@ -280,13 +292,21 @@ It's allowed to represent a small ID (for example `13`
> with a large 4-bytes dictionary ID, even if it is less efficient.
>
> _Reserved ranges :_
> -If the frame is going to be distributed in a private environment,
> -any dictionary ID can be used.
> -However, for public distribution of compressed frames using a dictionary,
> -the following ranges are reserved and shall not be used :
> +Within private environments, any `Dictionary_ID` can be used.
> +
> +However, for frames and dictionaries distributed in public space,
> +`Dictionary_ID` must be attributed carefully.
> +Rules for public environment are not yet decided,
> +but the following ranges are reserved for some future registrar :
> - low range : `<= 32767`
> - high range : `>= (1 << 31)`
>
> +Outside of these ranges, any value of `Dictionary_ID`
> +which is both `>= 32768` and `< (1<<31)` can be used freely,
> +even in public environment.
> +
> +
> +
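The reserved ranges reduce to a two-sided comparison. A sketch, with a hypothetical function name, for anyone wanting to check an ID before publishing a dictionary:

```c
#include <assert.h>   /* assert(), used in the usage line below */
#include <stdint.h>   /* uint32_t */

/* Returns 1 if the Dictionary_ID falls in a range reserved for a future
 * registrar (<= 32767 or >= 1<<31), 0 if it can be used freely. */
static int dictid_is_reserved(uint32_t id)
{
    return id <= 32767u || id >= (UINT32_C(1) << 31);
}
```

Usage: `assert(!dictid_is_reserved(32768));`.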
> #### `Frame_Content_Size`
>
> This is the original (uncompressed) size. This information is optional.
> @@ -359,22 +379,23 @@ There are 4 block types :
>
> - `Reserved` - this is not a block.
> This value cannot be used with current version of this specification.
> + If such a value is present, it is considered corrupted data.
>
> __`Block_Size`__
>
> The upper 21 bits of `Block_Header` represent the `Block_Size`.
> +`Block_Size` is the size of the block excluding the header.
> +A block can contain any number of bytes (even zero), up to
> +`Block_Maximum_Decompressed_Size`, which is the smallest of:
> +- Window_Size
> +- 128 KB
>
> -Block sizes must respect a few rules :
> -- For `Compressed_Block`, `Block_Size` is always strictly less than decompressed size.
> -- Block decompressed size is always <= `Window_Size`
> -- Block decompressed size is always <= 128 KB.
> +A `Compressed_Block` has the extra restriction that `Block_Size` is always
> +strictly less than the decompressed size.
> +If this condition cannot be respected,
> +the block must be sent uncompressed instead (`Raw_Block`).
>
> -A block can contain any number of bytes (even empty),
> -up to `Block_Maximum_Decompressed_Size`, which is the smallest of :
> -- `Window_Size`
> -- 128 KB
>
> -
> Compressed Blocks
> -----------------
> To decompress a compressed block, the compressed size must be provided
> @@ -390,11 +411,17 @@ data in [Sequence Execution](#sequence-execution)
> #### Prerequisites
> To decode a compressed block, the following elements are necessary :
> - Previous decoded data, up to a distance of `Window_Size`,
> - or all previously decoded data when `Single_Segment_flag` is set.
> + or beginning of the Frame, whichever is smaller.
> - List of "recent offsets" from previous `Compressed_Block`.
> -- Decoding tables of previous `Compressed_Block` for each symbol type
> - (literals, literals lengths, match lengths, offsets).
> +- The previous Huffman tree, required by `Treeless_Literals_Block` type
> +- Previous FSE decoding tables, required by `Repeat_Mode`
> + for each symbol type (literals lengths, match lengths, offsets)
>
> +Note that decoding tables aren't always from the previous `Compressed_Block`.
> +
> +- Every decoding table can come from a dictionary.
> +- The Huffman tree comes from the previous `Compressed_Literals_Block`.
> +
> Literals Section
> ----------------
> All literals are regrouped in the first part of the block.
> @@ -405,11 +432,11 @@ Literals can be stored uncompressed or compressed usin
> When compressed, an optional tree description can be present,
> followed by 1 or 4 streams.
>
> -| `Literals_Section_Header` | [`Huffman_Tree_Description`] | Stream1 | [Stream2] | [Stream3] | [Stream4] |
> -| ------------------------- | ---------------------------- | ------- | --------- | --------- | --------- |
> +| `Literals_Section_Header` | [`Huffman_Tree_Description`] | [jumpTable] | Stream1 | [Stream2] | [Stream3] | [Stream4] |
> +| ------------------------- | ---------------------------- | ----------- | ------- | --------- | --------- | --------- |
>
>
> -#### `Literals_Section_Header`
> +### `Literals_Section_Header`
>
> Header is in charge of describing how literals are packed.
> It's a byte-aligned variable-size bitfield, ranging from 1 to 5 bytes,
> @@ -460,18 +487,21 @@ For values spanning several bytes, convention is __lit
>
> __`Size_Format` for `Raw_Literals_Block` and `RLE_Literals_Block`__ :
>
> -- Value ?0 : `Size_Format` uses 1 bit.
> +`Size_Format` uses 1 _or_ 2 bits.
> +Its value is : `Size_Format = (Literals_Section_Header[0]>>2) & 3`
> +
> +- `Size_Format` == 00 or 10 : `Size_Format` uses 1 bit.
> `Regenerated_Size` uses 5 bits (0-31).
> - `Literals_Section_Header` has 1 byte.
> - `Regenerated_Size = Header[0]>>3`
> -- Value 01 : `Size_Format` uses 2 bits.
> + `Literals_Section_Header` uses 1 byte.
> + `Regenerated_Size = Literals_Section_Header[0]>>3`
> +- `Size_Format` == 01 : `Size_Format` uses 2 bits.
> `Regenerated_Size` uses 12 bits (0-4095).
> - `Literals_Section_Header` has 2 bytes.
> - `Regenerated_Size = (Header[0]>>4) + (Header[1]<<4)`
> -- Value 11 : `Size_Format` uses 2 bits.
> + `Literals_Section_Header` uses 2 bytes.
> + `Regenerated_Size = (Literals_Section_Header[0]>>4) + (Literals_Section_Header[1]<<4)`
> +- `Size_Format` == 11 : `Size_Format` uses 2 bits.
> `Regenerated_Size` uses 20 bits (0-1048575).
> - `Literals_Section_Header` has 3 bytes.
> - `Regenerated_Size = (Header[0]>>4) + (Header[1]<<4) + (Header[2]<<12)`
> + `Literals_Section_Header` uses 3 bytes.
> + `Regenerated_Size = (Literals_Section_Header[0]>>4) + (Literals_Section_Header[1]<<4) + (Literals_Section_Header[2]<<12)`
>
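The three Raw/RLE cases above translate almost line for line into C. A sketch with a hypothetical function name; the formulas are the ones in the list, plus a bounds check on the input:

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */

/* Decode Literals_Section_Header for Raw_Literals_Block / RLE_Literals_Block.
 * Returns 0 on success, -1 if fewer than the required bytes are available. */
static int raw_literals_header(const unsigned char* h, size_t avail,
                               unsigned* headerSize,
                               unsigned long* regeneratedSize)
{
    assert(h != NULL && headerSize != NULL && regeneratedSize != NULL);
    if (avail < 1) return -1;

    switch ((h[0] >> 2) & 3) {       /* Size_Format */
    case 0:
    case 2:                          /* 1-bit format, 5-bit size */
        *headerSize = 1;
        *regeneratedSize = (unsigned long)(h[0] >> 3);
        return 0;
    case 1:                          /* 2-bit format, 12-bit size */
        if (avail < 2) return -1;
        *headerSize = 2;
        *regeneratedSize = (unsigned long)(h[0] >> 4)
                         + ((unsigned long)h[1] << 4);
        return 0;
    default:                         /* 3: 2-bit format, 20-bit size */
        if (avail < 3) return -1;
        *headerSize = 3;
        *regeneratedSize = (unsigned long)(h[0] >> 4)
                         + ((unsigned long)h[1] << 4)
                         + ((unsigned long)h[2] << 12);
        return 0;
    }
}
```

For instance, the bytes 0x84 0x3E decode to a 2-byte header with Regenerated_Size 1000.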
> Only Stream1 is present for these cases.
> Note : it's allowed to represent a short value (for example `13`)
> @@ -479,66 +509,74 @@ using a long format, even if it's less efficient.
>
> __`Size_Format` for `Compressed_Literals_Block` and `Treeless_Literals_Block`__ :
>
> -- Value 00 : _A single stream_.
> +`Size_Format` always uses 2 bits.
> +
> +- `Size_Format` == 00 : _A single stream_.
> Both `Regenerated_Size` and `Compressed_Size` use 10 bits (0-1023).
> - `Literals_Section_Header` has 3 bytes.
> -- Value 01 : 4 streams.
> + `Literals_Section_Header` uses 3 bytes.
> +- `Size_Format` == 01 : 4 streams.
> Both `Regenerated_Size` and `Compressed_Size` use 10 bits (0-1023).
> - `Literals_Section_Header` has 3 bytes.
> -- Value 10 : 4 streams.
> + `Literals_Section_Header` uses 3 bytes.
> +- `Size_Format` == 10 : 4 streams.
> Both `Regenerated_Size` and `Compressed_Size` use 14 bits (0-16383).
> - `Literals_Section_Header` has 4 bytes.
> -- Value 11 : 4 streams.
> + `Literals_Section_Header` uses 4 bytes.
> +- `Size_Format` == 11 : 4 streams.
> Both `Regenerated_Size` and `Compressed_Size` use 18 bits (0-262143).
> - `Literals_Section_Header` has 5 bytes.
> + `Literals_Section_Header` uses 5 bytes.
>
> Both `Compressed_Size` and `Regenerated_Size` fields follow __little-endian__ convention.
> Note: `Compressed_Size` __includes__ the size of the Huffman Tree description
> _when_ it is present.
>
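The compressed case can be sketched the same way. The header sizes and field widths come from the list above; the bit order (2 bits of block type, 2 bits of Size_Format, then Regenerated_Size, then Compressed_Size, all little-endian) is my reading of the spec and should be treated as an assumption, as should all the names:

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */

typedef struct {
    unsigned headerSize;            /* 3, 4 or 5 bytes */
    unsigned streams;               /* 1 or 4 */
    unsigned long regeneratedSize;
    unsigned long compressedSize;   /* includes the Huffman tree, if any */
} lit_header_t;

/* Decode Literals_Section_Header for Compressed_Literals_Block and
 * Treeless_Literals_Block. Returns 0 on success, -1 on short input. */
static int compressed_literals_header(const unsigned char* h, size_t avail,
                                      lit_header_t* out)
{
    static const unsigned headerSizes[4] = { 3, 3, 4, 5 };
    static const unsigned fieldBits[4]   = { 10, 10, 14, 18 };
    unsigned long long bits = 0;
    unsigned long long mask;
    unsigned sizeFormat, i;

    assert(h != NULL && out != NULL);
    if (avail < 1) return -1;
    sizeFormat = (h[0] >> 2) & 3;
    if (avail < headerSizes[sizeFormat]) return -1;

    /* Gather the whole header as one little-endian integer. */
    for (i = 0; i < headerSizes[sizeFormat]; i++)
        bits |= (unsigned long long)h[i] << (8 * i);

    mask = (1ULL << fieldBits[sizeFormat]) - 1;
    out->headerSize = headerSizes[sizeFormat];
    out->streams = (sizeFormat == 0) ? 1 : 4;
    out->regeneratedSize = (unsigned long)((bits >> 4) & mask);
    out->compressedSize =
        (unsigned long)((bits >> (4 + fieldBits[sizeFormat])) & mask);
    return 0;
}
```

The field widths add up in each case: 4 + 2*10 = 24 bits, 4 + 2*14 = 32 bits, and 4 + 2*18 = 40 bits, matching the 3, 4 and 5 byte headers.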
> -### Raw Literals Block
> +#### Raw Literals Block
> The data in Stream1 is `Regenerated_Size` bytes long,
> it contains the raw literals data to be used during [Sequence Execution].
>
> -### RLE Literals Block
> +#### RLE Literals Block
> Stream1 consists of a single byte which should be repeated `Regenerated_Size` times
> to generate the decoded literals.
>
> -### Compressed Literals Block and Treeless Literals Block
> +#### Compressed Literals Block and Treeless Literals Block
> Both of these modes contain Huffman encoded data.
> -`Treeless_Literals_Block` does not have a `Huffman_Tree_Description`.
>
> -#### `Huffman_Tree_Description`
> +For `Treeless_Literals_Block`,
> +the Huffman table comes from previously compressed literals block,
> +or from a dictionary.
> +
> +
> +### `Huffman_Tree_Description`
> This section is only present when `Literals_Block_Type` type is `Compressed_Literals_Block` (`2`).
> The format of the Huffman tree description can be found at [Huffman Tree description](#huffman-tree-description).
> The size of `Huffman_Tree_Description` is determined during decoding process,
> it must be used to determine where streams begin.
> `Total_Streams_Size = Compressed_Size - Huffman_Tree_Description_Size`.
>
> -For `Treeless_Literals_Block`,
>
> *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
>
>
--
Rod Grimes rgrimes at freebsd.org