Zstandard - Fast real-time compression algorithm

push event binhdvo/zstd

commit sha 7f3005ee18ef1e2eb6223833d97f49b01b10565e

Reduce size of ZSTD_DCtx by tapping dst during decompression to reduce literal buffer

push time in 17 days

push event binhdvo/zstd

commit sha 781bda349b6d257541817eb9eeefd61e70b6b2b9

Reduce size of ZSTD_DCtx by tapping dst during decompression to reduce literal buffer

push time in 17 days

push event binhdvo/zstd

commit sha 5ee20d994ca250b40d1672e9c7c8e7910488200e

Reduce size of ZSTD_DCtx by tapping dst during decompression to reduce literal buffer

push time in 17 days

push event binhdvo/zstd

commit sha eee97d6a67c0833b61246fe124c7d5a8b6e3b760

Reduce size of ZSTD_DCtx by tapping dst during decompression to reduce literal buffer

push time in 17 days

push event binhdvo/zstd

commit sha 5c7bf29be523fc46f190bbf61e7091854aab8148

Reduce size of ZSTD_DCtx by tapping dst during decompression to reduce literal buffer

push time in 18 days

push event binhdvo/zstd

commit sha 682e281992bf7f23360fa89d4eab6bedee68a665

Reduce size of ZSTD_DCtx by tapping dst during decompression to reduce literal buffer

push time in 18 days

push event binhdvo/zstd

commit sha 5ba06f5e4b4b003d5d7f69edd32745a1f44ec25a

Reduce ZSTD_DCtx size by reusing dst buffer

push time in 18 days

push event binhdvo/zstd

commit sha 423cf98927fbcd0d67c3c64b26e95c1b5f65a0e4

Reduce ZSTD_DCtx size by reusing dst buffer

push time in 18 days

push event binhdvo/zstd

commit sha e962cbc1a8fd4604ec0a8edbf324a81826e28177

Reduce ZSTD_DCtx size by reusing dst buffer

push time in 18 days

push event binhdvo/zstd

commit sha b19ba4943f4537674026d927e0d6447ca5081c3d

Reduce size of ZSTD_DCtx by tapping dst during decompression to reduce literal buffer

push time in 19 days

PR closed binhdvo/zstd

…ressibility is suspected

pr closed time in 19 days

push event facebook/zstd

commit sha dc5b693f1ef6f2aa297c615fe3567ad39030bf46

Proactively skip huffman compression based on sampling where non-compressibility is suspected

commit sha b3e372c171cff55128352e539dd3a3657f202606

Merge pull request #2717 from binhdvo/bootcamp Proactively skip huffman compression based on sampling where non-comp…

push time in 24 days

PR merged facebook/zstd

When a large block is suspected to be incompressible (based on the ratio of literals to sequences), we decide whether Huffman compression should be applied by sampling 4 KB blocks at the start and end of the buffer. This lets us proactively skip constructing a histogram over the full data range. Benchmarks show roughly a 2x speedup on low-compressibility samples (-P0 and -P1), with the other settings within about 5% of the baseline:

Benchmarked on macOS without the PR:

binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P0 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 7 2021) ***
Sample 100000000 bytes :

1#compress : 1628.7 MB/s (100002299)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P1 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 7 2021) ***
Sample 100000000 bytes :

1#compress : 1596.6 MB/s (100002299)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P5 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 7 2021) ***
Sample 100000000 bytes :

1#compress : 850.4 MB/s (79994983)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P10 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 7 2021) ***
Sample 100000000 bytes :

1#compress : 663.1 MB/s (74066054)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P50 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 7 2021) ***
Sample 100000000 bytes :

1#compress : 564.4 MB/s (31792743)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P100 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 7 2021) ***
Sample 100000000 bytes :

1#compress : 7765.7 MB/s ( 3570)

With PR:

binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P0 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 25 2021) ***
Sample 100000000 bytes :

1#compress : 3322.8 MB/s (100002299)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P1 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 25 2021) ***
Sample 100000000 bytes :

1#compress : 3373.9 MB/s (100002299)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P1 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 25 2021) ***
Sample 100000000 bytes :

1#compress : 3323.8 MB/s (100002299)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P5 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 25 2021) ***
Sample 100000000 bytes :

1#compress : 853.9 MB/s (79994983)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P10 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 25 2021) ***
Sample 100000000 bytes :

1#compress : 676.9 MB/s (74066054)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P50 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 25 2021) ***
Sample 100000000 bytes :

1#compress : 540.4 MB/s (31792743)
binhvo@binhvo-mbp zstd % ./tests/fullbench -b1 -P100 -B100000000
*** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Jun 25 2021) ***
Sample 100000000 bytes :

1#compress : 7838.4 MB/s ( 3570)
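
The sampling idea in the PR description can be sketched in a few lines of C. This is a minimal illustration, not the code from the PR: the function names `largest_count` and `looks_incompressible`, the minimum-size ratio, and the decision threshold are all assumptions chosen for the example. Only the 4 KB sample size and the "start and end of the buffer" placement come from the description.

```c
#include <stddef.h>
#include <string.h>

#define SAMPLE_SIZE 4096u  /* the "4k" sample from the PR description */
#define MIN_RATIO   10u    /* hypothetical: only sample buffers >= 10 samples long */

/* Histogram n bytes and return the count of the most frequent byte value. */
static unsigned largest_count(const unsigned char* p, size_t n)
{
    unsigned count[256];
    unsigned largest = 0;
    size_t i;
    memset(count, 0, sizeof(count));
    for (i = 0; i < n; i++) count[p[i]]++;
    for (i = 0; i < 256; i++) {
        if (count[i] > largest) largest = count[i];
    }
    return largest;
}

/* Hypothetical helper: returns 1 when the two samples look close to uniform
 * (no byte value dominates), 0 otherwise or when the buffer is too small to
 * make sampling worthwhile. The threshold is an assumption for illustration. */
int looks_incompressible(const void* src, size_t srcSize)
{
    const unsigned char* p = (const unsigned char*)src;
    unsigned total;
    if (srcSize < SAMPLE_SIZE * MIN_RATIO) return 0;
    total  = largest_count(p, SAMPLE_SIZE);                          /* start */
    total += largest_count(p + srcSize - SAMPLE_SIZE, SAMPLE_SIZE);  /* end   */
    return total <= ((2u * SAMPLE_SIZE) >> 7) + 4u;
}
```

A caller would consult `looks_incompressible` before building the full histogram and emit the literals raw when it returns 1. The cost is two 4 KB histograms instead of one pass over the whole buffer, which is consistent with the speedup appearing only in the low-compressibility runs above.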

pr closed time in 24 days

push event binhdvo/zstd

commit sha dc5b693f1ef6f2aa297c615fe3567ad39030bf46

Proactively skip huffman compression based on sampling where non-compressibility is suspected

push time in 25 days

Pull request review comment facebook/zstd

Proactively skip huffman compression based on sampling where non-comp…

 size_t HUF_compress4X_repeat(void* dst, size_t dstSize,
                        void* workSpace, size_t wkspSize, /**< `workSpace` must be aligned on 4-bytes boundaries, `wkspSize` must be >= HUF_WORKSPACE_SIZE */
                        HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2);
+/* Incorporate fast check(s) on the suspicion that this data is not compressible and back off
+   quickly to raw format if so. */
+size_t HUF_compress4X_repeat_fastCheck(void* dst, size_t dstSize,
+                       const void* src, size_t srcSize,
+                       unsigned maxSymbolValue, unsigned tableLog,
+                       void* workSpace, size_t wkspSize, /**< `workSpace` must be aligned on 4-bytes boundaries, `wkspSize` must be >= HUF_WORKSPACE_SIZE */
+                       HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2, unsigned suspectUncompressible);

After some thought, I added comments rather than an enum. This is really just a stand-in for a boolean value, so I think the argument name is sufficient to convey the meaning as far as the function definitions are concerned.

comment created time in 25 days

PR opened facebook/zstd

pr created time in a month

push event binhdvo/zstd

commit sha bc1dab45963eceabcec2fe8083df446145b6ce08

Proactively skip huffman compression based on sampling where non-compressibility is suspected

push time in a month

push event binhdvo/zstd

commit sha 75105ec5a4b19749b3dfcfb569f3aea69881468e

Proactively skip huffman compression based on sampling where non-compressibility is suspected

push time in a month

push event binhdvo/zstd

commit sha 36615531428c183a66613ae727a1cb8d41113972

Proactively skip huffman compression based on sampling where non-compressibility is suspected

push time in a month

push event binhdvo/zstd

commit sha 9729e56b5704d835446c4057f31e3af66f563082

Proactively skip huffman compression based on sampling where non-compressibility is suspected

push time in a month

PR opened binhdvo/zstd

…ressibility is suspected

pr created time in a month

push event binhdvo/zstd

commit sha 0f1c4cda5d523e2c39e217f118bdf4733b69b44d

Proactively skip huffman compression based on sampling where non-compressibility is suspected

push time in a month

PR closed facebook/zstd

Limits the maximum --ultra compression level on 32-bit systems to 21 to avoid issues with contiguous memory allocation. Also updates the tests, documentation, and warning message accordingly.
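
The level cap described above amounts to a clamp that depends on pointer width. A minimal sketch, assuming a hypothetical helper `clamp_ultra_level` that is not part of zstd; the value 21 comes from the description, and 22 is zstd's usual maximum compression level:

```c
#include <stddef.h>

#define ULTRA_MAX_LEVEL        22  /* zstd's usual maximum level */
#define ULTRA_MAX_LEVEL_32BIT  21  /* cap from the PR description */

/* Hypothetical helper: clamp a requested level to the maximum allowed for
 * the given pointer size in bytes (4 on 32-bit targets, 8 on 64-bit). */
int clamp_ultra_level(int requested, size_t pointerSize)
{
    int const maxLevel = (pointerSize == 4) ? ULTRA_MAX_LEVEL_32BIT
                                            : ULTRA_MAX_LEVEL;
    return (requested > maxLevel) ? maxLevel : requested;
}
```

A caller would pass `sizeof(void*)` as `pointerSize`, so a request for level 22 on a 32-bit build comes back as 21, while lower levels pass through unchanged.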

pr closed time in a month

push event binhdvo/zstd

push time in a month