Results of a zstd compression benchmark and a step-by-step guide to running the benchmark on Linux
Zstandard (zstd), developed by Facebook, is a modern compression algorithm that balances high compression ratios with fast decompression speeds. A common file for assessing zstd's performance is enwik8, a 100 MB snapshot of Wikipedia content often used in compression research. Running zstd against enwik8 at each compression level lets you measure compression ratio, speed (in MB/s), and resource usage, and see how zstd behaves under different configurations.
The enwik8 test is a standard benchmark used in evaluating compression algorithms. It consists of the first 100 million bytes (100MB) of the English Wikipedia XML dump and is part of the Large Text Compression Benchmark (LTCB). This dataset is widely used because:
It is real-world, highly structured text, mostly English with some multilingual content.
It contains a mix of common words, markup, metadata, and varying sentence structures.
It provides a meaningful test for how well algorithms handle textual redundancy, dictionary effectiveness, and speed vs. compression trade-offs.
Because enwik8 is standardized and publicly available, it allows for fair comparisons between different compression algorithms (e.g., Zstandard, gzip, bzip2, LZMA). In benchmarking, it's typically compressed at various algorithm levels (e.g., zstd -b1 -e22 enwik8) to observe:
Compression ratio (original size / compressed size; higher is better)
Compression and decompression speed
CPU usage and memory requirements
This makes it a practical, repeatable benchmark for assessing how well an algorithm like zstd performs on structured, large-scale text data.
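As a quick worked example of the ratio, zstd's benchmark mode reports original size divided by compressed size, so 100,000,000 bytes compressed to 40,738,526 bytes gives roughly 2.455. The sketch below reproduces that arithmetic with awk; the helper name `ratio` is made up for illustration and is not part of zstd.

```shell
# Hypothetical helper: print original/compressed to three decimals.
# awk is used because POSIX shell arithmetic is integer-only.
ratio() {
    awk -v orig="$1" -v comp="$2" 'BEGIN { printf "%.3f\n", orig / comp }'
}

ratio 100000000 40738526   # prints 2.455
```

A ratio of 2.455 means the compressed file is about 41% of the original size.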
Results of the zstd test. Note that results will vary with system hardware:
Columns: compression level#file name : original size in bytes -> compressed size in bytes (compression ratio), compression speed in MB/s, decompression speed in MB/s
zstd -b1 -e22 enwik8
1#enwik8 : 100000000 -> 40738526 (2.455), 155.2 MB/s , 523.5 MB/s
2#enwik8 : 100000000 -> 37434671 (2.671), 122.4 MB/s , 362.2 MB/s
3#enwik8 : 100000000 -> 35602698 (2.809), 94.8 MB/s , 394.5 MB/s
4#enwik8 : 100000000 -> 34920953 (2.864), 77.5 MB/s , 344.1 MB/s
5#enwik8 : 100000000 -> 34315441 (2.914), 23.9 MB/s , 273.9 MB/s
6#enwik8 : 100000000 -> 33591155 (2.977), 22.9 MB/s , 348.5 MB/s
7#enwik8 : 100000000 -> 32443027 (3.082), 16.0 MB/s , 384.5 MB/s
8#enwik8 : 100000000 -> 32029317 (3.122), 8.20 MB/s , 398.8 MB/s
9#enwik8 : 100000000 -> 31751130 (3.149), 10.22 MB/s , 404.2 MB/s
10#enwik8 : 100000000 -> 31238385 (3.201), 4.91 MB/s , 378.2 MB/s
11#enwik8 : 100000000 -> 30968326 (3.229), 5.77 MB/s , 354.5 MB/s
12#enwik8 : 100000000 -> 30728213 (3.254), 2.20 MB/s , 198.6 MB/s
13#enwik8 : 100000000 -> 30331609 (3.297), 2.66 MB/s , 369.9 MB/s
14#enwik8 : 100000000 -> 29794982 (3.356), 1.78 MB/s , 336.8 MB/s
15#enwik8 : 100000000 -> 29436073 (3.397), 1.57 MB/s , 314.4 MB/s
16#enwik8 : 100000000 -> 28446554 (3.515), 1.29 MB/s , 250.8 MB/s
17#enwik8 : 100000000 -> 27702787 (3.610), 0.99 MB/s , 205.9 MB/s
18#enwik8 : 100000000 -> 27326331 (3.659), 0.85 MB/s , 280.4 MB/s
19#enwik8 : 100000000 -> 26957405 (3.710), 0.78 MB/s , 162.7 MB/s
20#enwik8 : 100000000 -> 25989368 (3.848), 0.62 MB/s , 297.9 MB/s
21#enwik8 : 100000000 -> 25541097 (3.915), 0.50 MB/s , 309.7 MB/s
22#enwik8 : 100000000 -> 25340452 (3.946), 0.46 MB/s , 221.3 MB/s
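One way to use a table like this is to pick the level with the best ratio that still compresses fast enough for your workload. The following is a minimal sketch, assuming the output lines look like the ones above (level and file name in the first field, ratio in the sixth, compression speed in the seventh); other zstd versions may format lines slightly differently. The function name `best_level` is hypothetical.

```shell
# Hypothetical helper: read benchmark lines on stdin and print the level with
# the best ratio among those compressing at MIN_SPEED MB/s or faster.
best_level() {
    awk -v min="$1" '
        /#/ && $4 == "->" {
            split($1, id, "#")             # "3#enwik8" -> level 3
            ratio = $6
            gsub(/[(),x]/, "", ratio)      # "(2.809)," -> 2.809
            if ($7 + 0 >= min + 0 && ratio + 0 > best + 0) {
                best = ratio
                level = id[1]
            }
        }
        END { if (level != "") print level, best }
    '
}

# Example with three rows from the table and a 50 MB/s floor.
printf '%s\n' \
    '3#enwik8 : 100000000 -> 35602698 (2.809), 94.8 MB/s , 394.5 MB/s' \
    '4#enwik8 : 100000000 -> 34920953 (2.864), 77.5 MB/s , 344.1 MB/s' \
    '5#enwik8 : 100000000 -> 34315441 (2.914), 23.9 MB/s , 273.9 MB/s' |
    best_level 50   # prints: 4 2.864
```

With a saved copy of the full output (for example in a file named results.txt, an assumed name), the same function can be run as `best_level 50 < results.txt`.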
# Debian/Ubuntu:
sudo apt update
sudo apt install zstd
# Fedora/RHEL:
sudo dnf install zstd
# or for older systems:
sudo yum install zstd
wget http://mattmahoney.net/dc/enwik8.zip
unzip enwik8.zip
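Before benchmarking, it can be worth confirming that the unzipped file is the expected 100,000,000 bytes, since a truncated download would skew every number. A minimal sketch follows; the helper name `check_size` is made up for illustration.

```shell
# Hypothetical helper: succeed only if FILE is exactly EXPECTED bytes long.
check_size() {
    file=$1
    expected=$2
    # tr strips the padding some wc implementations add around the count.
    actual=$(wc -c < "$file" | tr -d ' ')
    [ "$actual" -eq "$expected" ]
}

# Only runs the check if enwik8 is present in the current directory.
if [ -f enwik8 ]; then
    if check_size enwik8 100000000; then
        echo "enwik8 size OK"
    else
        echo "unexpected enwik8 size" >&2
    fi
fi
```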
zstd -b1 -e22 enwik8
-b1 -e22 tests compression levels 1 to 22.
https://facebook.github.io/zstd/
https://github.com/facebook/zstd