Lean 4 bindings for zlib and Zstandard compression, plus tar and ZIP archive support.
Provides whole-buffer and streaming APIs for the zlib, gzip, raw deflate, and Zstandard formats; CRC32/Adler32 checksums; tar archives (.tar and .tar.gz with PAX/GNU extension support); and ZIP archives (with ZIP64 support).
- Lean 4 (tested with v4.20.0 through v4.29.0-rc1)
- zlib development headers (`zlib-dev`, `zlib1g-dev`, or equivalent)
- libzstd development headers (`libzstd-dev` or equivalent)
- `pkg-config` (for header discovery on NixOS and similar)
On NixOS (or any system where zlib/zstd aren't in the default library path), the project includes a `shell.nix` that provides all C dependencies:

```sh
nix-shell   # then run lake build, lake exe test, etc. inside the shell
```

Or use direnv for automatic environment activation:

```sh
direnv allow   # one-time setup; the environment then activates on cd
```

You can also set `ZLIB_CFLAGS` and `ZSTD_CFLAGS` manually to point at the headers if you prefer.
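If you go the manual route, something like the following works; the include paths below are purely illustrative and depend on where your package manager installs the headers:

```shell
# Illustrative paths -- point these at wherever the zlib/zstd headers
# actually live on your system before building.
export ZLIB_CFLAGS="-I/usr/local/include"
export ZSTD_CFLAGS="-I/usr/local/include"
lake build
```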
Add to your `lakefile.lean`:

```lean
require "kim-em" / "lean-zip"
```

Whole-buffer compression and decompression:

```lean
import Zip

-- Zlib format
let compressed ← Zlib.compress data
let original ← Zlib.decompress compressed

-- Gzip format (compatible with gzip/gunzip)
let gzipped ← Gzip.compress data (level := 6)
let original ← Gzip.decompress gzipped

-- Raw deflate (no header/trailer, used internally by ZIP)
let deflated ← RawDeflate.compress data
let original ← RawDeflate.decompress deflated

-- Zstandard (modern, fast, excellent compression ratio)
let compressed ← Zstd.compress data (level := 3)
let original ← Zstd.decompress compressed
```

For data too large to fit in memory:
```lean
-- Stream between IO.FS.Streams (64KB chunks, bounded memory)
Gzip.compressStream inputStream outputStream (level := 6)
Gzip.decompressStream inputStream outputStream

-- Zstd streaming
Zstd.compressStream inputStream outputStream (level := 3)
Zstd.decompressStream inputStream outputStream

-- File helpers
let gzPath ← Gzip.compressFile "/path/to/file"       -- writes /path/to/file.gz
let outPath ← Gzip.decompressFile "/path/to/file.gz" -- writes /path/to/file
Zstd.compressFile "/path/to/file"                    -- writes /path/to/file.zst
Zstd.decompressFile "/path/to/file.zst"              -- writes /path/to/file
```

For incremental compression, create a state and push chunks as they arrive:

```lean
let state ← Gzip.DeflateState.new (level := 6)
let compressed ← state.push chunk1
let compressed2 ← state.push chunk2
let final ← state.finish -- must call exactly once

let state ← Zstd.CompressState.new (level := 3)
let compressed ← state.push chunk
let final ← state.finish
```

Checksums:

```lean
let crc ← Checksum.crc32 0 data     -- CRC-32
let adler ← Checksum.adler32 1 data -- Adler-32

-- Incremental: pass previous result as init
let crc2 ← Checksum.crc32 crc moreData
```

Tar archives:

```lean
-- Create .tar.gz from a directory (streaming, bounded memory)
Tar.createTarGz "/tmp/archive.tar.gz" "/path/to/dir"

-- Extract .tar.gz
Tar.extractTarGz "/tmp/archive.tar.gz" "/tmp/output"

-- Create/extract raw .tar via IO.FS.Stream
Tar.createFromDir stream dir
Tar.extract stream outDir

-- List entries without extracting
let entries ← Tar.list stream
```

Tar supports UStar, PAX extended headers (for long paths, large files, and UTF-8 names), and GNU long name/link extensions. Paths exceeding UStar limits are automatically encoded with PAX headers on creation.
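As a sketch of the PAX behavior, the round trip below archives a directory tree whose nested path exceeds UStar's 100-byte name limit, using only `Tar.createTarGz` / `Tar.extractTarGz` from above; the helper name and paths are illustrative:

```lean
-- Sketch (illustrative paths): round-trip a path longer than UStar's
-- 100-byte name limit. PAX headers are emitted automatically on creation.
def paxRoundTrip : IO Unit := do
  let deep := System.FilePath.mk
    ("/tmp/pax-demo/" ++ String.join (List.replicate 10 "very-long-path-component/"))
  IO.FS.createDirAll deep
  IO.FS.writeFile (deep / "file.txt") "hello"
  Tar.createTarGz "/tmp/pax-demo.tar.gz" "/tmp/pax-demo"
  Tar.extractTarGz "/tmp/pax-demo.tar.gz" "/tmp/pax-demo-out"
```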
ZIP archives:

```lean
-- Create from explicit file list
Archive.create "/tmp/archive.zip" #[
  ("name-in-zip.txt", "/path/on/disk.txt"),
  ("subdir/file.bin", "/other/file.bin")
]

-- Create from directory
Archive.createFromDir "/tmp/archive.zip" "/path/to/dir"

-- Extract all files
Archive.extract "/tmp/archive.zip" "/tmp/output"

-- Extract a single file by name
let data ← Archive.extractFile "/tmp/archive.zip" "name-in-zip.txt"

-- List entries
let entries ← Archive.list "/tmp/archive.zip"
```

ZIP supports stored (method 0) and deflated (method 8) entries with automatic method selection, CRC32 verification, and ZIP64 extensions for archives exceeding 4 GB or 65535 entries.
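A minimal round-trip sketch using only the calls above (the helper name and paths are illustrative):

```lean
-- Sketch (illustrative paths): write a file, archive it, and read it back.
def zipRoundTrip : IO Unit := do
  IO.FS.writeFile "/tmp/hello.txt" "hello, zip"
  Archive.create "/tmp/demo.zip" #[("hello.txt", "/tmp/hello.txt")]
  let data ← Archive.extractFile "/tmp/demo.zip" "hello.txt"
  let original ← IO.FS.readBinFile "/tmp/hello.txt"
  -- Compare byte-by-byte via toList
  unless data.toList == original.toList do
    throw (IO.userError "round-trip mismatch")
```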
Build the library:

```sh
lake build
```

Run the test suite:

```sh
lake build test && .lake/build/bin/test
```

A benchmark driver is included for use with hyperfine:

```sh
lake build bench

# Single benchmark
hyperfine 'lake exe bench inflate 1048576 prng 6'

# Compare patterns
hyperfine 'lake exe bench inflate 1048576 constant 6' \
          'lake exe bench inflate 1048576 prng 6'

# Sweep sizes
hyperfine --parameter-list size 1024,65536,1048576 \
          'lake exe bench inflate {size} prng 6'
```

Operations: `inflate`, `gzip`, `zlib`, `crc32`, `adler32`, `crc32-ffi`, `adler32-ffi`. Patterns: `constant`, `cyclic`, `prng`. Level: 0-9 (default 6).
- **TOCTOU in extraction:** archive extraction creates parent directories and then writes files; a local attacker could replace a directory with a symlink between these steps. Fixing this requires `openat()`/`O_NOFOLLOW`, which Lean's stdlib doesn't expose.
- **No streaming output limit:** whole-buffer decompression accepts a `max_output` cap, but the streaming API (push/finish) has no built-in limit; callers processing untrusted data should track total output themselves.
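One way to track output yourself, sketched with a hypothetical `Zstd.DecompressState` that mirrors the push/finish interface of `Zstd.CompressState` shown above (the actual decompression-state name may differ):

```lean
-- Sketch: cap total decompressed output when streaming untrusted input.
-- `Zstd.DecompressState` is hypothetical here; adapt to the real state type.
def decompressCapped (chunks : Array ByteArray) (maxOut : Nat) : IO ByteArray := do
  let state ← Zstd.DecompressState.new
  let mut out := ByteArray.empty
  for chunk in chunks do
    out := out ++ (← state.push chunk)
    if out.size > maxOut then
      throw (IO.userError s!"decompressed output exceeds {maxOut} bytes")
  return out ++ (← state.finish)
```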
Apache-2.0. See LICENSE.