zrckr/xcompress

GitHub: zrckr/xcompress

# `xcompress`

A reverse-engineering project for the `xcompress` dynamic library, a Microsoft Xbox LZX compression codec.

## Status

Both decompression and compression have been decompiled and implemented.

## Byte-matching behavior

XNB assets from FEZ 1.12 (`Other.pak`) were used as test samples, extracted with [FEZRepacker](https://github.com/FEZModding/FEZRepacker) 1.3.0. Files were tested in pairs: a compressed and an uncompressed version of each asset.

| Test                   | Result    | Diff |
| ---------------------- | --------- | ---- |
| Decompression          | 2028/2028 | 0    |
| Compression            | 1816/2028 | 212  |
| Compress -> Decompress | 2028/2028 | 0    |
| Decompress -> Compress | 1816/2028 | 212  |

## Compression issues

The code can decompress both the original data and data produced by the decompiled version of `xcompress`, but the compressor does not always write the same bytes as the original library. The encoder is functionally correct (round-trip decompression always succeeds), yet 212 of the 2028 test files are not byte-for-byte identical to the original library's output.

## Test suite

### Arguments

    --compressed                 Path to compressed XNB file(s)
    --decompressed               Path to uncompressed XNB file(s)
    --verbose, -v                Print detailed mismatch info (byte diffs, chunk analysis)
    --output-failed-compressed   Save failed compression attempts as .fail files

### Building

Build the `xcompress` DLL with Visual Studio (solution: `xcompress.sln`). The post-build step automatically copies `xcompress.dll` into `testsuite/bin//net10.0/`.

Then run the test suite:

    dotnet run --project testsuite -- --compressed --decompressed

## Implementation details

**Codec**: LZX (Lempel-Ziv-X)

**Parameters used by the test suite:**

| Parameter                  | Value  |
| -------------------------- | ------ |
| Window size                | 64 KB  |
| Compression partition size | 256 KB |
| Block size                 | 32 KB  |

**Huffman trees per block:**

- Main tree - 256 literals + position slots
- Length tree - 249 symbols
- Aligned offset tree - 8 symbols

**Block types:**

1. Verbatim - Huffman-coded literals and matches
2. Aligned - Huffman + aligned offset encoding
3. Uncompressed - stored verbatim for incompressible data

**Other features:** repeated match offset caching (3 slots), E8 translation (x86 preprocessing), sliding window pattern matching via binary search trees.

## License

See `LICENSE.txt`.
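
## Appendix: E8 translation sketch

The "E8 translation (x86 preprocessing)" feature listed under Implementation details can be illustrated with a minimal Python sketch. It follows the generic LZX scheme as seen in open-source LZX implementations: the 32-bit operand after each x86 `CALL` opcode (`0xE8`) is rewritten from a relative to an absolute target before compression and restored after decompression. This is not taken from the `xcompress` code. The function names, the `file_size` parameter, and the rule that the last 10 bytes are left untouched are assumptions, and the real library may use different bounds.

```python
import struct

E8_OPCODE = 0xE8   # x86 CALL rel32
TAIL_GUARD = 10    # assumption: the final 10 bytes are never translated


def _translate(data: bytes, file_size: int, encode: bool) -> bytes:
    """Shared scan loop for both directions (hypothetical helper)."""
    buf = bytearray(data)
    end = len(buf) - TAIL_GUARD
    i = 0
    while i < end:
        if buf[i] != E8_OPCODE:
            i += 1
            continue
        # Signed little-endian operand that follows the opcode.
        (value,) = struct.unpack_from("<i", buf, i + 1)
        if encode:
            # relative -> absolute, only for plausible call targets
            if -i <= value < file_size:
                if value < file_size - i:
                    new = value + i          # ordinary translation
                else:
                    new = value - file_size  # compensating translation
                struct.pack_into("<i", buf, i + 1, new)
        else:
            # absolute -> relative, exact inverse of the branch above
            if 0 <= value < file_size:
                struct.pack_into("<i", buf, i + 1, value - i)
            elif -i <= value < 0:
                struct.pack_into("<i", buf, i + 1, value + file_size)
        i += 5  # skip the opcode and its 4-byte operand in both directions
    return bytes(buf)


def e8_encode(data: bytes, file_size: int) -> bytes:
    """Apply the translation before compression."""
    return _translate(data, file_size, encode=True)


def e8_decode(data: bytes, file_size: int) -> bytes:
    """Undo the translation after decompression."""
    return _translate(data, file_size, encode=False)
```

Because both directions skip five bytes after every `0xE8` regardless of whether the operand was rewritten, the scans visit the same positions, so `e8_decode(e8_encode(data, n), n)` returns `data` for any input. Repeated calls to the same address then share identical operand bytes, which gives the match finder more to work with.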