Lossless GIF recompression via exhaustive search
Lossless GIF recompression via exhaustive search
Exploring the depths of GIFs, flexiGIF, and the mechanics of LZW compression. Date: June 22, 2026 | Author: Arusekk | Read Time: min
A Bit of History
The GIF format stands as the venerable ancestor of widely adopted compressed imagery. While contemporary users primarily associate it with short, looping animations, that specific utility is secondary to my current interests.
Historically, GIF was the sole image format supported by NCSA Mosaic. Consequently, if a developer truly desires absolute compatibility with the "ancient" web, a GIF fallback is mandatory for all essential visuals. I am referring to the era of:
- Netscape
- Internet Explorer (early versions)
- Netsurf
- Dillo
- Konqueror
You can experience these today via oldweb.today. While most developers ignore these browsers, I found it a rewarding challenge to ensure the 1-Click Linux website remained visually acceptable on them, leading me to implement a <picture> element with a GIF fallback.
The Philosophy of Compatibility
My guiding principle for production web features is simple: a feature should be widely compatible on its own, but it must possess exactly one fallback—the one that guarantees .
In the year 2026, the standard toolkit should ideally be limited to:
- SVG (for vectors)
- WebP (lossy for photography, lossless for low-palette graphics)
We should move past PNG, JPEG, and certainly avoid proprietary nightmares like AI, DWG, or the dreaded DICOM.
| Format | Use Case (2026) | Compatibility |
|---|---|---|
| SVG | Logos/Icons | High |
| WebP | General Imagery | High |
| GIF | Absolute Fallback | Universal |
| DICOM | Medical (Proprietary) | Low/Specialized |
Furthermore, legacy hardware (like a Nokia phone with a screen) necessitates small assets. A icon is safe; might exceed the entire viewport.
The Art of Image Optimization
Optimization is a layered process. Most people start with the basics:
- Strip all metadata
- Eliminate unused palette colors
- Prune rarely used palette colors
- Optimize the actual compression stream The overlooked step
DEFLATE and the Zopfli Approach
PNG utilizes DEFLATE (the engine behind .zip and .gz). A key characteristic of DEFLATE is that a single set of uncompressed data can be represented by multiple valid compressed streams.
While the standard DEFLATE algorithm is efficient, it is not necessarily optimal. There exist syntactically correct streams that the standard algorithm would never generate—some of which are smaller than the "maximum" compression setting.
This is where Zopfli comes in. It performs an exhaustive search across all valid streams to find the absolute minimum bit-count. ZopfliPNG extends this logic to PNG's specific pixel encodings.
Note on Computability: Finding the absolute best compression for any arbitrary format could theoretically be as difficult as solving the Halting Problem: Fortunately, the formats we use have invariants that ensure the search eventually ends.
Enter flexiGIF and LZW
While there is no "ZopfliGIF," we have flexiGIF, which pursues a similar goal. It is a fantastic tool, though its README contains a curious admission:
"It can leave files larger than the original algorithm."
Since it was written by a human, this is understandable. To understand why, we have to look at LZW (Lempel-Ziv-Welch).
Understanding LZW
Older papers describe LZW by focusing on the algorithm. However, for those of us focused on interoperability, the data format is what matters. If we keep the format but improve the encoder, the decoder remains happy.
In simple terms, an LZW stream is a sequence of actions:
Yield this byteRepeat a previous action, then add the first byte of the following action
(The specific case of repeating the most recent action is referred to as KwKwK in the literature. GIF also includes "clear state" and "end of data" markers, but those are trivial here.)
The standard LZW algorithm is greedy: it always picks the longest previous match. This is logical because:
- Local optimization often leads to global optimization.
- It consistently expands the vocabulary.
Greedy vs. Non-Greedy: An Example
Consider this input stream:
a b a b a b a a b a a b a a a b
The Greedy Path (Standard)
1. a
2. b
3. a-b (Match 'ab')
4. a-b-a (Match 'aba')
5. a-b-a-a (Match 'abaa')
6. b-a (Match 'ba')
7. a-b-a-a (Match 'abaa')
8. a-b (Match 'ab')
The Non-Greedy Path (Optimized)
By intentionally choosing a shorter match now, we can create a more useful "word" for later.
In the non-greedy example, the encoder might choose a shorter sequence at step 6, which allows a much larger sequence to be captured at step 7, ultimately resulting in one less action overall.
This proves that the greedy approach does not always reach the furthest—or smallest—possible representation.