Lossless GIF recompression via exhaustive search

Exploring the depths of GIFs, flexiGIF, and the mechanics of LZW compression. Date: June 22, 2026 | Author: Arusekk | Read Time: $\approx 8$ min

A Bit of History

The GIF format stands as the venerable ancestor of widely adopted compressed imagery. While contemporary users primarily associate it with short, looping animations, that specific utility is secondary to my current interests.

Historically, GIF was the sole image format supported by NCSA Mosaic. Consequently, if a developer truly desires absolute compatibility with the "ancient" web, a GIF fallback is mandatory for all essential visuals. I am referring to the era of:

Netscape
Internet Explorer (early versions)
Netsurf
Dillo
Konqueror

You can experience these today via oldweb.today. While most developers ignore these browsers, I found it a rewarding challenge to ensure the 1-Click Linux website remained visually acceptable on them, leading me to implement a <picture> element with a GIF fallback.

The Philosophy of Compatibility

My guiding principle for production web features is simple: a feature should be widely compatible on its own, but it must possess exactly one fallback—the one that guarantees $\text{compatibility} = 100\%$ .

In the year 2026, the standard toolkit should ideally be limited to:

SVG (for vectors)
WebP (lossy for photography, lossless for low-palette graphics)

We should move past ~~PNG~~, ~~JPEG~~, and certainly avoid proprietary nightmares like AI, DWG, or the dreaded DICOM.

Format	Use Case (2026)	Compatibility
SVG	Logos/Icons	High
WebP	General Imagery	High
GIF	Absolute Fallback	Universal
DICOM	Medical (Proprietary)	Low/Specialized

Furthermore, legacy hardware (like a Nokia phone with a $240 \times 320$ screen) necessitates small assets. A $128 \times 128$ icon is safe; $256 \times 256$ might exceed the entire viewport.

The Art of Image Optimization

Optimization is a layered process. Most people start with the basics:

Strip all metadata
Eliminate unused palette colors
Prune rarely used palette colors
Optimize the actual compression stream $\leftarrow$ The overlooked step

DEFLATE and the Zopfli Approach

PNG utilizes DEFLATE (the engine behind .zip and .gz). A key characteristic of DEFLATE is that a single set of uncompressed data can be represented by multiple valid compressed streams.

While the standard DEFLATE algorithm is efficient, it is not necessarily optimal. There exist syntactically correct streams that the standard algorithm would never generate—some of which are smaller than the "maximum" compression setting.

This is where Zopfli comes in. It performs an exhaustive search across all valid streams to find the absolute minimum bit-count. ZopfliPNG extends this logic to PNG's specific pixel encodings.

Note on Computability: Finding the absolute best compression for any arbitrary format could theoretically be as difficult as solving the Halting Problem: $\text{If } f(x) \text{ is the compression function, finding } \min(\text{size}(f(x))) \text{ may not terminate.}$ Fortunately, the formats we use have invariants that ensure the search eventually ends.

Enter flexiGIF and LZW

While there is no "ZopfliGIF," we have flexiGIF, which pursues a similar goal. It is a fantastic tool, though its README contains a curious admission:

"It can leave files larger than the original algorithm."

Since it was written by a human, this is understandable. To understand why, we have to look at LZW (Lempel-Ziv-Welch).

Understanding LZW

Older papers describe LZW by focusing on the algorithm. However, for those of us focused on interoperability, the data format is what matters. If we keep the format but improve the encoder, the decoder remains happy.

In simple terms, an LZW stream is a sequence of actions:

Yield this byte
Repeat a previous action, then add the first byte of the following action

(The specific case of repeating the most recent action is referred to as KwKwK in the literature. GIF also includes "clear state" and "end of data" markers, but those are trivial here.)

The standard LZW algorithm is greedy: it always picks the longest previous match. This is logical because:

Local optimization often leads to global optimization.
It consistently expands the vocabulary.

Greedy vs. Non-Greedy: An Example

Consider this input stream: a b a b a b a a b a a b a a a b

The Greedy Path (Standard)

1. a
2. b
3. a-b (Match 'ab')
4. a-b-a (Match 'aba')
5. a-b-a-a (Match 'abaa')
6. b-a (Match 'ba')
7. a-b-a-a (Match 'abaa')
8. a-b (Match 'ab')

The Non-Greedy Path (Optimized)

By intentionally choosing a shorter match now, we can create a more useful "word" for later.

In the non-greedy example, the encoder might choose a shorter sequence at step 6, which allows a much larger sequence to be captured at step 7, ultimately resulting in one less action overall.

This proves that the greedy approach does not always reach the furthest—or smallest—possible representation.