← Back to news

Image Compression

makingsoftware.com|89 points|12 comments|by vinhnx|Jun 14, 2026

Understanding Image Compression

Image compression is the fundamental process of reducing the file size of a digital image. By decreasing the amount of data required to represent an image, we can optimize storage space and accelerate transmission speeds across networks.

Definition: Image compression is the art and science of encoding image data in a way that allows the original (or a visually similar version) to be reconstructed using fewer bits.


1. The Two Primary Paradigms

Compression techniques are broadly categorized into two distinct methodologies: Lossless and Lossy.

A. Lossless Compression

Lossless compression ensures that the original data is reconstructed perfectly bit-for-bit. No information is discarded during the process.

  • Best for: Text-heavy images, medical imaging, and professional photography (RAW).
  • Common Formats: PNG, GIF, TIFF.
  • Mechanism: It identifies and eliminates statistical redundancy.

B. Lossy Compression

Lossy compression achieves much higher compression ratios by permanently discarding "unnecessary" or less perceptible information.

  • Best for: Web graphics, social media, and general photography.
  • Common Formats: JPEG, WebP.
  • Mechanism: It exploits the limitations of human vision (psycho-visual redundancy).

2. Technical Deep Dive: How it Works

Lossless Algorithms

Lossless methods often use a combination of the following:

  1. Run-Length Encoding (RLE): Replaces sequences of identical pixels with a single value and a count.
    • Example: AAAAABBBCC \rightarrow 5A3B2C
  2. Huffman Coding: Assigns shorter binary codes to frequently occurring pixel values and longer codes to rare ones.
  3. LZW (Lempel-Ziv-Welch): Builds a dictionary of recurring patterns.

Example: Simple RLE Implementation

def compress_rle(data):
    encoding = ""
    i = 0
    while i < len(data):
        count = 1
        while i + 1 < len(data) and data[i] == data[i+1]:
            i += 1
            count += 1
        encoding += str(count) + data[i]
        i += 1
    return encoding

# Input: "WWWWWBWWWW" -> Output: "5W1B4W"

Lossy Algorithms (The JPEG Pipeline)

Lossy compression, specifically in JPEG, follows a complex mathematical pipeline:

  1. Color Space Transformation: Converting RGB to YCbCr\text{YCbCr} (Luminance and Chrominance).
  2. Chroma Subsampling: Reducing the resolution of color information since humans are more sensitive to brightness than color.
  3. Discrete Cosine Transform (DCT): Converting spatial data into frequency data.
  4. Quantization: The "lossy" step where high-frequency coefficients are divided by a constant and rounded to the nearest integer.
  5. Entropy Coding: Final compression using Huffman or Arithmetic coding.

The Mathematics of DCT

The 2D Discrete Cosine Transform is represented by the formula: F(u,v)=14C(u)C(v)x=07y=07f(x,y)cos[(2x+1)uπ16]cos[(2y+1)vπ16]F(u, v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos \left[ \frac{(2x+1)u\pi}{16} \right] \cos \left[ \frac{(2y+1)v\pi}{16} \right] Where f(x,y)f(x,y) is the pixel intensity and F(u,v)F(u,v) is the frequency coefficient.


3. Visualizing the Workflow

The following diagram illustrates the standard lossy compression pipeline:


4. Comparative Analysis

FeatureLossless CompressionLossy Compression
Data Integrity100% PreservedPartially Discarded
File SizeRelatively LargeSignificantly Small
Generation LossNoneIncreases with every save
Primary UseArchiving / EditingWeb / Distribution
ComplexityLowerHigher

5. Practical Considerations

When choosing a compression method, consider the following checklist:

  • Does the image contain sharp edges or text? \rightarrow Use Lossless
  • Is the image a complex photograph? \rightarrow Use Lossy
  • Is bandwidth a critical constraint? \rightarrow Use Lossy
  • Is the image intended for medical diagnosis? \rightarrow Use Lossless

Common Misconceptions

Higher compression always means lower quality. Actually, modern algorithms like WebP or HEIF can achieve higher compression than JPEG while maintaining superior visual quality.

Compression Comparison Figure 1: A conceptual representation of how artifacts appear as compression ratios increase.

Summary

Image compression is a balance between fidelity and efficiency. While lossless methods prioritize the former, lossy methods prioritize the latter. By understanding the underlying math—from RLE\text{RLE} to DCT\text{DCT}—developers and designers can choose the optimal format for their specific use case.