← Back to news

How TimescaleDB compresses time-series data

roszigit.com|132 points|15 comments|by lkanwoqwp|Jun 15, 2026

Understanding TimescaleDB Data Compression

Key Metric: TimescaleDB is capable of reducing the storage footprint of typical time-series datasets by as much as 98%.

Standard OLTP databases utilize general-purpose compression, but time-series data requires a specialized strategy. TimescaleDB achieves its massive efficiency via the hypercore engine—a hybrid system that blends row-based and columnar storage using advanced algorithms like Gorilla XOR, delta-of-delta, delta encoding, and run-length encoding (RLE).


TimescaleDB vs. PostgreSQL TOAST

It is important to distinguish between TimescaleDB's compression and the native PostgreSQL TOAST (The Oversized-Attribute Storage Technique). While they both reduce size, they target different problems.

  • TOAST is designed for "vertical" bloat: individual fields that are too large (e.g., a massive jsonb blob) to fit in a standard 8 kB page.
  • Hypercore is designed for "horizontal" bloat: repeating patterns across millions of rows.

These systems are complementary; in fact, TimescaleDB may use TOAST as a fallback for specific data types.

Comparison Matrix: TOAST vs. Hypercore

FeatureTOAST (Vanilla Postgres)TimescaleDB Hypercore
Primary GoalHandle values >2 KB> 2\text{ KB}Optimize time-series patterns
TriggerValue exceeds TOAST_TUPLE_THRESHOLDPolicy-based (e.g., data >7 days old> 7\text{ days old})
ScopeVariable-length types (text, bytea)All data types
Algorithmspglz, lz4Delta, Delta-of-Delta, RLE, XOR, Dictionary
GranularityPer individual valuePer batch (1000\approx 1000 rows)
Data AwarenessOpaque byte streamAware of monotonicity and numeric structure
Float Ratio1.0×\approx 1.0\times (None)1020×10\text{--}20\times
Timestamp Ratio1.0×\approx 1.0\times (None)50100×50\text{--}100\times
Text Ratio23×2\text{--}3\times510×5\text{--}10\times

The Hypercore Engine: A Hybrid Approach

The hypercore engine manages a transition of data states to balance write speed with read efficiency.

The Transformation Process

When a chunk is compressed, the engine reorganizes the data:

  • Rows are grouped into batches of up to 1,000.
  • Each batch is converted into a single row in a compressed table.
  • Columns are transformed into compressed arrays.

This column-major format allows the database to fetch only the specific fields required for a query, rather than scanning entire rows of irrelevant data.


Deep Dive: Compression Algorithms

1. Delta Encoding

Instead of storing the full value of every data point, delta encoding stores the difference between the current value and the previous one.

Mathematical Representation: Δ=xnxn1\Delta = x_n - x_{n-1}

Example: Imagine a sensor recording temperature. Instead of storing 72.5, 72.7, 72.4, the system stores:

TimeMachineTypeValue\rightarrowCompressed Value
12:00:00M_001temp72.5\rightarrow72.5 (Base)
12:00:05M_001temp72.7\rightarrow+0.2
12:00:10M_001temp72.4\rightarrow-0.3

2. Delta-of-Delta Encoding

For timestamps that occur at regular intervals (e.g., every 5 seconds), even the delta is repetitive. Delta-of-delta stores the change in the change.

Mathematical Representation: Δ2=ΔnΔn1\Delta^2 = \Delta_n - \Delta_{n-1}

If the interval is always 5 seconds, the Δ\Delta is always 5, and the Δ2\Delta^2 is 0. Storing a zero requires significantly fewer bits than storing a timestamp.

3. Run-Length Encoding (RLE)

RLE is ideal for columns with low cardinality or high repetition, such as device_id or status. Instead of repeating the same string 1,000 times, it stores the value once and a count of its repetitions.

Data Transformation: MACHINE_001, MACHINE_001, MACHINE_001, MACHINE_001 \rightarrow (MACHINE_001, 4)

-- Conceptual representation of a compressed batch
SELECT 
    ARRAY['12:00:00', '12:00:05', '12:00:10'] as time,
    ARRAY['MACHINE_001'::text, 3] as machine_id, -- RLE: Value + Count
    ARRAY[72.5, 0.2, -0.3] as temp_delta;       -- Delta Encoding

By combining these techniques, TimescaleDB transforms bulky, row-oriented time-series data into a lean, columnar format that slashes storage costs and accelerates analytical performance.