TigerData: how do I set up compression and retention policies for telemetry tables?
Time-Series Databases

Most telemetry workloads on TigerData follow the same playbook: ingest fast, keep “hot” data fully queryable, compress older data aggressively, and eventually drop what you don’t need. Compression and retention policies are how you encode that lifecycle directly into PostgreSQL/Timescale so it runs hands‑off in production.

This guide walks through how to set up compression and retention for telemetry tables on Tiger Cloud (TimescaleDB under the hood), with concrete SQL you can paste into psql or your migration scripts.


The Quick Overview

  • What It Is: Compression and retention policies are TimescaleDB background jobs that automatically compress older telemetry chunks and delete data past your chosen retention window.
  • Who It Is For: Teams running high-ingest time-series/telemetry tables (sensor data, events, metrics, ticks) on TigerData who want predictable performance and storage costs.
  • Core Problem Solved: Plain Postgres tables grow without bound, making queries slow and storage expensive. Compression and retention keep hypertables fast and lean over time—without manual cleanup jobs or cron scripts.

How It Works (High Level)

On TigerData, telemetry data should live in hypertables, not plain tables. TimescaleDB then:

  1. Partitions data into time-based chunks (and optionally by device/customer key).
  2. Compresses old chunks into columnar storage once they age past a threshold.
  3. Drops very old chunks according to a retention policy.

All of this is handled by background workers triggered by policies you configure via SQL.

Workflow:

  1. Create/verify a hypertable for your telemetry table.
  2. Enable compression and define which columns stay uncompressed.
  3. Add a compression policy with a time threshold (e.g., compress data older than 7 days).
  4. Add a retention policy with a time threshold (e.g., drop data older than 90 days).
  5. Monitor with compression stats and adjust thresholds as your traffic and costs evolve.

Step 1: Make Sure You’re Using a Hypertable

Compression and retention policies operate on hypertables, not regular tables.

If you’re starting from scratch:

CREATE TABLE telemetry_raw (
    time        TIMESTAMPTZ       NOT NULL,
    device_id   TEXT              NOT NULL,
    reading     DOUBLE PRECISION  NOT NULL,
    status      TEXT,
    metadata    JSONB,
    PRIMARY KEY (time, device_id)
);

SELECT create_hypertable('telemetry_raw', 'time');

If the table already exists, check:

SELECT hypertable_name
FROM timescaledb_information.hypertables
WHERE hypertable_name = 'telemetry_raw';

If no row is returned, convert it:

SELECT create_hypertable('telemetry_raw', 'time', migrate_data => true);

Important: For high-ingest telemetry, always partition on the timestamp column your queries filter on (e.g., time). If you have many independent streams, you can also add a space dimension on a key like device_id or tenant_id, though time-only partitioning is the right default for most workloads.


Step 2: Enable Compression on the Hypertable

Compression uses columnar storage for older chunks while keeping recent chunks in row format for fast writes. You must explicitly enable it:

ALTER TABLE telemetry_raw
SET (
  timescaledb.compress           = true,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby   = 'time DESC'
);

Choose “segment by” and “order by” Columns

To get good compression ratios and scan speed, the statement above configures:

  • compress_segmentby: columns that identify a series or group (e.g., device_id, tenant_id). Rows sharing these values are compressed together.
  • compress_orderby: how rows are ordered inside the compressed chunk (typically time descending).

Note: You can also list multiple segment columns for multi-tenant setups, e.g. timescaledb.compress_segmentby = 'tenant_id, device_id'.


Step 3: Add a Compression Policy

A compression policy tells TimescaleDB: “When a chunk gets older than X, compress it.”

Example: compress data older than 7 days:

SELECT add_compression_policy(
  hypertable => 'telemetry_raw',
  compress_after => INTERVAL '7 days'
);

Under the hood:

  • Chunks whose data is entirely older than now() - 7 days become eligible for compression.
  • The background job scans for uncompressed chunks older than that threshold and compresses them in batches.
  • New data continues to be written to uncompressed chunks.

Choosing compress_after

Typical telemetry patterns:

  • High QPS dashboards:
    • Keep 1–3 days uncompressed for very low-latency queries.
    • compress_after => INTERVAL '3 days'.
  • Moderate traffic, more historic analysis:
    • compress_after => INTERVAL '7 days' or '14 days'.
  • Primarily batch analytics:
    • Aggressive: compress_after => INTERVAL '1 day'.

Important: Compression trades some write/update flexibility for space and scan speed. Avoid compressing data that you frequently UPDATE or DELETE.
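
Thresholds aren’t set in stone. If your traffic pattern changes, you can swap the window later by removing and re-adding the policy; a sketch, assuming the 7-day policy from above already exists:

-- Replace the 7-day threshold with a 3-day one
SELECT remove_compression_policy('telemetry_raw');
SELECT add_compression_policy('telemetry_raw', INTERVAL '3 days');

Already-compressed chunks stay compressed; only the eligibility window for future runs changes.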


Step 4: Add a Retention Policy

Retention policies automatically drop old chunks. This is your guardrail against unbounded growth.

Example: keep 90 days of telemetry and drop older:

SELECT add_retention_policy(
  relation => 'telemetry_raw',
  drop_after => INTERVAL '90 days'
);

Behavior:

  • Chunks where all rows are older than now() - 90 days become eligible for deletion.
  • A background job drops whole chunks—no row-by-row delete, so it’s efficient.

Choosing drop_after

Consider:

  • Regulatory/compliance needs (e.g., 1 year for audits).
  • Business value (do you query a year back, or just 30/90 days?).
  • Storage budget.

Common patterns:

  • Ops/monitoring metrics: 30–90 days.
  • IoT telemetry with seasonal analysis: 365+ days.
  • Trading/event ticks: maybe 7–30 days at tick level, with aggregated tables retained longer (see “Continuous aggregates” below).

Warning: Retention is destructive. Once chunks are dropped, you cannot recover them unless you restore from backup or PITR. Always align drop_after with your backup/restore window and compliance requirements.
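
Because retention is destructive, it’s worth previewing what a given window would remove before adding the policy. A sketch using show_chunks:

-- List chunks entirely older than 90 days — these are what
-- a drop_after => INTERVAL '90 days' policy would drop
SELECT show_chunks('telemetry_raw', older_than => INTERVAL '90 days');

If the list includes chunks you still query, widen drop_after before enabling the policy.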


Step 5: Combine Compression, Retention, and Continuous Aggregates

For telemetry, a good pattern is:

  1. Raw hypertable (telemetry_raw):

    • Minimal indexes, optimized for ingest.
    • Compression + retention.
  2. Continuous aggregates (telemetry_5m, telemetry_1h):

    • Pre-aggregated rollups with longer retention.
    • Backed by the raw hypertable but much cheaper to query.

Example: 5‑minute rollup per device:

CREATE MATERIALIZED VIEW telemetry_5m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket(INTERVAL '5 minutes', time) AS bucket,
  device_id,
  avg(reading)      AS avg_reading,
  max(reading)      AS max_reading,
  min(reading)      AS min_reading,
  count(*)          AS samples
FROM telemetry_raw
GROUP BY bucket, device_id;

Add a refresh policy:

SELECT add_continuous_aggregate_policy(
  'telemetry_5m',
  start_offset => INTERVAL '7 days',
  end_offset   => INTERVAL '5 minutes',
  schedule_interval => INTERVAL '5 minutes'
);

Retention on the rollup (keep 1 year of 5-minute data):

SELECT add_retention_policy(
  relation   => 'telemetry_5m',
  drop_after => INTERVAL '365 days'
);

Note: A common pattern is short retention on raw, longer retention on aggregates. For example: 30 days raw, 365 days 5-minute rollups.


Operational Considerations on Tiger Cloud

Scheduling and Load

Policies run via TimescaleDB’s job scheduler:

  • Jobs are configured and visible in timescaledb_information.jobs.
  • You can adjust schedule intervals to spread load.

Example: view jobs:

SELECT job_id, proc_name, scheduled, hypertable_name, last_run_status
FROM timescaledb_information.jobs
LEFT JOIN timescaledb_information.job_stats USING (job_id)
ORDER BY job_id;

If you see compression/retention falling behind (e.g., uncompressed chunks older than your compress_after), shorten the policy schedule_interval or increase service compute in Tiger Cloud.
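
The schedule of any policy job can be tuned with alter_job. A sketch, assuming job_id 1002 is your compression policy (look the real ID up in the jobs view first — 1002 here is hypothetical):

-- Run the compression job every 30 minutes instead of its default cadence
SELECT alter_job(1002, schedule_interval => INTERVAL '30 minutes');

Shorter intervals keep the compression backlog small at the cost of more frequent background work; pick a cadence that fits your ingest rate.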

Indexes and Compression

Some index types are not needed (or useful) once data is compressed:

  • TimescaleDB maintains internal index structures for compressed chunks.
  • You can often drop secondary indexes on columns you only query in compressed ranges—test both ways on a staging copy.

Important: Don’t disable primary keys or constraints you depend on. For migrations, it’s fine to temporarily disable some indexes/constraints for bulk loading and rebuild after.

Tiered Storage (Tiger Lake / Tiering)

On Tiger Cloud, compressed chunks can be tiered to lower-cost object storage, depending on your plan and configuration. Compression helps even more here:

  • Compression can reduce storage by up to 98% on telemetry workloads.
  • Retention ensures you’re not paying for data you never query.
  • Tiering lets you keep long history at minimal cost.

Talk to TigerData support or check your Tiger Console to see tiering options available for your service/plan.


Example: End-to-End Policy Setup for a Telemetry Table

Assume:

  • Ingest: millions of rows/day.
  • Dashboard queries: mostly last 48 hours.
  • You occasionally debug up to 30 days back.
  • You keep aggregated stats for 1 year.

SQL outline:

-- 1) Create hypertable (if not already)
CREATE TABLE telemetry_raw (
    time        TIMESTAMPTZ       NOT NULL,
    device_id   TEXT              NOT NULL,
    reading     DOUBLE PRECISION  NOT NULL,
    status      TEXT,
    metadata    JSONB,
    PRIMARY KEY (time, device_id)
);

SELECT create_hypertable('telemetry_raw', 'time');

-- 2) Enable compression with segment/order settings
ALTER TABLE telemetry_raw
SET (
  timescaledb.compress           = true,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby   = 'time DESC'
);

-- 3) Compression policy: compress after 3 days
SELECT add_compression_policy(
  hypertable    => 'telemetry_raw',
  compress_after => INTERVAL '3 days'
);

-- 4) Retention policy on raw: keep 30 days
SELECT add_retention_policy(
  relation   => 'telemetry_raw',
  drop_after => INTERVAL '30 days'
);

-- 5) Continuous aggregate for 5-minute rollups
CREATE MATERIALIZED VIEW telemetry_5m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket(INTERVAL '5 minutes', time) AS bucket,
  device_id,
  avg(reading) AS avg_reading,
  max(reading) AS max_reading,
  min(reading) AS min_reading,
  count(*)     AS samples
FROM telemetry_raw
GROUP BY bucket, device_id;

-- 6) Policy to keep rollup fresh
SELECT add_continuous_aggregate_policy(
  'telemetry_5m',
  start_offset      => INTERVAL '30 days',
  end_offset        => INTERVAL '5 minutes',
  schedule_interval => INTERVAL '5 minutes'
);

-- 7) Retention on aggregate: keep 1 year
SELECT add_retention_policy(
  relation   => 'telemetry_5m',
  drop_after => INTERVAL '365 days'
);

This setup yields:

  • 0–3 days: hot, uncompressed raw data.
  • 3–30 days: compressed raw data (cheap + fast scans).
  • 0–365 days: 5-minute rollups for long-range queries.
  • Older than 365 days: automatically dropped.


Monitoring and Tuning

You can inspect compression coverage and policy behavior with Timescale views.

Check Compression Stats

SELECT *
FROM hypertable_compression_stats('telemetry_raw');

Key things to watch:

  • before_compression_total_bytes vs after_compression_total_bytes: how much storage you’re saving — the ratio of the two is your effective compression ratio.
  • number_compressed_chunks vs total_chunks: are older chunks actually being compressed?

Check Hypertable and Chunk State

SELECT hypertable_name, chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'telemetry_raw'
ORDER BY range_start DESC;

If you see is_compressed = false for chunks older than your policy threshold:

  • Ensure the compression policy exists and is not paused.
  • Check job stats for errors.
  • Consider running manual compression for backlogs:
SELECT compress_chunk(format('%I.%I', chunk_schema, chunk_name)::regclass)
FROM timescaledb_information.chunks
WHERE hypertable_name = 'telemetry_raw'
  AND NOT is_compressed
  AND range_end < now() - INTERVAL '3 days';

Warning: Manual compress_chunk calls are fine, but don’t forget to keep the policy enabled so new chunks also get compressed.
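
To see whether policy jobs are actually failing (rather than just lagging), recent TimescaleDB versions expose a job_errors view; a sketch:

-- Most recent policy job failures, newest first
SELECT job_id, proc_name, finish_time, sqlerrcode, err_message
FROM timescaledb_information.job_errors
ORDER BY finish_time DESC
LIMIT 20;

An empty result with a growing uncompressed backlog usually points at scheduling or compute headroom rather than job errors.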


Limitations and Edge Cases

  • Frequent updates on compressed data:
    Updates on compressed chunks are expensive; Timescale will decompress/recompress under the hood. If you must update frequently, keep that data in the uncompressed range by increasing compress_after.
  • Retention and foreign keys:
    If other tables reference telemetry rows, chunk drops can fail. Consider using logical delete flags or retaining only reference keys in related tables.
  • Backfill data:
    Backfilled rows whose timestamps fall into already-compressed ranges force decompression (or, on recent versions, slower DML into compressed chunks). Plan backfill windows carefully or temporarily adjust policies (e.g., increase compress_after) while backfilling.
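
For a large backfill, one option is to pause the compression job for the duration via alter_job (job_id 1002 is hypothetical — look yours up in the jobs view):

-- Pause the compression policy before backfilling...
SELECT alter_job(1002, scheduled => false);

-- ...run the backfill, then resume the policy
SELECT alter_job(1002, scheduled => true);

The next scheduled run will then compress the backfilled chunks in one pass instead of fighting with in-flight writes.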

Summary

For telemetry tables on TigerData, the operational pattern is:

  • Convert your telemetry tables into hypertables.
  • Enable compression and choose sensible segmentby and orderby keys.
  • Add a compression policy to automatically compress older data.
  • Add a retention policy to drop data that’s past its useful life.
  • Use continuous aggregates to retain long-range rollups while keeping raw data history short and cheap.
  • Monitor with Timescale’s compression_stats, chunks, and jobs views and adjust thresholds as your workload evolves.

Done right, you get:

  • High ingest rates with predictable, low query latency.
  • Up to 98% compression on telemetry workloads.
  • Storage and cost that grow with business value, not with raw data volume.

If you want help designing policies tuned to your ingest rates and query patterns, you can talk directly with the TigerData team.
