Teradata Columnar Compression Methods: Run-Length, Dictionary, and Delta Compression

Teradata Columnar compression provides two primary advantages: it reduces permanent storage usage, and it reduces the number of disk I/O operations.

Compressing by column typically works better than compressing by row because the values within a single column vary little. Teradata Columnar employs several compression methods that exploit this property.

Teradata Columnar & Run-Length Encoding

This compression technique stores each column value once, together with the range of consecutive rows that hold the same value. For instance, if a date column contains ‘2015-10-29’ in rows 100-200, run-length encoding records this information as follows:

‘2015-10-29’;100-200
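
The idea can be illustrated with a minimal Python sketch. It is not Teradata's internal format; the function names rle_encode and rle_decode and the (value, first row, last row) tuple layout are assumptions chosen to mirror the notation above.

```python
from itertools import groupby

def rle_encode(values, first_row=1):
    """Collapse consecutive equal values into (value, start_row, end_row) runs."""
    runs = []
    row = first_row
    for value, group in groupby(values):
        length = sum(1 for _ in group)  # number of rows in this run
        runs.append((value, row, row + length - 1))
        row += length
    return runs

def rle_decode(runs):
    """Expand the runs back into one value per row."""
    values = []
    for value, start_row, end_row in runs:
        values.extend([value] * (end_row - start_row + 1))
    return values

# Hypothetical column: rows 1-99, rows 100-200, rows 201-203
column = ["2015-10-28"] * 99 + ["2015-10-29"] * 101 + ["2015-10-30"] * 3
runs = rle_encode(column)
for value, start_row, end_row in runs:
    print(f"'{value}';{start_row}-{end_row}")
```

In this sketch, 203 row values shrink to three runs. The benefit disappears when values rarely repeat in consecutive rows, since every run of length one still needs its own entry.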

Teradata Columnar & Dictionary Encoding

If column values are not repeated consecutively, dictionary encoding can be used for compression. Each unique value is stored once in a dictionary and assigned a short code, and the column stores the codes in place of the original values. The entries in the dictionary are of a fixed length, making navigation simple. In Teradata Columnar, each container has its own dictionary.

For example, the customer segments “Business” and “Private” can be saved as dictionary entries 1 and 2:

1, “Business”,2, “Private”

The column values are then stored as a sequence of 1s and 2s, for example, 1, 2, 1, 2, 2, 2, 1, 1, 2, 1, and so on.

This mapping reduces disk space usage because each long column value is replaced by a much shorter code.
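
The mapping can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names dict_encode and dict_decode are hypothetical, and codes are simply assigned in order of first appearance, which is not necessarily how Teradata numbers its dictionary entries.

```python
def dict_encode(values):
    """Return (dictionary, codes); codes are assigned in order of first appearance."""
    dictionary = {}  # original value -> small integer code
    codes = []
    for value in values:
        # setdefault keeps an existing code or assigns the next free one
        code = dictionary.setdefault(value, len(dictionary) + 1)
        codes.append(code)
    return dictionary, codes

def dict_decode(dictionary, codes):
    """Map the codes back to the original values."""
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[code] for code in codes]

column = ["Business", "Private", "Business", "Private", "Private",
          "Private", "Business", "Business", "Private", "Business"]
dictionary, codes = dict_encode(column)
print(dictionary)
print(codes)
```

Each row now holds a small integer instead of an 7- or 8-character string, while the strings themselves are stored only once per dictionary.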

Teradata Columnar & Delta Compression

Teradata Columnar utilizes delta compression, a more sophisticated compression technique than the straightforward run-length and dictionary encoding methods.

If the column values in a container fall within a narrow range, only each value's deviation from the container's average value is stored. Consider the following column values:

10,20,50,100,20,10

Delta compression stores each value as its offset from the average, which is 35 for these values:

-25,-15,+15,+65,-15,-25

At first glance, replacing the original values with other numbers does not seem to save any space. The advantage lies in the data type. If the column container stores BIGINT values (8 bytes each), the offsets can be saved as SMALLINT values (2 bytes each), which frees up 6 bytes per row.
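
The arithmetic above can be checked with a short Python sketch. It is a simplified model, not Teradata's implementation: the function names are hypothetical, the base is taken as the integer average of the container, and the byte sizes come from Python's struct module (8 bytes for a 64-bit integer, 2 bytes for a 16-bit integer).

```python
import struct

def delta_encode(values):
    """Return (base, offsets), using the integer average of the values as base."""
    base = sum(values) // len(values)
    offsets = [value - base for value in values]
    return base, offsets

def delta_decode(base, offsets):
    """Rebuild the original values from the base and the offsets."""
    return [base + offset for offset in offsets]

column = [10, 20, 50, 100, 20, 10]
base, offsets = delta_encode(column)

# The offsets only pay off if every one of them fits the smaller type.
SMALLINT_MIN, SMALLINT_MAX = -32768, 32767
fits_smallint = all(SMALLINT_MIN <= o <= SMALLINT_MAX for o in offsets)

bigint_bytes = struct.calcsize("<q")    # 64-bit integer
smallint_bytes = struct.calcsize("<h")  # 16-bit integer
saved_per_row = bigint_bytes - smallint_bytes

print(base, offsets, fits_smallint, saved_per_row)
```

The base value has to be stored once per container, so the net saving is slightly below 6 bytes per row, and it only materializes when all offsets fit into the smaller type.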

Conclusion

Teradata Columnar’s compression methods can compress considerably better than the multivalue compression (MVC) typically used for row-based tables, because they exploit the limited variance of values within each column.

Teradata chooses the compression algorithm itself, although this choice can be overridden. Compression techniques can differ between the columns of a table and between the containers of a single column. Several techniques can even be combined within one column.
