Teradata Multivalue Compression and Presence Bytes for NULL Values

Multivalue compression and NULL indication serve different purposes, but both need to record per-row information about column values. To save space, Teradata stores both kinds of information in the same structure: the presence bytes.

Readers familiar with earlier releases will recognize that the name “presence byte” originally referred only to NULL values. This naming convention was introduced when only nullability information was stored, before multivalue compression existed.

Every row has at least one presence byte, and more are added as needed. Only 7 bits of the first presence byte are available for nullability or multivalue compression, because the system reserves the first bit, which is always set to 1.

How Does Teradata Store Information About NULL Values?

One bit per row records whether each nullable column holds a NULL value. The nullability bits are always stored before the bits used by the multivalue compression feature (the bits are allocated from right to left). Columns defined as NOT NULL do not need a nullability bit.

The logic of indicating NULL values is simple:

  • If the presence bit is set to 1, a NULL value is present in the column.
  • If the presence bit is set to 0, any other value is present in the column.
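The rule above can be sketched in a few lines of Python. This is only an illustration, not Teradata's internal implementation: the function name `null_flags` is hypothetical, and the sketch assumes that the system bit is the rightmost bit of the first presence byte and that nullable columns take the following bits from right to left, in the order they are passed in.

```python
def null_flags(presence_byte, nullable_columns):
    """Return {column_name: is_null} decoded from the first presence byte.

    Assumptions (illustrative only): bit 0, the rightmost bit, is the
    system bit; nullable columns occupy bits 1, 2, ... from right to left.
    """
    if len(nullable_columns) > 7:
        raise ValueError("the first presence byte has room for 7 columns only")
    if (presence_byte & 1) != 1:
        raise ValueError("the system bit is expected to be 1")
    flags = {}
    for position, name in enumerate(nullable_columns, start=1):
        # A set bit (1) means NULL; a cleared bit (0) means any other value.
        flags[name] = bool((presence_byte >> position) & 1)
    return flags


print(null_flags(0b00000011, ["TheCol"]))  # TheCol is NULL
print(null_flags(0b00000001, ["TheCol"]))  # TheCol holds a value
```

Columns defined as NOT NULL would simply not appear in the `nullable_columns` list, since they have no indicator bit.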

The example table below contains eight rows. Because TheCol is nullable, an indicator bit marks the rows in which TheCol is NULL (as mentioned earlier, the first bit is reserved by the system and always set to 1):

[Figure Compress3: example table with eight rows and the presence bit that marks NULL values in TheCol]

As noted above, only 7 bits of the first presence byte are available, so the always-present presence byte covers up to seven nullable columns. Each additional presence byte holds the NULL indicators for 8 more nullable columns.

Increasing the number of nullable columns from 7 to 8 adds an extra presence byte, so each row consumes one more byte of permanent disk space.

Keep the following limits in mind when designing your tables:

  • Up to 7 nullable columns -> the existing presence byte is used (1 in total)
  • Up to 15 nullable columns -> 1 presence byte is added (2 in total)
  • Up to 23 nullable columns -> 2 presence bytes are added (3 in total)
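The limits listed above follow from simple arithmetic: one system bit plus one bit per nullable column, rounded up to whole bytes. A minimal sketch, assuming no multivalue compression is in use (the function name `presence_bytes_for_nullable` is made up for this illustration):

```python
def presence_bytes_for_nullable(nullable_columns):
    """Presence bytes per row when only nullability bits are stored."""
    if nullable_columns < 0:
        raise ValueError("the number of nullable columns cannot be negative")
    bits_needed = 1 + nullable_columns  # 1 system bit + 1 bit per nullable column
    return (bits_needed + 7) // 8       # round up to whole bytes


for n in (7, 8, 15, 16, 23, 24):
    print(n, "nullable columns ->", presence_bytes_for_nullable(n), "presence byte(s)")
```

The jumps at 8, 16, and 24 nullable columns are the points where each row grows by one byte.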

To minimize table size, define columns as NOT NULL wherever possible. Bits in a presence byte that are not needed for nullability can be used by the multivalue compression feature.

Teradata Multivalue Compression and the Presence Bytes

Presence bytes are also used in multivalue compression. For each column, up to 255 different values can be compressed.

The original column values are stored once in the table header, and each row holds only the binary number that identifies its column value. Compressing 255 different values requires 8 bits (a full presence byte).

The algorithm will only occupy as many bits as needed to encode the original column value:

Different Values    Bits needed
1                   1
3                   2
7                   3
15                  4
31                  5
63                  6
127                 7
255                 8
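The table above matches the rule "bits needed = ceil(log2(n + 1))" for n compressed values. The sketch below reproduces it in Python. It rests on an assumption that is consistent with the table but not spelled out in the text: one bit pattern (all zeros) is set aside for values that are not compressed, which is why 3 values already need 2 bits. The function name is hypothetical.

```python
def bits_for_compressed_values(value_count):
    """Presence bits needed to encode `value_count` compressed values.

    Assumes the all-zero pattern is reserved for "not compressed", so
    n values need ceil(log2(n + 1)) bits, which equals n.bit_length().
    """
    if not 0 <= value_count <= 255:
        raise ValueError("between 0 and 255 values can be compressed per column")
    return value_count.bit_length()


for n in (1, 3, 7, 15, 31, 63, 127, 255):
    print(n, "values ->", bits_for_compressed_values(n), "bit(s)")
```

Counts between the listed thresholds round up to the next row of the table: 4 compressed values, for example, already need 3 bits.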

Here is another example. The table below has eight rows. If we compress the three column values ‘A’, ‘B’, and ‘C’, we need two presence bits: ‘A’ can be represented by binary 01, ‘B’ by binary 10, and ‘C’ by binary 11 (the presence bits are occupied from right to left):

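[Figure Compress1: example table with eight rows and the two presence bits that encode the compressed values ‘A’, ‘B’, and ‘C’]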
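The code assignment in this example can be mimicked with a small lookup table. This is a sketch for illustration only: `build_code_table` is a hypothetical helper, and handing out codes in list order is an assumption made here so that the result matches the example, not a statement about the order Teradata uses internally.

```python
def build_code_table(compressed_values):
    """Map each compressed value to a fixed-width binary code, starting at 1.

    Code 0 is left unused on the assumption that it marks rows whose
    value is not in the compression list.
    """
    width = len(compressed_values).bit_length()
    return {
        value: format(code, "0{}b".format(width))
        for code, value in enumerate(compressed_values, start=1)
    }


codes = build_code_table(["A", "B", "C"])
print(codes)  # {'A': '01', 'B': '10', 'C': '11'}
```

In this picture, the dictionary plays the role of the value list kept in the table header, while each row stores only the short code.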

We extend the example by also making TheCol nullable. One additional presence bit is now needed to mark the rows in which TheCol is NULL. The compression codes are unchanged; they are just shifted one bit to the left, because nullability bits are always stored first:

[Figure Compress2: the same example table with TheCol nullable, showing the nullability bit followed by the two compression bits]

When we compress at least one value of TheCol, NULL values are compressed automatically at no extra cost: the column’s nullability bit already represents NULL, so nothing needs to be encoded or stored in the table header.

Sharing the same presence bytes between multivalue compression and nullability saves space. When designing tables, however, consider the combined cost of compression bits and nullability bits, and how adding one more compressed value or nullable column may affect table size.
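A rough way to estimate this combined cost is to add up all presence bits and round up to whole bytes. The sketch below does that under stated assumptions: the bits are packed back to back without padding, the all-zero compression code is reserved for uncompressed values, and the function names are invented for this illustration. Treat the result as an estimate, not as an exact description of the row layout.

```python
def presence_bits(nullable_columns, compressed_value_counts):
    """1 system bit + 1 bit per nullable column + code bits per compressed column."""
    compression_bits = sum(count.bit_length() for count in compressed_value_counts)
    return 1 + nullable_columns + compression_bits


def presence_bytes(nullable_columns, compressed_value_counts):
    """Estimated presence bytes per row (bits rounded up to whole bytes)."""
    return (presence_bits(nullable_columns, compressed_value_counts) + 7) // 8


# 4 nullable columns and one column with 7 compressed values:
# 1 + 4 + 3 = 8 bits, which fits into the single presence byte.
print(presence_bytes(4, [7]))

# Compressing an 8th value needs a 4th code bit: 1 + 4 + 4 = 9 bits,
# so every row grows by one presence byte.
print(presence_bytes(4, [8]))
```

The second call shows the effect described above: a single extra compressed value can push the total over a byte boundary and cost one byte in every row.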

2 thoughts on “Teradata Multivalue Compression and Presence Bytes for NULL Values”

  1. Jayant, the rightmost bit is used by the system and is always set to 1, as you can see in all the other examples.

  2. Good morning Sir,

    Very nice explanation of multivalue compression here. kudos to you. I am a beginner at Teradata, but I like to learn new things.

    I have one question about the multivalue compression example (the one where the column is not nullable). The value ‘A’ is represented by 01 in the first row of the table. Why does the presence byte look like 00001011? What does the fourth ‘1’ (counting from right to left) signify?

    Your response is greatly appreciated.
