Teradata Multivalue And Blocklevel Compression Together? | DWHPRO

Teradata Multivalue And Blocklevel Compression Together?

By way of introduction, I would like to mention that Teradata block-level compression is often performed by default nowadays. Therefore the considerations given in this blog post are only useful on Teradata systems where this is not the case.

Since Teradata Block Level Compression has been around, I often see that multi-value compression is not used.

First I would like to point out that Teradata MVC and BLC typically have different use cases.

Teradata MVC is a tool to improve the performance of your Teradata SQL queries and save mass storage space.

BLC is there to save space, but it is not intended to be a performance tuning tool, because unpacking the data requires CPU seconds. Therefore the cold data is usually compressed with BLC (i.e. the data that is not used frequently).

Here is an example that shows how Teradata Multivalue Compression can be used together with BLC.

Comparison Of Combinations Of BLC and MVC

We will look at all four possible combinations of compression and then see what these results mean for us.

For Multivalue Compression we have chosen the Compress list so that all values are “compressed away”:

These are our four test tables:

SET QUERY_BAND = 'BLOCKCOMPRESSION=NO;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_MVC
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100) COMPRESS ('PLATIN','GOLDEN','SILVER','BRONZE‘)
)
PRIMARY INDEX ( CustomerId );
SET QUERY_BAND = 'BLOCKCOMPRESSION=NO;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_NONE
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100)
)
PRIMARY INDEX ( CustomerId );
SET QUERY_BAND = 'BLOCKCOMPRESSION=YES;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_BLC_MVC
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100) COMPRESS ('PLATIN','GOLDEN','SILVER','BRONZE‘) 
)
PRIMARY INDEX ( CustomerId );
SET QUERY_BAND = 'BLOCKCOMPRESSION=YES;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_BLC
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100)
)
PRIMARY INDEX ( CustomerId );

In each of these tables, we add 10 million rows. As mentioned above, it is important that the column CategoryCd only contains the values ‘PLATIN', ‘GOLDEN', ‘SILVER', ‘BRONZE':

Comparing the Disk IOs

In the next figure you can see the Disk IOs required by a full table scan of the tables:

Teradata BLC
IOs required for each combination of compression

What can be derived from this for our Compression Strategy from a Disk IO point of view (Caution: we only look at the IOs)?

  • Not using compression is always bad
  • Block-level compression should be combined with MVC

Please note that in our test scenario the selected MVC was optimized for the existing values. This should anyway be the goal when using MVC.

Comparing the CPU Seconds

Now let's take a look at what the different ways of compressing have an effect on the number of CPU seconds needed:

teradata MVC
Adding BLC to the optimized MVC has no benefit on IOs but increases the used CPU seconds.
  • Not using compression is always bad
  • Block-level compression increases always the CPU seconds needed and should only be used for cold data

Comparing the Compression Rate

  • Not using compression is always bad
  • BLC compression together with MVC compression is almost the same as BLC alone.
  • The optimized MVC us not as efficient as BLC or a combination of BLC and MVC

That's what I deduce:

Since only tables that are rarely queried are block-level compressed, a combination with MVC is not necessarily advantageous as querying them requires additional CPU.

On the other hand, if BLC tables are often queried then one should rather ask the question if it is not better to compress them only with MVC.

Please leave a comment if you have other experiences or a different opinion on this topic.

DWH Pro Admin
 

>