Teradata MVC, BLC, Or Both?

DWH Pro Admin

June 5, 2022

minutes reading time


Introduction to Teradata Compression

A small message at the beginning: Teradata Block Level Compression is now always enabled and cannot be disabled. However, this article is still helpful on the latest Teradata systems with block-level compression. It shows to what extent multi-value compression (Teradata MVC) is still beneficial today.

Since Teradata introduced block-level compression, fewer and fewer Teradata sites use multi-value compression. Leaving out MVC seems to make sense at first sight. Analyzing the best candidates for MVC causes a lot of effort that often cannot be justified in today’s world, where mass storage is getting cheaper and cheaper, but consultancy is expensive.

Therefore, I would like to make one thing clear at the beginning: MVC and BLC have two different application areas.

MVC has two main goals. First, MVC can improve the performance of queries by allowing data to be kept and processed in compressed form in data blocks. Second, the use of MVC naturally reduces the required disk space.

BLC is not a tool for performance tuning. Teradata must decompress data blocks as soon as they are copied into the main memory to make processing possible. So it only helps save space, and therefore the primary application purpose is to use it for cold data (this is only a consideration if it can be disabled on your system).

Here is an example showing how we can use Teradata MVC with BLC.

Interaction of BLC and MVC

In principle, we have four options: We don’t use any compression at all (although, as mentioned, on modern Teradata systems, BLC is always enabled, and this option is therefore not available). We use only MVC (again, the same applies), only BLC, or both in combination. To experience the impact on resource consumption (CPU seconds for packing and unpacking, disk IOs) and space consumption, we have built a small test environment below:

For MVC, the columns or compression were chosen so that practically everything is compressed away.

Below we can see our test setup consisting of four tables:

SET QUERY_BAND = 'BLOCKCOMPRESSION=NO;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_MVC
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100) COMPRESS ('PLATIN','GOLDEN','SILVER','BRONZE‘)
)
PRIMARY INDEX ( CustomerId );
SET QUERY_BAND = 'BLOCKCOMPRESSION=NO;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_NONE
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100)
)
PRIMARY INDEX ( CustomerId );
SET QUERY_BAND = 'BLOCKCOMPRESSION=YES;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_BLC_MVC
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100) COMPRESS ('PLATIN','GOLDEN','SILVER','BRONZE‘) 
)
PRIMARY INDEX ( CustomerId );
SET QUERY_BAND = 'BLOCKCOMPRESSION=YES;' FOR SESSION;
CREATE SET TABLE DWHPRO.Customer_BLC
(
 CustomerId BIGINT NOT NULL,
 CategoryCd CHAR(100)
)
PRIMARY INDEX ( CustomerId );

Let’s get right to the point. We inserted 10 million rows in each table to make the result significant. The column CategoryCd contains only the following four values: ‘PLATIN’, ‘GOLDEN’, ‘SILVER’, ‘BRONZE’.

Comparing the Disk IOs

Note: The disk IOs are caused by a full table scan. No index is used:

Teradata BLC
IOs needed for each combination of compression

If we consider only the disk IOs, then we can conclude the following from the result:

  • Not using compression is always wrong.
  • We should combine block-level compression with MVC

Again, we want to point out that our test optimized the setup for MVC so that practically everything could be compressed. In a real environment, this may not be the case.

Comparing the CPU Seconds

Next, we examine how the different compression methods affect the CPU seconds consumed.

teradata MVC
Adding BLC to the optimized MVC does not benefit IOs but increases the used CPU seconds.

The following picture emerges here:

  • Not using compression is always wrong.
  • Block-level compression increases the CPU seconds needed and should only be used for cold data.

Comparing the Space Usage

Teradata MVC, BLC, Or Both? 1

If we consider only the saving of permanent space, then the following picture emerges:

  • Not using compression is always wrong.
  • MVC and BLC together have about the same space savings as BLC alone.
  • Even the highly optimized MVC setup we chose does not achieve the space savings on its own as BLC alone or a combination of both.

Summary

Even if the results seem straightforward, we should take them with a grain of salt. In the end, we created a scenario that was optimal for MVC. We will often not be able to achieve this in reality. If we assume that BLC is always enabled on most systems, the advantage of additional MVC is limited. It must also be remembered that there are hardly any suitable tools for automated MVC, and the cost of a consultant hour is no longer in proportion to the price of mass storage. I think the times when a Teradata consultant could spend weeks on MVC to save expensive disk space are over.

It looks different here if you examine the whole thing from the CPU side. If we deal with a CPU-bound system, MVC can make sense to bring load off the machine.

The difference in disk IOs is slight if BLC is mandatory enabled, and therefore no argument for MVC.

My conclusion: MVC only if CPU usage is a problem. We can spend our time more wisely than looking for MVC candidates.

https://www.dwhpro.com/teradata-multiset-table/
  • Brilliant article, Roland.
    Thanks for sharing the comparison metrics.
    Whenever I have a doubt, I just hop on to your blog. 🙂

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >