What is Block Level Compression (BLC) and how does it work?

Block Level Compression (BLC) was introduced with Teradata 13.10. It compresses entire data blocks, in contrast to the other compression methods (MVC, ALC), which are applied at column level.

Only the data blocks containing rows are compressed. No compression is done on the table headers, the master index, the cylinder index, or the WAL (write-ahead log). Block level compression is intended mainly for permanent tables, but it can also be switched on for spool space, temporary space, the permanent journal, and the work space (used for sorting operations).

When you apply block level compression to a table, entire data blocks are compressed before they are stored on the hard disks (or any other mass storage device such as an SSD), which reduces the disk space used.

Column level compression has a key advantage: it reduces disk IOs, because the compression is preserved in the spool tables during query processing. Be aware that this does not hold for block level compression, so it has to be stated clearly:

Block Level Compression is not a performance optimization feature.

Whenever a data block is read, the entire data block has to be moved from the disk to the FSG cache, where it is decompressed. Moving data blocks is a costly operation, but even worse, decompressed data blocks (even if cached) cannot be reused by any other session, or even by the same session.

If a session has to use a data block a second time, the data block has to be decompressed again (even if it is already in the FSG cache). Similarly, when a row has to be changed, the data block has to be moved into the FSG cache and decompressed before the change can happen. Once the change is done, the data block has to be compressed again and written back to the disk.
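The read and update cycle described above can be sketched in a few lines. This is only an illustration: Python's zlib (a Lempel-Ziv based compressor) stands in for Teradata's internal compression, the fixed-width row layout is an assumption, and the function names are hypothetical.

```python
import zlib


def read_row(compressed_block: bytes, row_index: int, row_size: int) -> bytes:
    """Return one fixed-width row; the whole block is decompressed on every call."""
    block = zlib.decompress(compressed_block)
    start = row_index * row_size
    if start + row_size > len(block):
        raise IndexError("row_index is outside the block")
    return block[start:start + row_size]


def update_row(compressed_block: bytes, row_index: int,
               new_row: bytes, row_size: int) -> bytes:
    """Decompress the block, change one row, and compress the whole block again."""
    if len(new_row) != row_size:
        raise ValueError("new_row must have exactly row_size bytes")
    block = bytearray(zlib.decompress(compressed_block))
    start = row_index * row_size
    if start + row_size > len(block):
        raise IndexError("row_index is outside the block")
    block[start:start + row_size] = new_row
    return zlib.compress(bytes(block))
```

Note that even a single-row read or change pays for decompressing (and, for a change, recompressing) the entire block, which is where the CPU cost comes from.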

BLC on fallback-protected tables also compresses the data blocks of the fallback subtable. There is no option to compress only one of the two (primary table or fallback subtable).

Block Level Compression is an excellent way to cut disk space usage for tables that are not accessed frequently, but it is costly in CPU time for tables that change frequently. In addition, there is a one-time cost for the initial compression of all data blocks.

Efficiency of Block Level Compression

Individual data blocks within a table will achieve different compression ratios, depending on the data demographics of the rows contained in each data block (Teradata uses the Lempel-Ziv algorithm for compression).

Teradata will not compress data blocks that it considers too small for compression.
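To get a feeling for how much data demographics matter, here is a minimal sketch. It again uses zlib as a stand-in for Teradata's Lempel-Ziv implementation, and the minimum block size of 512 bytes is a made-up value for illustration, not Teradata's actual cutoff.

```python
import zlib
from typing import Tuple

MIN_BLOCK_BYTES = 512  # hypothetical cutoff, not a Teradata setting


def compression_ratio(block: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    if not block:
        raise ValueError("block must not be empty")
    return len(zlib.compress(block)) / len(block)


def compress_block(block: bytes,
                   min_size: int = MIN_BLOCK_BYTES) -> Tuple[bytes, bool]:
    """Return (stored_bytes, was_compressed); small blocks are stored as they are."""
    if len(block) < min_size:
        return block, False
    return zlib.compress(block), True
```

A block full of repeated values (for example a status column with a handful of distinct values) shrinks to a small fraction of its size, while a block of random bytes hardly shrinks at all.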

If you apply column level compression (MVC, ALC) together with BLC, the additional benefit from BLC is naturally smaller than if you used only BLC on the table. Still, BLC will further decrease the disk space used by the table.
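The following sketch illustrates why the extra gain shrinks. It is a strong simplification: a one-byte code per distinct value plays the role of MVC (real MVC works differently internally), zlib plays the role of BLC, and the sample column is invented.

```python
import random
import zlib
from typing import Dict, List


def mvc_like_encode(values: List[bytes]) -> bytes:
    """Replace each distinct value with a one-byte code (crude stand-in for MVC)."""
    codes: Dict[bytes, int] = {}
    out = bytearray()
    for value in values:
        code = codes.setdefault(value, len(codes))
        if code > 255:
            raise ValueError("too many distinct values for one-byte codes")
        out.append(code)
    return bytes(out)


def blc_saving(block: bytes) -> int:
    """Bytes saved by compressing the block (zlib standing in for BLC)."""
    return len(block) - len(zlib.compress(block))


# Invented sample column: 10,000 rows with four distinct, padded status values.
rng = random.Random(0)
statuses = [b"ACTIVE    ", b"INACTIVE  ", b"SUSPENDED ", b"CLOSED    "]
column = [rng.choice(statuses) for _ in range(10_000)]

raw_block = b"".join(column)          # no column level compression
mvc_block = mvc_like_encode(column)   # column level compression applied first
```

Comparing `blc_saving(raw_block)` with `blc_saving(mvc_block)` shows that the block compressor saves far fewer bytes once the column values have already been reduced to short codes, although it still saves something.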

Secondary index subtables (NUSI and USI) will not be block level compressed. Keep this in mind for tables with many secondary indexes, as the space reduction may fall well short of your expectations. Existing join indexes, however, will be compressed.

Teradata 14.00 – Conditional Block Level Compression

As explained above, BLC has to be applied very carefully, especially on CPU-bound systems (compression and decompression are CPU-intensive tasks).

To improve the usability of Block Level Compression, temperature-based BLC was introduced with Teradata 14.00. Temperature-based BLC can be applied to all tables at once or to individual tables.

The task of classifying data by temperature (cold, warm, hot) is carried out by Teradata Virtual Storage (TVS). Data temperature describes how often certain data is accessed. Technically speaking, TVS classifies at cylinder level (not block level), i.e. it keeps a map of cold, warm, and hot cylinders. Cold data is seldom used, while hot data is the most frequently accessed data.

Which data should be compressed is determined by the DBS Control setting TempBLCThresh, which can be set to COLD, WARM, or HOT with the following meaning:

TempBLCThresh = {COLD|WARM|HOT},0..100

Parameter 1 (COLD|WARM|HOT):
COLD: compress only cold data. TVS sorts all permanent disk space cylinders by temperature and considers the 20% least accessed cylinders as cold.
WARM: compress cold and warm blocks
HOT: compress all data blocks (but in this case you should not use temperature based BLC)

Parameter 2 (0..100):

With this value, you define the boundary between compressed and uncompressed data. For example, 30 means that the least accessed 30% of the data should be compressed. With the default cold share of 20%, this covers all cold data plus the coldest part of the warm data, amounting to a further 10% of the total.
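The effect of the percentage parameter can be sketched as follows. This is not how TVS is implemented; it only shows the idea of ranking cylinders by access frequency and cutting off at a percentage. The access counts, the cylinder ids, and the function name are all assumptions made for the example.

```python
from typing import Dict, List


def cylinders_to_compress(access_counts: Dict[int, int],
                          threshold_pct: int) -> List[int]:
    """Return the ids of the least accessed threshold_pct percent of cylinders."""
    if not 0 <= threshold_pct <= 100:
        raise ValueError("threshold_pct must be between 0 and 100")
    # Coldest first; ties are broken by cylinder id to keep the result stable.
    ranked = sorted(access_counts, key=lambda cyl: (access_counts[cyl], cyl))
    cutoff = len(ranked) * threshold_pct // 100
    return ranked[:cutoff]
```

With ten cylinders and a threshold of 30, the three least accessed cylinders are selected: the two that would count as cold under the 20% rule plus the coldest of the warm ones.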

When to use Teradata Block Level Compression (BLC)

Without question, BLC allows for the highest reduction in disk space, but at the same time it is the most costly type of compression in terms of resource usage (CPU). It is your task to find the best balance between disk space and resource usage.

On CPU-bound systems, BLC will not be a good choice. There, you should first try to achieve better compression ratios with Multivalue Compression (MVC) and Algorithmic Compression (ALC).

BLC is a good option on systems with large, seldom accessed tables. Furthermore, you can consider BLC on systems with CPU utilization below 60-70%.

 

Written by Roland Wenzlofsky on October 13, 2014.
