Teradata Compression: Should you bother?

The Teradata DBMS has always supported compression in a (growing) number of forms.

Version 5.0 introduced Multi-Value Compression (MVC), version 13.10 introduced Algorithmic (ALC) and Block Level (BLC) compression, and version 14.0 introduced multiple automatic compression options for Columnar tables.
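
As a quick, hedged illustration of the first two (the table and column names below are invented; the ALC example uses the standard TD_SYSFNLIB transform functions, and BLC is enabled at the system or table level rather than in column DDL):

    CREATE TABLE sales_demo
    ( txn_id      INTEGER NOT NULL
    , region_cd   CHAR(2)
        -- MVC: the listed values (and NULL) are stored once in the table
        -- header rather than once per row.
        COMPRESS ('NE','NW','SE','SW')
    , comment_txt VARCHAR(1000) CHARACTER SET UNICODE
        -- ALC: a compress/decompress function pair applied per value.
        COMPRESS USING TD_SYSFNLIB.TransUnicodeToUTF8
        DECOMPRESS USING TD_SYSFNLIB.TransUTF8ToUnicode
    )
    PRIMARY INDEX (txn_id);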

All of these can produce disk-space savings and associated performance gains; however, implementing compression with MVC or ALC takes analyst time, which has a cost attached to it.

It also needs to be remembered that data demographics change over time, so the analysis behind either of these types of compression needs to be revisited on a regular basis (which may be only once a year).

Yes, there are tools that will help with this, but the point of this article is really to ask whether the time you are spending on compression is worthwhile. Is it giving you benefits that are commensurate with the costs? Like many things, this is a question of “cost vs. benefit”.

On one recent customer engagement I produced the following analysis for them:

  • They have about 98,000 tables.
  • About 26,000 tables have MVC implemented.
  • On their system, only 4,000 tables are big enough that a full table scan is estimated to take longer than 1 second – even allowing for a reduced scan rate because of concurrent queries.
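
As a rough sketch of how you might reproduce a similar size breakdown on your own system, here is a query against the DBC.TableSizeV view (the 100MB cut-off is purely illustrative, not the threshold used in the engagement):

    SELECT  DatabaseName
         ,  TableName
         ,  SUM(CurrentPerm) / (1024 * 1024)  AS size_mb   -- perm space summed across all AMPs
    FROM    DBC.TableSizeV
    GROUP BY 1, 2
    HAVING  SUM(CurrentPerm) / (1024 * 1024) > 100         -- illustrative "big enough to matter" cut-off
    ORDER BY 3 DESC;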

I am not recommending that you remove existing MVC compression from small tables. MVC doesn’t impose any real CPU overhead, and removing compression from a lot of tables will be a lot of effort for negligible gain, if any. ALC, on the other hand, does carry a CPU cost, so if you have ALC applied to a small table, removing it might give you some savings.

What I would recommend is that, when designing new tables (and possibly when changing existing ones), the compression settings should be considered carefully.

If the table is likely to be small, then I would consider not applying any compression to it. This is more a change in process than any particular remedial action that needs to be taken.


4 COMMENTS

  1. I am not sure if anyone has come across this issue and whether there might be a simple solution. At the Teradata site I am working at, they have not been able to use MVC because it impacts the ability of NetBackup to do deduplication.
    Currently they run with TARA and a NetBackup plugin, and they achieve a fair amount of deduplication in the backups. I have been advised that when they attempt to add MVC to the Teradata tables, the deduplication fails in NetBackup, and so they have not implemented it.
    I believe they could save space, and avoid a system upgrade, by implementing MVC, but it would appear to come at the cost of SAN storage for the backups.

  2. Hi Nathan,
    Thanks for your comments. I think your comments actually reinforce the point that I’m trying to make.
    You’re absolutely right that a lot of the time small tables get joined to big tables, but precisely because they are small, compression won’t give you much benefit. If a table starts at a size of 1MB, you can reduce that size by at most 1MB, and if on a given system a 1MB table can be scanned in under 1 second, then you can save at most 1 second of processing on a query.
    Given that (without using any tools) this analysis takes a lot of time and effort – which has a cost – are the savings worth the cost? That is really the point of my article.
    Cheers,
    Dave

  3. David,

    One thing to consider with small-table compression is that small tables are (very) often joined to big tables, resulting in large intermediate result sets in spool. Multi-Value Compression is preserved in spool, so applying MVC to columns in small tables will give you benefits in reducing spool size, with the associated performance gains from reduced IO, as the sketch below shows.
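
    As a quick sketch of that effect (table and column names invented for illustration): the MVC values on the dimension column below remain compressed when the joined rows land in spool.

        -- Hypothetical small dimension table with MVC on the description column.
        CREATE TABLE dim_status
        ( status_cd   BYTEINT NOT NULL
        , status_desc VARCHAR(50)
            COMPRESS ('Open','Closed','Pending','Cancelled')
        )
        UNIQUE PRIMARY INDEX (status_cd);

        -- Joined to a large fact table, status_desc stays MVC-compressed in the
        -- spool rows, so the big intermediate result set needs less spool and IO.
        SELECT f.txn_id, d.status_desc
        FROM   big_fact f
        JOIN   dim_status d  ON d.status_cd = f.status_cd;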

    I certainly agree with your advice: the choice of compression should be carefully considered. It’s worth exploring the degree of benefit in terms of how heavily a table is used as well as its size. Ideally, perform a benchmark with some real data and queries to provide the facts and figures for an informed decision.

    @Roland: In my personal opinion Teradata doesn’t make a compression utility because our software partners, such as Ward Analytics and Atanasoft, make excellent tools for analysing and implementing compression. If you have a look in Teradata developer exchange (http://developer.teradata.com/), you can find articles discussing how to do compression analysis manually, which you can easily adapt and automate for your own use.
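
    As a hedged sketch of that manual approach (the table and column here are hypothetical, not taken from those articles): profile the value frequencies of a candidate column and pick the most frequent values, up to MVC's limit of 255 compressed values per column.

        -- Most frequent values in a candidate column; the top values (at most
        -- 255 may be listed per column) are the MVC candidates.
        SELECT  TOP 20
                region_cd
             ,  COUNT(*)  AS row_cnt
        FROM    sales_demo
        GROUP BY region_cd
        ORDER BY 2 DESC;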

    Regards,

    Nathan Green.
    Senior Ecosystem Architect
    Teradata Pty Ltd.

  4. Hi David. The problem is that Teradata (the vendor) does not offer any tool for compression (to me it’s obvious why). Doing compression manually is very costly, but it is still a frequently applied strategy. As compression is an optimization problem over a bunch of variables, I think the best way to go is to use 3rd-party tools.
