fbpx

Teradata Compression: Should You Bother?

By David Wellman

April 11, 2015


The Teradata DBMS has always supported compression in a (growing) number of forms.

Version 5.0 introduced Multi-Value Compression (MVC), version 13.10 introduced Algorithmic (ALC) and Block Level (BLC) compression, and version 14.0 introduced multiple automatic compression options for Columnar tables.

These can produce disk space and associated performance savings; however, implementing Teradata compression with MVC or ALC takes man-time, which has a cost associated with it.

It also needs to be remembered that because data demographics change over time, analysis of either of these types of compression needs to be revisited regularly (which may only be once a year).

Yes, some tools will help with this, but the point of this article is really to ask if the time you are spending on compression is worthwhile? Is it giving you benefits that are commensurate with the costs? Like many things, this is a question of “cost vs. benefit”.

On one recent customer engagement, I produced the following analysis for them:

  • They have about 98000 tables.
  • About 26000 tables have MVC implemented.
  • On their system, only 4000 tables are big enough that a full table scan is estimated to take longer than 1 second – even allowing for a reduced scan rate because of concurrent queries.

I am not recommending that you remove existing MVC compression on small tables. The use of MVC doesn’t impose any real CPU overhead, and eliminating compression on many tables will be a lot of effort for negligible if any gain. If you have ALC applied to a small table, then you might get some savings.

I would recommend that when designing new tables or possibly when making changes to existing tables, the compression setting should be carefully considered.

If that table is likely to be small, I would consider not applying any compression. This is more change in a process than any particular remedial action that needs to be taken.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"rgb(255, 0, 0)"}},"gradients":[]}}]}__CONFIG_colors_palette__
GET OUR TERADATA BOOK

David Wellman

David Wellman is the Technical Director of Ward Analytics Ltd, a UK company specialising in Teradata performance analysis and management. David has been using the Teradata products for over 20 years and regularly delivers Teradata training courses and performance consulting engagements. David is a Teradata Certified Master at V2R3, V2R5, TD12 and TD14.More details at: http://www.ward-analytics.com

  • Avatar
    YoungPelican says:

    I am not sure if anyone has come across this issue and if there might be a simple solution. In the Teradata site, I am working at, they have not been able to use MVC as it impacts on the ability of Netbackup to do deduplication.
    Currently, they run with Tara and a Netbackup plugin and they achieve a fair amount of deduplication in the backups. I have been advised that when they attempt to add MVC to the Teradata tables, the deduplication fails on Netbackup and so they have not implemented it.
    I believe that they can save space, and a system upgrade by implementing MVC, but it would appear at a cost of SAN storage.

  • Hi Nathan,
    Thanks for your comments. I think your comments actually re-enforce the point that I’m trying to make.
    You’re absolutely right that a lot of the time small tables get joined to big tables, but precisely because they are small, compression won’t give you much benefit. If a table starts with a size of 1MB you can only reduce the size by 1MB, and if on a given system a 1MB table can be scanned in under 1 second then you can only save 1-second processing on a query.
    Given that (without using any tools) this analysis takes a lot of time and effort – which has a cost – are the savings worth the cost? That is really the point of my article.
    Cheers,
    Dave

  • David,

    One thing to consider with small table compression, is that small tables are (very) often joined to big tables, resulting in large intermediate result sets in the spool. Multi-Value Compression is preserved in the spool, so performing MVC on columns in small tables will give you benefits in reducing spool size, with the associated performance gains from reduced IO.

    I certainly agree with your advice, the choice of compression should be carefully considered. It’s worth exploring the degree of benefit from aspects of “most used” as well as size. Ideally perform a benchmark with some real data and queries, to provide the facts and figures to make an informed decision.

    @Roland: In my personal opinion Teradata doesn’t make a compression utility because our software partners, such as Ward Analytics and Atanasoft, make excellent tools for analyzing and implementing compression. If you have a look at Teradata developer exchange (http://developer.teradata.com/), you can find articles discussing how to do compression analysis manually, which you can easily adapt and automate for your own use.

    Regards,

    Nathan Green.
    Senior Ecosystem Architect
    Teradata Pty Ltd.

  • Hi David. The problem is that Teradata (the vendor) is not offering any tool for compression (for me it’s obvious why). Doing compression manually is very costly but still a frequently applied strategy. As compression is an optimization problem over a punch of variables I think the best way to go is the usage of 3rd party tools.

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >