Is Teradata Compression Worth Your Time and Effort?

David Wellman

April 28, 2023

The Teradata DBMS has long supported compression in various forms.

Multi-Value Compression (MVC) was introduced in V2R5.0, Algorithmic Compression (ALC) and Block-Level Compression (BLC) followed in version 13.10, and version 14.0 added various automatic compression options for Columnar tables.

Implementing Teradata compression with MVC or ALC can deliver savings in disk space and improved performance. However, the analysis and implementation require a significant investment in manpower, which can be expensive.
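
To make the distinction concrete, here is a minimal sketch of what the two column-level options look like in DDL. The table and column names are invented for illustration, and the ALC functions shown are the standard Unicode/UTF8 pair shipped in TD_SYSFNLIB:

    CREATE TABLE sales_txn
    (
        txn_id      BIGINT NOT NULL,
        -- MVC: list the most frequent values of the column
        region_cd   CHAR(2) COMPRESS ('NE', 'NW', 'SE', 'SW'),
        status_cd   CHAR(1) COMPRESS ('A', 'C'),
        -- ALC: name a compress/decompress function pair
        txn_note    VARCHAR(500) CHARACTER SET UNICODE
                    COMPRESS USING TD_SYSFNLIB.TransUnicodeToUTF8
                    DECOMPRESS USING TD_SYSFNLIB.TransUTF8ToUnicode
    )
    PRIMARY INDEX (txn_id);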

It is also important to remember that data demographics change over time, so any compression analysis needs to be revisited regularly, perhaps annually.
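
The re-analysis itself is conceptually simple. As a rough sketch (again with invented table and column names), a frequency count per column shows which values are common enough to earn a place in the MVC list, and re-running it later shows whether the demographics have drifted:

    -- Most frequent values of one column: the natural MVC candidates.
    -- Compare the counts against the total row count to judge the benefit.
    SELECT region_cd, COUNT(*) AS row_cnt
    FROM   sales_txn
    GROUP  BY region_cd
    ORDER  BY row_cnt DESC;

Doing this for every column of every candidate table is exactly where the manpower cost comes in.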

Are the benefits of dedicating time to compression worth the costs? While certain tools may assist with compression, the purpose of this article is to raise this question. As with most matters, it boils down to a “cost versus benefit” analysis.

On one recent customer engagement, I produced the following analysis for them (a sketch of the size check follows the list):

  • They have about 98,000 tables.
  • About 26,000 tables have MVC implemented.
  • On their system, only 4,000 tables are big enough that a full table scan is estimated to take longer than 1 second, even allowing for a reduced scan rate because of concurrent queries.
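
A sketch of that size check, using the standard DBC.TableSizeV dictionary view, might look like the following. The 1 GB cut-off is purely an illustrative assumption; the real threshold depends on the system's effective scan rate under concurrency:

    -- Per-table size from the data dictionary, keeping only the "big" tables.
    -- The 1 GB threshold is an assumption for illustration only.
    SELECT   DatabaseName,
             TableName,
             SUM(CurrentPerm) / (1024*1024*1024.0) AS size_gb
    FROM     DBC.TableSizeV
    GROUP BY DatabaseName, TableName
    HAVING   SUM(CurrentPerm) > 1024*1024*1024
    ORDER BY size_gb DESC;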

There is no need to remove existing MVC compression from small tables. MVC has no CPU overhead, and the negligible benefit of removing it from lots of tables is not worth the effort. However, some savings may be achievable where ALC is used on a small table, because ALC, unlike MVC, does incur a CPU cost.

When designing or modifying tables, it is advisable to consider the compression settings carefully.

If the table is small, compression may not be needed at all. This is a matter of changing the design process rather than remediating existing tables.

David Wellman

David Wellman is the Technical Director of Ward Analytics Ltd, a UK company specialising in Teradata performance analysis and management. David has been using the Teradata products for over 20 years and regularly delivers Teradata training courses and performance consulting engagements. David is a Teradata Certified Master at V2R3, V2R5, TD12 and TD14.

More details at: http://www.ward-analytics.com

  • YoungPelican says:

    I am not sure if anyone has come across this issue and whether there might be a simple solution. At the Teradata site where I am working, they have not been able to use MVC because it impacts NetBackup's ability to do deduplication.
    Currently they run with Tara and a NetBackup plugin, and they achieve a fair amount of deduplication in the backups. I have been advised that when they attempt to add MVC to the Teradata tables, the deduplication on NetBackup fails, so they have not implemented it.
    I believe they could save space, and perhaps defer a system upgrade, by implementing MVC, but it would appear to come at a cost of SAN storage.

  • Hi Nathan,
    Thanks for your comments. I think your comments actually reinforce the point that I’m trying to make.
    You’re absolutely right that a lot of the time small tables get joined to big tables, but precisely because they are small, compression won’t give you much benefit. If a table starts at a size of 1MB, you can only reduce its size by at most 1MB, and if on a given system a 1MB table can be scanned in under 1 second, then you can only save up to 1 second of processing on a query.
    Given that (without using any tools) this analysis takes a lot of time and effort, which has a cost, are the savings worth that cost? That is really the point of my article.
    Cheers,
    Dave

  • Nathan Green says:

    David,

    One thing to consider with small-table compression is that small tables are (very) often joined to big tables, resulting in large intermediate result sets in spool. Multi-Value Compression is preserved in spool, so applying MVC to columns in small tables will give you benefits in reducing spool size, with the associated performance gains from reduced I/O.

    I certainly agree with your advice: the choice of compression should be carefully considered. It’s worth exploring the degree of benefit in terms of “most used” as well as size. Ideally, perform a benchmark with some real data and queries to provide the facts and figures needed to make an informed decision.

    @Roland: In my personal opinion, Teradata doesn’t make a compression utility because our software partners, such as Ward Analytics and Atanasoft, make excellent tools for analyzing and implementing compression. If you have a look at the Teradata Developer Exchange (https://developer.teradata.com/), you can find articles discussing how to do compression analysis manually, which you can easily adapt and automate for your own use.

    Regards,

    Nathan Green.
    Senior Ecosystem Architect
    Teradata Pty Ltd.

  • Hi David. The problem is that Teradata (the vendor) does not offer any tool for compression (to me it’s obvious why). Doing compression manually is very costly, but it is still a frequently applied strategy. As compression is an optimization problem over a bunch of variables, I think the best way to go is to use third-party tools.

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >