fbpx

Often we have already reduced the size of our tables immensely using Multivalue Compression, and yet we need more space.

Shrinking A Teradata Table To A Minimum Size

I used once a simple trick for you that has allowed me to reduce a table already optimized with multivalue compression to 10% of its size.

This is a special scenario, but it is worth to keep an eye on such setups.

As an example I use the following table:

CREATE SET TABLE DWHPRO.Sales
(
      SalesDate DATE FORMAT 'YY/MM/DD' NOT NULL,
      Customer_Category_Cd CHAR(5) NOT NULL COMPRESS ('10000','20000'),
      SalesId CHAR(50) NOT NULL,
      Sales_Main_Type_Cd CHAR(1) NOT NULL COMPRESS ('A','B','C','D','E'),
      Sales_Sub_Type_Cd CHAR(5) NOT NULL COMPRESS '00001','00002','00003'),
      Sales_Value_Cd CHAR(5) NOT NULL COMPRESS ('VAL01','VAL02','VAL03'),
      Sales_Class_Cd CHAR(5) NOT NULL COMPRESS ('Cla01','Cla02'),
      Group_SalesId CHAR(50) NOT NULL COMPRESS (''),
      Currency CHAR(3) NOT NULL COMPRESS ('EUR','USD','AUD','NZD'),
      Final_Flag INTEGER NOT NULL COMPRESS 0 ,
      Price DECIMAL(38,2) COMPRESS (0.00, 100.00,200.00)
PRIMARY INDEX (SalesId)
PARTITION BY RANGE_N(
SalesDate BETWEEN DATE '2019-01-01' AND DATE '2020-12-31' EACH INTERVAL '1' DAY,NO RANGE, UNKNOWN)
UNIQUE INDEX (SalesDate,Customer_Category_Cd,SalesId,Sales_Main_Type_Cd ,Sales_Sub_Type_Cd,Sales_Value_Cd,Sales_Class_Cd ,Group_SalesId ,Currency ,Final_Flag );

As you can see, SalesId is the primary index. This index distributes well and is also often used for joining.

The Primary Key was replicated via a USI.

The idea behind it is to prevent primary key duplicates from being inserted into the table.

This looks like a legitimate solution initially, but it turns out to be a fatal error. The reason I will explain to you in a moment. But first, let’s have a look at the size of the table:

SELECT SUM(CURRENTPERM)  / (1024**3)  AS Size_GB FROM DBC.TABLESIZE 
WHERE TABLENAME = 'Sales';
-> 125,50 GB

The next step is to check how big the table is when we drop the USI:

DROP UNIQUE INDEX (SalesDate,Customer_Category_Cd,SalesId ,Sales_Main_Type_Cd,Sales_Sub_Type_Cd,Sales_Value_Cd,
Sales_Class_Cd ,Group_SalesId ,Currency ,Final_Flag) 
ON DWHPRO.Sales; 
SELECT SUM(CURRENTPERM)  / (1024**3)  AS Size_GB FROM DBC.TABLESIZE 
WHERE TABLENAME = 'Sales';
-> 12,50 GB
teradata primary key
How can it be that the table is only a tenth of the size?

The Reason For A Huge Table Size

A USI is not compressed. Since the USI of our example includes almost all columns of the table, it is like an uncompressed copy of the base table.

Teradata Primary Key Check Solution

I always see it as a bad practice that secondary indexes are only created to force the primary key.

Most of the time, it makes more sense to do this programmatically once in the ETL process. This also has the advantage that the secondary index’s maintenance is omitted, which consumes additional resources.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"rgb(255, 0, 0)"}},"gradients":[]}}]}__CONFIG_colors_palette__
GET OUR TERADATA BOOK

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>