Despite implementing Multivalue Compression to minimize the size of our tables, we still require additional space.

Shrinking A Teradata Table To A Minimum Size

I once used a trick that reduced an already optimized table with multivalue compression to just 10% of its original size.

This unique situation is worth monitoring for potential opportunities.

I will illustrate using the table below:

CREATE SET TABLE DWHPRO.Sales
(
      SalesDate DATE FORMAT 'YY/MM/DD' NOT NULL,
      Customer_Category_Cd CHAR(5) NOT NULL COMPRESS ('10000','20000'),
      SalesId CHAR(50) NOT NULL,
      Sales_Main_Type_Cd CHAR(1) NOT NULL COMPRESS ('A','B','C','D','E'),
      Sales_Sub_Type_Cd CHAR(5) NOT NULL COMPRESS '00001','00002','00003'),
      Sales_Value_Cd CHAR(5) NOT NULL COMPRESS ('VAL01','VAL02','VAL03'),
      Sales_Class_Cd CHAR(5) NOT NULL COMPRESS ('Cla01','Cla02'),
      Group_SalesId CHAR(50) NOT NULL COMPRESS (''),
      Currency CHAR(3) NOT NULL COMPRESS ('EUR','USD','AUD','NZD'),
      Final_Flag INTEGER NOT NULL COMPRESS 0 ,
      Price DECIMAL(38,2) COMPRESS (0.00, 100.00,200.00)
PRIMARY INDEX (SalesId)
PARTITION BY RANGE_N(
SalesDate BETWEEN DATE '2019-01-01' AND DATE '2020-12-31' EACH INTERVAL '1' DAY,NO RANGE, UNKNOWN)
UNIQUE INDEX (SalesDate,Customer_Category_Cd,SalesId,Sales_Main_Type_Cd ,Sales_Sub_Type_Cd,Sales_Value_Cd,Sales_Class_Cd ,Group_SalesId ,Currency ,Final_Flag );

SalesId is the primary index that distributes efficiently and is frequently utilized for joining.

The Primary Key was duplicated using a Unique Secondary Index (USI).

The concept aims to inhibit the insertion of duplicate primary keys into the table.

Initially, this solution appears legitimate, but it ultimately proves fatal. I will explain the reason shortly. First, let’s consider the table’s size:

SELECT SUM(CURRENTPERM)  / (1024**3)  AS Size_GB FROM DBC.TABLESIZE 
WHERE TABLENAME = 'Sales';
-> 125,50 GB

Next, we must determine the table’s size after removing the USI.

DROP UNIQUE INDEX (SalesDate,Customer_Category_Cd,SalesId ,Sales_Main_Type_Cd,Sales_Sub_Type_Cd,Sales_Value_Cd,
Sales_Class_Cd ,Group_SalesId ,Currency ,Final_Flag) 
ON DWHPRO.Sales; 
SELECT SUM(CURRENTPERM)  / (1024**3)  AS Size_GB FROM DBC.TABLESIZE 
WHERE TABLENAME = 'Sales';
-> 12,50 GB
teradata primary key
How can the table be only a tenth of the size?

The Reason For A Huge Table Size

The USI is uncompressed and essentially duplicates the base table due to the inclusion of nearly all columns.

Teradata Primary Key Check Solution

In my opinion, creating secondary indexes solely to enforce the primary key is a flawed practice.

Usually, it is more logical to perform this task programmatically during the ETL process. This approach also eliminates the need for maintaining the secondary index, which can be resource-intensive.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>