Get a Grip on Clumpy Row Distribution – The Teradata PPI Table
At least one partitioning column must not be part of the Teradata Primary Index for this trick to work.
We have all experienced this problem with table inserts at some point: although the table rows are distributed evenly across all AMPs, the merge step into the table seems to take forever.
One likely reason is that we are dealing with a SET table and many identical NUPI values. The AMPs must perform a duplicate row check for each inserted row. This process can take an enormous amount of time and waste a lot of CPU cycles, because each new row is compared in full against all rows that carry the same Primary Index value.
While we can solve this for nonpartitioned tables by switching to a MULTISET design (sketched below), Teradata partitioned tables (PPI tables) give us another excellent way to address the problem.
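As a quick illustration of the nonpartitioned alternative, here is a minimal sketch of a MULTISET table with a NUPI. The table and column names (Sales_Staging, customer_id, sales_date, sales_value) are invented for this sketch and are not part of the original example. Because MULTISET tables allow duplicate rows, the AMPs can skip the duplicate row check during the insert:
-- Minimal sketch: a MULTISET table avoids the duplicate row check entirely
CREATE MULTISET TABLE Sales_Staging
(
customer_id INTEGER NOT NULL,
sales_date DATE NOT NULL,
sales_value DECIMAL(18,2)
) PRIMARY INDEX (customer_id); -- NUPI; no duplicate row check on a MULTISET table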
Assume we have the following table design:
CREATE TABLE MultiLevel_Example
(
the_code INTEGER NOT NULL,
the_date DATE NOT NULL,
the_value DECIMAL(18,0)
) PRIMARY INDEX (the_code) -- a NUPI; a PPI table's Primary Index may only be UNIQUE if it contains all partitioning columns
PARTITION BY (
RANGE_N(the_date BETWEEN DATE '2015-01-01' AND DATE '2015-12-31' EACH INTERVAL '1' MONTH),
RANGE_N(the_code BETWEEN 1 AND 5 EACH 1)
);
As you can see, not all partitioning columns are part of the Primary Index (the_date is not) – this is the prerequisite for the trick to work!
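To make this concrete, here is a small set of hypothetical inserts (the values are invented for illustration): all three rows share the NUPI value 3, but their the_date values fall into different monthly partitions, so the clump for that rowhash is split up.
INSERT INTO MultiLevel_Example VALUES (3, DATE '2015-01-15', 100); -- January partition
INSERT INTO MultiLevel_Example VALUES (3, DATE '2015-02-15', 200); -- February partition
INSERT INTO MultiLevel_Example VALUES (3, DATE '2015-03-15', 300); -- March partition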
Below is a schematic drawing of the internal representation of the table:

The column labeled “C” is the combined partition number, “L1” is the first partitioning level, and “L2” is the second. The column labeled “H” represents the ROWHASH of the_code.
Whenever we create a PPI table that does not contain all partitioning columns in the Primary Index definition, equal NUPI values are spread across different partitions. This breaks up the clumps and improves performance, because the number of rows that must be compared for each duplicate row check becomes significantly smaller.
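One way to verify the spread (a sketch, assuming the first table definition above and some loaded data) is to query the system-derived PARTITION column, which returns the combined partition number of each row:
SELECT the_code,
COUNT(DISTINCT PARTITION) AS nbr_of_partitions, -- partitions the NUPI value is spread across
COUNT(*) AS nbr_of_rows
FROM MultiLevel_Example
GROUP BY 1
ORDER BY 1;
The higher nbr_of_partitions is for a given NUPI value, the smaller each clump that has to be scanned during a duplicate row check.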
Nevertheless, be aware that leaving partitioning columns out of the Primary Index also comes with a penalty, and you have to weigh the advantages against the disadvantages:
In the case of Primary Index access, all partitions must be probed to find the searched Primary Index value. Furthermore, merge joins may run more slowly, and aggregations become more costly because they happen across partitions. The sketch below illustrates the access-path difference.
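As a rough illustration (again assuming the first table definition), the first query accesses the table purely by its NUPI and therefore has to probe every date partition for the rowhash, while the second one also supplies the partitioning column, so the optimizer can apply partition elimination; the EXPLAIN output shows the difference:
-- NUPI access only: every date partition must be probed for the rowhash of the_code = 3
SELECT * FROM MultiLevel_Example WHERE the_code = 3;

-- NUPI access plus the partitioning column: partition elimination limits the probe to one partition
SELECT * FROM MultiLevel_Example
WHERE the_code = 3
AND the_date = DATE '2015-03-15';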
Let us take a brief look at why this trick will not work for a Teradata PPI table if all partitioning columns are contained in the Primary Index definition:
CREATE TABLE MultiLevel_Example
(
the_code INTEGER NOT NULL,
the_date DATE NOT NULL,
the_value DECIMAL(18,0)
) PRIMARY INDEX (the_code, the_date) -- a NUPI that contains all partitioning columns
PARTITION BY (
RANGE_N(the_date BETWEEN DATE '2015-01-01' AND DATE '2015-12-31' EACH INTERVAL '1' MONTH),
RANGE_N(the_code BETWEEN 1 AND 5 EACH 1)
);
Take a look at the illustration below. Because the ROWHASH is calculated over the Primary Index, which now contains all partitioning columns, rows with identical Primary Index values always end up with the same ROWHASH and in the same partition – the clump is never split (take a moment to convince yourself of this):

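A quick way to see this on the second table (again just a sketch, assuming some data has been loaded) is to group by the rowhash and the system-derived PARTITION column: apart from rare hash synonyms between different Primary Index values, every rowhash shows up in exactly one partition, so each duplicate row check still has to scan the entire clump.
SELECT HASHROW(the_code, the_date) AS row_hash, -- rowhash of the full Primary Index
PARTITION AS combined_partition,
COUNT(*) AS nbr_of_rows
FROM MultiLevel_Example
GROUP BY 1, 2
ORDER BY 1, 2;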
More details can be found in Teradata's Partitioned Primary Index documentation.