Maximizing Performance with Teradata Partitioning Strategies

Roland Wenzlofsky

April 20, 2023

minutes reading time

Teradata Partitioning and Performance

This article explores various strategies for utilizing Teradata Partitioning to optimize query efficiency. The primary objective of these strategies is to minimize resource utilization. Answering the following questions will aid in determining the optimal approach for implementing row-level partitioning.

Considerations to be made before Partitioning a Table

How big is the Table?

Huge tables are suitable for row-level partitioning. For the Optimizer, selecting individual partitions on small tables is not worthwhile, and a full table scan is often just as efficient.

Nevertheless, partitioning a small table identically to a large table can improve performance for joins.

How often is a Column used in WHERE Conditions?

A column is more suitable as a partition column if WHERE conditions are frequently applied to this column, resulting in only a fraction of the table rows being read.

The opposite statement is also true:
If there is no column after which a WHERE condition is frequently restricted or the selectivity of the WHERE conditions is low, it makes no sense to use Teradata Partitioning.

Should the Partition Column be part of the Primary Index?

The columns should always be part of the primary index if this does not result in other performance disadvantages.

The advantage when partition columns are in the primary index: Each primary index value can only be present in exactly one partition. Therefore the query is more performant as no partition probing is needed to locate the rows.

Here is an example where it is a disadvantage to include the partition column in the primary index:

) PRIMARY INDEX (SalesId,SalesDate)
PARTITION BY RANGE_N ( SalesDate BETWEEN DATE ‘2019-01-01’ AND ‘2019-12-31’ EACH INTERVAL ‘1’ DAY );

The following kind of tactical query is mainly executed:

SELECT * FROM Sales WHERE SalesId = <a particular SalesId>;

In this case, the Table cannot be queried using the primary index. Therefore, SalesDate should not be in the primary index.teradata ppi

Does it make any difference how many Partitions we define?

Yes, because the more partitions there are, the worse the performance will be if we join with an NPPI table or a PPI table partitioned differently. Partition probing can also become an issue if the partition columns are not part of the Primary Index.

For example: If we always query whole months, we should not create daily partitions.

Can Teradata handle all Types of Partitioning equally well, i.e., does the Optimizer recognize Partition Elimination Opportunities?

Although the Optimizer gets better at it, simple partition expressions are preferable—the more complex a partition expression, the greater the risk that the Optimizer overlooks partition elimination opportunities.

Should all Teradata Partitions always contain approximately the same number of Rows?

No, that depends on the application. See the next section for details.

Teradata Partitioning Strategies

  • Each partition has about the same number of rows
  • Some Partitions have a small number of rows, while other partitions have a large number of rows:
    For example, it makes sense to partition the part of the requested data more often (more partitions with fewer rows).
  • All partitions have the same number of different values, but the number of rows per partition can differ.

Please evaluate when to utilize the appropriate Teradata Partitioning strategy. Feel free to share your thoughts in the comments section, which can benefit other readers.

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like