Teradata Partitioning: Row Vs. Column For Efficient Data Access

Teradata Partitioning – Row vs. Column

Partitioning tables using column or row partitioning can reduce resource usage when accessing data.

Row partitioning horizontally partitions tables, allowing the optimizer to limit data access to a cluster of rows. On the other hand, column partitioning vertically partitions tables, enabling the optimizer to restrict data access to a subset of columns.

The rows of partitioned tables are stored in physical clusters, which are groups of rows. For instance, a cluster may contain all the rows from a specific date.

If a cluster is utilized in a query condition, the Teradata optimizer will generate an execution plan that restricts access solely to the partitions with the relevant date. This eliminates the need for a full table scan (FTS).

Partitioned tables and non-partitioned tables are stored similarly, with rows distributed across all AMPs and organized by partition number, row hash value, and uniqueness value. In fact, non-partitioned tables are stored internally like partitioned tables, with all rows belonging to the dummy partition 0.

Row-partitioned tables store each row in a single physical row on a disk.

Column partitioning clusters one or more table columns to create partitions. Each partition represents a cluster. Physical rows within column-partitioned tables contain a subset of columns and do not represent complete logical rows.

Row partition clustering allows for complex expressions to be utilized, whereas column partitioning is restricted to the columns’ definition and creates a single cluster. Therefore, column partitioning is more straightforward to implement.

Row-partitioned tables offer the advantage of partition elimination, which allows the optimizer to limit search activity to the accessed partition when a query selects only one partition. This eliminates the need for a full table scan.

Column partitioned tables offer the advantage of selecting a subset of columns, thereby decreasing disk IO.

Combining column and row partitioning in a single table can enhance resource utilization.

Here is an example analyzing four potential table configurations: without partitioning, with row partitioning, with column partitioning, and with both row and column partitioning.

CREATE SET TABLE THE_TABLE (COL_A INTEGER, COL_B INTEGER, COL_C INTEGER, COL_D INTEGER, COL_E INTEGER, COL_F INTEGER);

We are executing this query:

SELECT SUM(COL_A) FROM mytable WHERE COL_C IN (2,3,4);

Row partitioning will be implemented on the column COL_C, assuming that column containers are available for COL_A and COL_C.

In the case of a column-partitioned table, all five rows but only two columns have to be accessed.
All five rows and columns must be accessed in the non-partitioned table.
All columns, but only three rows, must be accessed in a row-partitioned table.
Only three rows and two columns must be accessed in a row and column-partitioned table.

Choosing the right partitioning can optimize resource utilization. However, it is crucial to understand your workload, as each partitioning approach has distinct pros and cons based on the type of workload. To learn more, please refer to our article on the Teradata columnar feature.

Related Services

⚡ Need Help Optimizing Your Data Platform?

We cut data platform costs by 30–60% without hardware changes. 25+ years of hands-on tuning experience.

Explore Our Services →

📋 Considering a Move From Teradata?

Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.

Free Migration Assessment →

Teradata Partitioning: Row vs. Column for Efficient Data Access

Teradata Partitioning – Row vs. Column

📊 Data Platform Migration Survey

Stay Ahead in Data Warehousing

Leave a Comment Cancel reply