Maximizing Efficiency: Teradata vs. Snowflake Partitioning

DWH Pro Admin

April 20, 2023

minutes reading time

Minimizing I/O in Teradata and Snowflake

All database systems aim to minimize physical I/O, as it remains the slowest component in query execution—even on SSD or cloud storage.

The Teradata Approach

Teradata distributes rows across AMPs based on the Primary Index, ensuring parallel access and load balancing.
Row Partitioning further reduces I/O by physically organizing rows by a partition expression—commonly date, region, or status—allowing the optimizer to read only the necessary partitions.

Choosing a partition expression aligned with common WHERE clauses or joins is critical.
However, partition elimination in Teradata applies only when predicates reference the partition columns.
To achieve pruning on multiple dimensions (e.g., date and customer_id), the table must be redefined with a multi-level partitioning scheme.

While Teradata supports Secondary Indexes and Join Indexes, these increase storage and DML overhead and are rarely used for high-volume tables.

The Snowflake Approach

Snowflake divides data into immutable micro-partitions (~16 MB), each storing columnar data plus metadata such as min/max values, null counts, and distinct counts.
This metadata allows automatic partition pruning—no explicit partition key required.

Snowflake’s advantage is that any column can benefit from pruning if its value range helps identify relevant micro-partitions.
In contrast, Teradata requires the partitioning columns to appear in the predicate.

However, Snowflake’s pruning efficiency depends on data clustering at load time.
If data is loaded in natural order (e.g., by date), pruning remains efficient.
If inserted randomly, performance can degrade until reclustering occurs.

Snowflake’s Clustering Service can automatically reorganize micro-partitions based on a defined cluster key, improving pruning but consuming compute resources.

Key Differences and Migration Guidelines

Concept	Teradata	Snowflake	Migration Consideration
Partition Definition	Explicit (requires DDL)	Implicit (metadata-driven)	Simplify schema; no partition keys needed
Multi-column Pruning	Requires multi-level partitions	Automatic on any column	Redesign logic; fewer partition schemes
Physical Locality	Row-based, cylinder-oriented	Cloud object storage	Load data sorted by clustering dimension
Maintenance	Manual partition design	Auto pruning, optional reclustering	Use Clustering only for large, stable tables

Migration Guidelines: Teradata → Snowflake

Validate performance with SYSTEM$CLUSTERING_DEPTH and query profile partition stats.

Remove explicit ROW PARTITION BY clauses — Snowflake manages partitions automatically.

If predicates often filter by date or region, consider a CLUSTER BY (date, region) definition.

Avoid frequent DML on large clustered tables. Instead, use staging + MERGE patterns.

Take a brief survey and receive the book for free Get instant access

Share0

Tweet0

Share0

DWH Pro Admin

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

Maximizing Efficiency: Teradata vs. Snowflake Partitioning

Minimizing I/O in Teradata and Snowflake

The Teradata Approach

The Snowflake Approach

Key Differences and Migration Guidelines

DWH Pro Admin

Teradata’s Semicolon Optimization vs. Snowflake’s Architecture — Two Worlds, One Goal

Understanding Skew in Teradata and Snowflake

Why the Same UPDATE Means Something Entirely Different in Teradata and Snowflake

Teradata Hybrid: Bridge or Destination?