Performance degradation caused by uneven workload distribution is one of the oldest and most persistent challenges in parallel data warehouse systems. Both Teradata and Snowflake can experience this imbalance, commonly known as skew.
Although the term is shared, the mechanics differ fundamentally: Teradata can suffer from both persistent and runtime skew, whereas Snowflake’s skew occurs only during query execution.
Skew in Teradata
Teradata distributes data across parallel units called Access Module Processors (AMPs). Each row is assigned to an AMP based on the hash of its Primary Index (PI). This hashing scheme enables Teradata’s shared-nothing architecture and massive parallelism.
When PI values are evenly distributed, AMPs handle similar workloads. If a few AMPs receive disproportionately more rows or processing effort, skew occurs.
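A quick way to check how a candidate PI would spread rows is to hash it yourself. The sketch below uses Teradata's HASHROW, HASHBUCKET, and HASHAMP functions; the table and column names are hypothetical.

```sql
-- Row count per AMP for a candidate PI column (hypothetical names).
SELECT  HASHAMP(HASHBUCKET(HASHROW(customer_id))) AS amp_no,
        COUNT(*)                                  AS row_cnt
FROM    sales_db.orders
GROUP BY 1
ORDER BY 2 DESC;
```

A large gap between the busiest and quietest AMP flags the column as a poor distribution key.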
Data (Storage) Skew
Data skew stems from how rows are physically distributed across AMPs.
Common causes include:
- Low-cardinality or poorly chosen Primary Indexes
- Highly repetitive key values
- Correlated data leading to uneven hash buckets
Data skew is persistent because it’s tied to physical storage. It can be measured through Viewpoint’s space usage metrics or by querying DBC.TableSizeV.
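As a sketch, the query below compares each table's largest per-AMP footprint with its per-AMP average in DBC.TableSizeV; the database name is a placeholder.

```sql
-- Space skew per table: 0% means perfectly even, higher means more skewed.
SELECT  DatabaseName,
        TableName,
        SUM(CurrentPerm)  AS total_perm,
        MAX(CurrentPerm)  AS max_amp_perm,
        AVG(CurrentPerm)  AS avg_amp_perm,
        (1 - AVG(CurrentPerm) / NULLIFZERO(MAX(CurrentPerm))) * 100 AS skew_pct
FROM    DBC.TableSizeV
WHERE   DatabaseName = 'sales_db'          -- hypothetical database
GROUP BY DatabaseName, TableName
ORDER BY skew_pct DESC;
```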
Typical mitigations:
- Redefine the PI using higher-cardinality or multi-column keys (see the rebuild sketch after this list)
- Where the PI cannot change, create a secondary or join index to improve access balance
- Review data demographics periodically to ensure balanced hash distribution after major loads
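Because the PI of a populated table cannot be changed in place, redefining it usually means rebuilding the table. A minimal sketch, assuming hypothetical table and column names:

```sql
-- Rebuild onto a higher-cardinality, multi-column PI, then swap names.
CREATE TABLE sales_db.orders_new AS
  (SELECT * FROM sales_db.orders)
WITH DATA
PRIMARY INDEX (order_id, order_date);      -- was PRIMARY INDEX (region_code)

RENAME TABLE sales_db.orders     TO sales_db.orders_old;
RENAME TABLE sales_db.orders_new TO sales_db.orders;
```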
Runtime (Step-Level) Skew
Even perfectly balanced tables can exhibit runtime skew during query execution.
This occurs when:
- A small set of high-frequency key values dominates a join or aggregation
- The optimizer misestimates cardinalities due to stale or missing statistics
- Redistribution or aggregation steps produce uneven intermediate results
Accurate statistics are the optimizer's main defense against runtime skew. They give a detailed view of data demographics (cardinality, value frequencies, and histogram boundaries) so that join plans and spool redistribution strategies can be chosen intelligently.
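Keeping those statistics current is routine maintenance; a sketch with hypothetical join and filter columns:

```sql
-- Single- and multi-column statistics on the columns that drive joins
-- and filters (column names are illustrative).
COLLECT STATISTICS
    COLUMN (customer_id),
    COLUMN (order_date),
    COLUMN (customer_id, order_date)
ON sales_db.orders;
```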
In complex queries, the Teradata optimizer can reduce skew by handling heavy-hitter values separately.
Examples include:
- Splitting a join into two paths: one for skewed values (isolated heavy hitters) and one for the non-skewed remainder, then combining the results (see the sketch after this list)
- Performing local aggregation before global redistribution
- Reordering joins to balance AMP activity
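The optimizer can apply the split-join approach on its own; the sketch below shows the same idea as a manual rewrite, with hypothetical tables and heavy-hitter ids assumed to have been identified beforehand.

```sql
-- fact.cust_id is skewed toward ids 1001 and 1002 (hypothetical values).
SELECT f.cust_id, d.region, f.amount
FROM   sales_db.fact f
JOIN   sales_db.dim  d ON d.cust_id = f.cust_id
WHERE  f.cust_id NOT IN (1001, 1002)       -- well-distributed remainder

UNION ALL

SELECT f.cust_id, d.region, f.amount
FROM   sales_db.fact f
JOIN   sales_db.dim  d ON d.cust_id = f.cust_id
WHERE  f.cust_id IN (1001, 1002);          -- isolated heavy hitters
```

The first branch behaves like an ordinary redistributed join; the second confines the heavy hitters to a small, separately planned step.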
Runtime skew is transient: it appears only when specific data patterns or join conditions trigger it.
Detecting Skew
- Viewpoint displays CPU, I/O, and spool skew per query and per step
- DBQL step tables reveal AMP-level workload variance (a query-level sketch follows this list)
- TASM workload rules can raise exceptions when skew exceeds thresholds
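For instance, a query-level skew check against the main DBQL log might look like the following; it assumes default logging to DBC.QryLogV and the column names of recent releases.

```sql
-- Ratio of the busiest AMP's work to the per-AMP average; values well
-- above 1 indicate skewed queries worth investigating.
SELECT  QueryID,
        AMPCPUTime,
        MaxAMPCPUTime,
        MaxAMPCPUTime * NumOfActiveAMPs / NULLIFZERO(AMPCPUTime) AS cpu_skew_ratio,
        TotalIOCount,
        MaxAmpIO,
        MaxAmpIO * NumOfActiveAMPs / NULLIFZERO(TotalIOCount)    AS io_skew_ratio
FROM    DBC.QryLogV
WHERE   CAST(StartTime AS DATE) = CURRENT_DATE
ORDER BY cpu_skew_ratio DESC;
```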
Mitigating Skew
- Design Primary Indexes that distribute rows evenly
- Keep statistics current on join and filter columns
- Rework joins or aggregations to minimize unnecessary data redistribution (see the pre-aggregation sketch after this list)
- Identify and isolate hot key values that overload specific AMPs
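As one example of reworking a join, pre-aggregating the large table on its join key collapses repeated values before redistribution; the names below are hypothetical.

```sql
-- Aggregate the fact table first so far fewer rows (one per cust_id)
-- are redistributed to meet the dimension.
SELECT  d.region,
        s.total_amount
FROM   (SELECT cust_id, SUM(amount) AS total_amount
        FROM   sales_db.fact
        GROUP BY cust_id) AS s
JOIN    sales_db.dim AS d
  ON    d.cust_id = s.cust_id;
```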
In Teradata, data skew is structural and persistent, while runtime skew is situational and query-specific.
Skew in Snowflake
Snowflake’s architecture separates storage and compute. Data is stored in compressed, columnar micro-partitions. When a query runs, the optimizer divides work into execution tasks that are distributed across compute threads within a virtual warehouse.
Each task processes a set of micro-partitions. If some tasks handle significantly more data or heavier join work than others, they run longer, causing compute skew.
Detecting Skew
Skew in Snowflake is visible through the Query Profile.
- Each operator (Scan, Join, Aggregate, etc.) appears as a box in the query’s DAG
- Bars under each box represent parallel task runtimes
- When some bars are much longer, the work is uneven
- Query Profile also reports rows and bytes processed per task, allowing detailed comparison
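The same numbers can be pulled programmatically with the GET_QUERY_OPERATOR_STATS table function (available in recent Snowflake releases); a minimal sketch, noting that the statistics returned vary by operator:

```sql
-- Per-operator statistics for the most recent query in this session.
SELECT  operator_id,
        operator_type,
        operator_statistics,          -- e.g. input/output rows per operator
        execution_time_breakdown      -- share of total execution time
FROM TABLE(GET_QUERY_OPERATOR_STATS(LAST_QUERY_ID()))
ORDER BY operator_id;
```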
Mitigating Skew
- Define clustering keys that align with join and filter columns to maximize pruning
- Write sargable predicates so Snowflake can skip irrelevant micro-partitions
- Avoid broadcast joins for large tables
- Use SYSTEM$CLUSTERING_INFORMATION to analyze clustering depth and overlap (as sketched after this list)
- Re-evaluate clustering periodically after major DML or time-based loading
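A minimal sketch of the clustering-related items above, using hypothetical table and column names:

```sql
-- Align the clustering key with the dominant join/filter columns.
ALTER TABLE sales.orders CLUSTER BY (order_date, customer_id);

-- Inspect clustering depth and micro-partition overlap for those columns.
SELECT SYSTEM$CLUSTERING_INFORMATION('sales.orders', '(order_date, customer_id)');

-- Sargable filter: comparing the bare column lets Snowflake prune
-- micro-partitions from their min/max metadata.
SELECT COUNT(*)
FROM   sales.orders
WHERE  order_date >= '2024-01-01'::DATE;
-- A non-sargable form such as TO_CHAR(order_date, 'YYYY') = '2024'
-- would defeat pruning and force a full scan.
```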
Unlike Teradata, Snowflake does not maintain fixed data-to-processor mappings. Any skew arises at query runtime, driven by data clustering, partition pruning, and join design. While poor clustering can repeatedly cause the same queries to appear skewed, the imbalance itself is always observed during execution, not in the persistent data layout.
Teradata vs. Snowflake — Key Differences
| Aspect | Teradata | Snowflake |
|---|---|---|
| Parallel unit | AMP (fixed physical process) | Execution task (runtime thread) |
| Data distribution | Hash-based via Primary Index | Dynamic micro-partition assignment |
| Skew types | Data skew (persistent) and runtime skew (transient) | Runtime skew only |
| Detection | Viewpoint, DBQL, TASM | Query Profile |
| Persistence | Data skew static; runtime skew query-dependent | Dynamic per execution |
| Typical causes | Poor PI, stale stats, heavy-hitter keys | Skewed join keys, poor clustering, pruning inefficiency |
| Fix | Redefine PI, recollect stats, balance joins | Improve clustering, pruning, join design |
Takeaway
Both systems battle the same root cause: uneven parallel workload.
In Teradata, skew can exist both at rest (from data distribution) and at runtime (from joins or aggregations). The solution lies in good data modeling, complete statistics, and intelligent query planning.
In Snowflake, skew appears during execution. It’s shaped by clustering quality, pruning efficiency, and key distribution.