Teradata vs. Snowflake: Why GROUP BY Performance Differs at Scale

When migrating analytical workloads from Teradata to Snowflake, one subtle but important performance factor often gets overlooked: how the two systems handle GROUP BY operations on huge tables.

The SQL looks the same, but the execution engines behave differently. If you’ve relied on Teradata for years, you may be surprised by Snowflake’s behavior.


GROUP BY in Database Systems

Most databases implement GROUP BY with either:

  • Hash aggregation: rows are placed into an in-memory hash table keyed by the grouping columns.
  • Sort aggregation: rows are sorted, then consecutive keys are aggregated.

We assume both Teradata and Snowflake use hash aggregation as their main strategy, since this is standard in large-scale systems, though Snowflake does not disclose full internals.


Teradata’s Approach

  • Low cardinality: AMPs can aggregate locally, sending only a small set of partial results across the BYNET.
  • High cardinality: pre-aggregation is ineffective; rows are redistributed to AMPs by group key before aggregation, creating network and spool pressure.
  • Primary Index advantage: if grouping is on the PI, no redistribution is needed — aggregation is local, which is the fastest case.
  • Skew sensitivity: uneven group distribution can overload a single AMP, leading to slow queries or spool failures.

Snowflake’s Approach

Snowflake applies a consistent multi-stage process:

  1. Local partial aggregation per node.
  2. Redistribution by group key.
  3. Final aggregation.

Differences to note:

  • No PI equivalent: redistribution is almost always required.
  • Memory and spilling: intermediate results spill to local SSD if they don’t fit in memory; SSD is finite, so extreme cases can fail.
  • Scalability: warehouses can be resized to add memory and SSD capacity — a tuning lever absent in Teradata.
  • Skew handling: uneven groups still slow queries, but adaptive execution introduced in 2024 helps rebalance work across nodes.

Disk Spilling

Neither system has infinite memory:

  • Teradata spills to spool; if quotas are exceeded, queries fail.
  • Snowflake spills to SSD; usually queries just run slower, though disk exhaustion can cause further spilling to object storage.

Migration Implications

  • Low-cardinality GROUP BY or PI-based queries that ran extremely fast in Teradata may lose their edge in Snowflake, since redistribution is part of the plan.
  • High-cardinality or skewed queries that often failed in Teradata tend to run more predictably in Snowflake, though sometimes slower.
  • Scaling in Snowflake gives you a way out of memory pressure, at the cost of higher credits; Teradata offers no such flexibility.

Final Thought

Cardinality, skew, and memory limits strongly influence aggregation performance.

Teradata: optimizer picks strategies based on cardinality and PI design, but queries are highly sensitive to skew and spool quotas.
Snowflake: always redistributes and aggregates in stages, assumed to be hash-based, with adaptive execution and scaling as safety valves.

For migration teams, the lesson is simple: test your GROUP BY queries. What ran best in Teradata may behave differently in Snowflake, especially at the extremes of very low or very high cardinality.

Related Services

⚡ Need Help Optimizing Your Data Platform?

We cut data platform costs by 30–60% without hardware changes. 25+ years of hands-on tuning experience.

Explore Our Services →

📋 Considering a Move From Teradata?

Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.

Free Migration Assessment →

📊 Data Platform Migration Survey

Help us map where the industry is heading. Results are public — see what others chose.

1. What is your current data platform?

2. Where are you migrating to (or evaluating)?

Migrating FROM
Migrating TO

Thanks for voting! Share this with your network.

Follow me on LinkedIn for daily insights on data warehousing and platform migrations.

Stay Ahead in Data Warehousing

Get expert insights on Teradata, Snowflake, BigQuery, Databricks, Microsoft Fabric, and modern data architecture — delivered to your inbox.

Leave a Comment

DWHPro

Expert network for enterprise data platforms. Senior consultants, project teams built for your challenge — across Teradata, Snowflake, Databricks, and more.

📍Vienna, Austria & Jacksonville, Florida

Quick Links
Services Team Teradata Book Blog Contact Us
Connect
LinkedIn → [email protected]
Newsletter

Join 4,000+ data professionals.
Weekly insights on Teradata, Snowflake & data architecture.