5 Things That Break When You Migrate from Teradata to Snowflake

Most organizations approach a Teradata-to-Snowflake migration as a translation exercise: convert the SQL, move the data, and validate the results. The technical migration succeeds. Then the first quarterly bill arrives at double the projected budget, dashboards queue for 20 minutes during batch windows, and data-loading pipelines that ran in 45 minutes on Teradata now take 5 hours.

The problem is not Snowflake. The problem is that Teradata professionals carry assumptions into Snowflake that are not just incomplete, but actively counterproductive. Decades of Teradata expertise have created instincts for data distribution, workload management, query optimization, and operational patterns that simply do not apply in Snowflake’s architecture. Worse, following those instincts often produces the opposite of the intended result.

Here are five things that reliably break when experienced Teradata teams migrate to Snowflake, and what to do about each one.

1. Your Cost Model

From predictable licenses to consumption billing, where every query is a spending decision.

In Teradata’s traditional on-premises model, capacity is purchased upfront through fixed-term licenses. Once the system is installed, the marginal cost of running an additional query is zero. No meter ticks, no credits burn, no invoice line grows. Budget planning is predictable because the annual cost is the same regardless of utilization, whether it’s 20% or 95%.

Snowflake inverts this entirely. Every second of compute is billed. A warehouse that sits idle but running burns credits at its full rate. Serverless features consume credits in the background without requiring any user action. Storage costs multiply quietly through Time Travel and Fail-safe retention. Cross-region data transfer fees appear on the bill with no visible equivalent in Teradata.

The most common migration failure is projecting Snowflake costs the way you would estimate a Teradata license: as a single line item. In reality, Snowflake bills across multiple cost dimensions, each with its own pricing model and its own hidden multipliers: virtual warehouse compute, serverless compute, cloud services, storage, and data transfer.

What to do: Build your cost model across all five billing dimensions before writing a single line of migration code. Set auto-suspend to 60 seconds (not the default 5 or 10 minutes) as your starting point. Monitor costs weekly, not annually. In Teradata, the cost of inefficiency is measured in performance. In Snowflake, it is measured in money.
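As a sketch of the monitoring side, the two statements below set an aggressive auto-suspend and break out weekly credit consumption per warehouse. The warehouse name `analytics_wh` is a placeholder, and note that `ACCOUNT_USAGE` views can lag real time by a few hours:

```sql
-- Cap idle burn: suspend after 60 seconds of inactivity.
ALTER WAREHOUSE analytics_wh SET AUTO_SUSPEND = 60;

-- Weekly credit consumption per warehouse over the last month.
SELECT warehouse_name,
       DATE_TRUNC('week', start_time) AS week,
       SUM(credits_used)              AS credits
FROM   snowflake.account_usage.warehouse_metering_history
WHERE  start_time >= DATEADD('month', -1, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY credits DESC;
```

This covers only the warehouse-compute dimension; serverless, cloud services, storage, and transfer have their own `ACCOUNT_USAGE` views and belong in the same weekly review.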

2. Your Query Optimization Instincts

No Primary Index, no secondary indexes, no statistics collection. Everything you relied on is gone.

In Teradata, query performance revolves around data distribution. The Primary Index determines where each row physically resides by hashing column values to specific AMPs. Single-AMP access when PI columns appear in equality conditions. All-AMP scans when they do not. Data skew when PI values are unevenly distributed. Expensive redistribution when joining tables with different PIs. Every table design decision flows from this deterministic distribution.

Snowflake has no Primary Index and no equivalent concept. Data is stored in arrival order as immutable micro-partitions, with comprehensive metadata (min/max values, null counts, distinct value counts) automatically recorded for each column. The optimizer uses this metadata for partition pruning, skipping partitions that cannot contain relevant data. There is no hash distribution to optimize, no permanent assignment of rows to specific processors.

Secondary indexes (USI, NUSI), join indexes, and COLLECT STATISTICS routines also have no direct Snowflake equivalents. Snowflake’s Search Optimization Service offers similar point-lookup acceleration for specific columns, but it is an Enterprise Edition feature with serverless costs. Micro-partition pruning and automatic metadata replace the need for traditional statistics collection entirely.
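Where a USI-style point lookup genuinely matters, enabling search optimization is a one-line statement. The table and column names below are illustrative, and the cost-estimate call is worth running before committing:

```sql
-- Estimate the serverless build and maintenance cost first.
SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('customer');

-- Enable point-lookup acceleration on one column only
-- (Enterprise Edition; serverless costs apply).
ALTER TABLE customer ADD SEARCH OPTIMIZATION ON EQUALITY(customer_id);
```

Unlike Teradata secondary indexes, this is an exception tool for proven hot lookups, not a default table-design step.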

The instinct to restructure joins to avoid cross-AMP redistributions, replace MERGE statements with explicit DELETE-then-INSERT logic, or add clustering keys on every join column (mirroring secondary indexes) is counterproductive. These optimizations are based on Teradata’s hash-distributed, AMP-local processing model. In Snowflake, the fastest query version is often the simplest.

What to do: Start with the simplest possible query. Do not restructure joins to avoid redistributions, because Snowflake handles data movement differently. Do not recreate every secondary index as a clustering key. Remove all COLLECT STATISTICS routines from your migration plan. Use the Query Profile (Snowflake’s visual execution plan) to diagnose actual bottlenecks before applying any optimization.
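To find queries that actually need attention, poor pruning is the signal to look for. This sketch against `ACCOUNT_USAGE.QUERY_HISTORY` (the warehouse name is a placeholder) flags slow queries that scanned more than 80% of their micro-partitions:

```sql
-- Queries that scan most of their partitions are pruning poorly;
-- investigate these in the Query Profile before touching anything else.
SELECT query_id,
       total_elapsed_time / 1000 AS seconds,
       partitions_scanned,
       partitions_total
FROM   snowflake.account_usage.query_history
WHERE  warehouse_name    = 'ANALYTICS_WH'
  AND  partitions_total  > 1000
  AND  partitions_scanned > 0.8 * partitions_total
ORDER BY total_elapsed_time DESC
LIMIT 20;
```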

3. Your Workload Management Strategy

TASM’s priority scheduling disappears. Workload isolation replaces contention management.

In Teradata, all workloads run on the same AMPs and compete for their resources. Teradata Active System Management (TASM) and its lighter counterpart TIWM classify queries, assign priorities, and manage resource allocation within this shared system. The Priority Scheduler implements a six-tier hierarchy: from the critical TDAT system tier down through virtual partitions, the tactical workload, Service Level Goal tiers, and finally the timeshare tier. Resources flow downward: higher-priority workloads consume what they need, and whatever remains cascades to lower tiers.

Snowflake replaces this entire paradigm. There is no priority scheduler because there is no shared resource pool to arbitrate. Instead of managing contention within a single system, Snowflake eliminates contention through isolation. Each workload class gets its own warehouse, sized appropriately for its queries, suspended when idle, and completely isolated from other workloads. A runaway query on the data science warehouse cannot impact dashboard response times. A massive data load cannot slow interactive analysis.

Running all workloads on a single warehouse, mirroring Teradata’s single shared system, recreates the contention problems that TASM was designed to manage, but without TASM’s sophisticated priority scheduling. ETL batch jobs, dashboards, ad-hoc queries, and data science notebooks all compete for the same compute, leading to the queuing, oversizing, and cost escalation that isolation would have prevented.

What to do: Stop thinking about priority and start thinking about isolation. Create separate warehouses for separate workload types: batch processing on a Large warehouse that runs only during ETL windows, dashboards on an X-Small with aggressive auto-suspend, analytics on a Medium that scales as needed. Each warehouse suspends independently when idle, so you pay only for the compute you use. Isolation is cheaper than contention management when you are paying per second.
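A minimal isolation setup might look like this; the names, sizes, and suspend intervals are illustrative starting points, not a definitive layout:

```sql
-- One warehouse per workload class, each suspending independently.
CREATE WAREHOUSE IF NOT EXISTS etl_wh
  WAREHOUSE_SIZE = 'LARGE'  AUTO_SUSPEND = 60  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

CREATE WAREHOUSE IF NOT EXISTS dashboard_wh
  WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

CREATE WAREHOUSE IF NOT EXISTS analytics_wh
  WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 120 AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;
```

A runaway notebook on `analytics_wh` then cannot touch dashboard latency, and the ETL warehouse costs nothing outside its batch window.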

4. Your Tactical Workload

Teradata’s guaranteed sub-second response time for critical queries has no direct equivalent in Snowflake.

Teradata’s tactical workload tier occupies a privileged position in the Priority Scheduler hierarchy. Tactical queries get immediate access to system resources, can interrupt and suspend any executing request, and use reserved AMP Worker Tasks (AWTs). The tactical tier can consume as many resources as needed, with only a 5% margin reserved for lower tiers. This makes tactical workloads the mechanism of choice for time-critical, low-latency queries such as real-time lookups, operational dashboards, and customer-facing applications.

The tactical tier works in Teradata because of a specific architectural alignment. Tactical queries are designed to be single-AMP or few-AMP operations, using Primary Index access paths (UPI, NUPI) or joins between tables on a common primary index. Only nested and merge joins qualify, as these are the only join strategies that involve a limited number of AMPs. The Teradata optimizer verifies that the execution plan meets tactical requirements. If it does not, the request is automatically demoted to a lower-priority tier to prevent misuse of critical resources.

Snowflake has no equivalent mechanism. There is no priority scheduler, no reserved compute threads, and no concept of interrupting one query to serve another. When a query arrives at a Snowflake warehouse, it either executes immediately (if concurrency slots are available) or queues behind other running queries. The MAX_CONCURRENCY_LEVEL parameter (defaulting to 8, configurable up to 32) controls how many queries can run simultaneously, but there is no way to guarantee that a specific query class gets priority access.
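The concurrency level is a per-warehouse setting, so it can be tuned where queries degrade each other; the warehouse name below is a placeholder:

```sql
-- Fewer concurrent queries per cluster: each one gets a larger share
-- of the warehouse, and the excess queues instead of degrading.
ALTER WAREHOUSE tactical_wh SET MAX_CONCURRENCY_LEVEL = 4;

-- Verify the effective setting.
SHOW PARAMETERS LIKE 'MAX_CONCURRENCY_LEVEL' IN WAREHOUSE tactical_wh;
```

Note this trades throughput for per-query resources; it is a tuning knob, not a priority mechanism.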

The Snowflake approach to tactical-style workloads is architectural rather than scheduling-based. You create a dedicated small warehouse exclusively for low-latency queries. Because this warehouse serves only tactical queries, there is no contention from batch jobs or analytical workloads. The warehouse can be X-Small (minimizing cost per query) and configured with aggressive auto-suspend. If concurrency demand increases, multi-cluster scaling (Enterprise Edition) can automatically add clusters.

This approach works differently from Teradata’s tactical tier. Teradata guarantees priority within a shared system. Snowflake guarantees isolation through a dedicated system. The result is similar (low-latency responses for critical queries), but the mechanism and its limitations differ. In Teradata, a well-designed tactical query leverages single-AMP access and ROWHASH-level locking for maximum concurrency. In Snowflake, the same query relies on micro-partition pruning and the absence of competing workloads on its dedicated warehouse.

What to do: Create a dedicated X-Small or Small warehouse for tactical queries. Assign it exclusively to the roles or applications that run time-critical lookups. Do not mix tactical queries with batch or analytical workloads. Ensure your queries are selective enough for effective partition pruning, which is Snowflake’s equivalent of single-AMP access. If your tactical workload requires guaranteed sub-second response times regardless of load, consider Snowflake Hybrid Tables (available on AWS and Azure for all commercial editions), which are designed for OLTP-style single-row operations.
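A tactical setup could be sketched as follows; the warehouse and role names are hypothetical, and the multi-cluster settings require Enterprise Edition:

```sql
-- Dedicated warehouse for low-latency lookups; nothing else runs here.
CREATE WAREHOUSE IF NOT EXISTS tactical_wh
  WAREHOUSE_SIZE    = 'XSMALL'
  AUTO_SUSPEND      = 60
  AUTO_RESUME       = TRUE
  MIN_CLUSTER_COUNT = 1        -- multi-cluster scaling: Enterprise Edition
  MAX_CLUSTER_COUNT = 3        -- adds clusters under concurrency pressure
  SCALING_POLICY    = 'STANDARD';

-- Only the lookup application may use it; no ad-hoc or batch access.
GRANT USAGE ON WAREHOUSE tactical_wh TO ROLE app_lookup_role;
```

The grant is what enforces the isolation: if analysts cannot use the warehouse, they cannot contaminate it.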

5. Your Data Loading Pipelines

FastLoad, MultiLoad, TPump, and BTEQ collapse into COPY INTO and staged files.

Teradata offers a rich ecosystem of specialized loading utilities: FastLoad for bulk loading empty tables, MultiLoad for updates and merges on populated tables, TPump for continuous micro-batch ingestion, and BTEQ for interactive and transactional loads. The Teradata Parallel Transporter (TPT) unifies these under a common scripting framework. Each utility has distinct requirements, trade-offs, and operational patterns that Teradata professionals have internalized over years of production experience.

Snowflake replaces this entire ecosystem with COPY INTO commands operating on staged files, supplemented by Snowpipe for automated continuous loading and Snowpipe Streaming for low-latency ingestion. This simplification is elegant, but it creates a trap: Teradata professionals often apply familiar loading patterns to Snowflake, resulting in inefficient pipelines, wasted compute, and unnecessary costs.

The most damaging pattern is loading a single monolithic file per table, a habit carried over from FastLoad, which processes one data stream. For compressed files (the typical case), Snowflake’s COPY INTO processes each file with a single thread, so loading one large file eliminates the parallelism that makes Snowflake’s loading fast. Splitting loads into 100- to 250-megabyte compressed files enables parallel processing across multiple threads, often dramatically reducing load times.

The other critical anti-pattern is using single-row INSERT statements. In Teradata, BTEQ can handle reasonable INSERT volumes with row-level commits. In Snowflake, each INSERT triggers a complete transaction cycle, and the platform is optimized for analytical workloads rather than high-frequency transactional patterns. Bulk loading with COPY INTO processes thousands of rows per second with minimal overhead. Single-row INSERTs may process only tens of rows per second while consuming similar compute resources. The cost difference can reach 50 to 100 times.

What to do: Split large exports into 100 to 250 megabyte compressed files (Parquet format loads significantly faster than CSV). Stage data in the same cloud region as your Snowflake account to avoid egress charges. Convert all BTEQ row-by-row INSERT patterns to batch COPY INTO before migration. For continuous loading needs (replacing TPump), evaluate Snowpipe (serverless, event-driven) or Snowpipe Streaming (low-latency, API-based) based on your latency and cost requirements.
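As a sketch of the target pattern, with a hypothetical bucket, stage, and table (a real external stage would also need a storage integration or credentials):

```sql
-- Stage in the same cloud region as the Snowflake account;
-- split Parquet files let all load threads work in parallel.
CREATE STAGE IF NOT EXISTS migration_stage
  URL = 's3://my-bucket/teradata-export/'   -- hypothetical bucket
  FILE_FORMAT = (TYPE = PARQUET);

-- Bulk load replaces FastLoad/MultiLoad scripts and BTEQ INSERT loops.
COPY INTO sales
FROM @migration_stage/sales/
FILE_FORMAT = (TYPE = PARQUET)
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
```

One COPY INTO over a directory of split files is the Snowflake equivalent of an entire TPT job script.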

The Common Thread

Each of these five breakdowns has the same root cause: applying Teradata’s architectural assumptions to a platform that works on fundamentally different principles. Teradata’s shared-nothing, fixed-capacity architecture rewards a specific set of instincts: optimize data distribution, manage contention through priority scheduling, collect statistics, and design for single-AMP access. Snowflake’s shared-data, consumption-based architecture rewards different instincts: optimize for cost efficiency, isolate workloads, let the platform handle data organization, and continuously monitor spending.

Your Teradata expertise is valuable. You understand data warehousing, query optimization, workload management, and data modeling at a deep level. The challenge is not unlearning what you know, but channelling that expertise into Snowflake’s different mechanisms for achieving the same goals. The principles are often similar. The implementations are different.

The traps are avoidable. With awareness and the right practices, you can build a Snowflake environment that delivers the performance your organization needs at costs that make sense. The key is recognizing where your instincts will serve you well and where they will lead you astray.

Roland Wenzlofsky has over 20 years of experience with Teradata, working with major finance and telecommunications companies across Europe. He is the author of ‘Teradata SQL Tuning’ and the upcoming ‘Teradata to Snowflake: Avoiding the Traps’ (with Tibor Kovacs).

