Teradata in the Cloud: Advantages and Disadvantages of MPP Database Systems

Introduction to Teradata in the Cloud

Cloud databases are pressuring traditional data warehousing MPP systems.

This blog post will illustrate the reasons for this and outline the pros and cons of each database system.

The Architecture of an MPP Database System

MPP database systems employ a shared-nothing architecture, wherein each node has its own CPU, main memory, and mass storage device.

Teradata is the classic example of an MPP database system; others include Netezza, Amazon Redshift, and Microsoft Azure Synapse Analytics.

An MPP system aims to distribute data evenly across all nodes so that each node handles a similar share of the work.
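To make the shared-nothing idea concrete, here is a minimal sketch in Python. It is not how any particular product works internally: the node slices, the `local_aggregate` and `merge_partials` helpers, and the sales data are all made up for illustration. Each node aggregates only the rows it owns, and a coordinator combines the partial results.

```python
from collections import defaultdict

# Hypothetical data: each "node" owns a private slice of the sales rows.
# Shared-nothing means no node reads another node's slice.
node_slices = [
    [("north", 10), ("south", 5)],
    [("north", 7), ("west", 3)],
    [("south", 8), ("west", 4)],
]


def local_aggregate(rows):
    """Step 1, run on every node: sum the amounts per region from local rows only."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def merge_partials(partials):
    """Step 2, run on a coordinator: combine the per-node partial sums."""
    merged = defaultdict(int)
    for partial in partials:
        for region, subtotal in partial.items():
            merged[region] += subtotal
    return dict(merged)


partials = [local_aggregate(rows) for rows in node_slices]
result = merge_partials(partials)
print(result)
```

The point of the sketch is that the expensive step (scanning and summing rows) happens independently on every node, while the coordinator only sees small partial results.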

All MPP database systems share the same fundamental architecture. However, how data is organized and stored on nodes — by rows or columns — varies among them.

Each manufacturer has devised its own approach.

Teradata can store data in both rows and columns. Additionally, data can be retrieved from a Column Partitioned Table using a Primary Index.

Netezza uses dedicated hardware (FPGAs) and zone maps to rule out the storage areas that cannot contain the requested data, and it limits reads to the columns a query actually needs.
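The zone map idea can be sketched in a few lines of Python. This is only an illustration of the principle, not Netezza's implementation (which runs close to the hardware); the blocks, the date values, and the function names `blocks_to_scan` and `range_query` are assumptions for this example. A zone map stores the minimum and maximum value per storage block, so a range query can skip every block whose range cannot match.

```python
# Hypothetical storage blocks, each holding order dates as integers (YYYYMMDD).
blocks = [
    [20230101, 20230115, 20230131],
    [20230201, 20230214, 20230228],
    [20230301, 20230310, 20230331],
]

# The zone map keeps only the minimum and maximum value of each block.
zone_map = [(min(block), max(block)) for block in blocks]


def blocks_to_scan(zone_map, low, high):
    """Return the indexes of blocks whose [min, max] range overlaps [low, high]."""
    return [
        index
        for index, (block_min, block_max) in enumerate(zone_map)
        if block_max >= low and block_min <= high
    ]


def range_query(blocks, zone_map, low, high):
    """Read only the candidate blocks and filter their rows."""
    hits = []
    for index in blocks_to_scan(zone_map, low, high):
        hits.extend(value for value in blocks[index] if low <= value <= high)
    return hits


candidates = blocks_to_scan(zone_map, 20230210, 20230220)
matches = range_query(blocks, zone_map, 20230210, 20230220)
print(candidates, matches)
```

Note that the zone map never says where the data is; it only says where it cannot be. That is why it works best when the data is stored roughly in the order of the filtered column.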

Most MPP database systems, including Netezza, Amazon Redshift, and Microsoft Azure Synapse Analytics, offer the following three options for distributing data:

  • Distribute All

    Tables are copied in full to every node. This is ideal for small tables, because they are already available for joins on all nodes and no data has to be copied at query time. Teradata does not offer this kind of distribution, but it copies whole tables to all nodes during query execution when join preparation requires it.
  • Distribute By Hash

    Rows are assigned to nodes by hashing a distribution key (in Teradata, the primary index), so rows with the same key value land on the same node.
  • Distribute Randomly

    The rows of a table are spread evenly but randomly across all nodes, without a distribution key. In Teradata, this is achieved with so-called NoPI (No Primary Index) tables.
Figure: Distribution Options of an MPP Database System
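The three distribution options can be compared side by side with a small Python sketch. Everything here is an assumption made for illustration: the cluster size, the `orders` rows, the function names, and the use of `zlib.crc32` as a stand-in for whatever hash function a vendor really uses. The random variant uses a seeded random generator, so it is only approximately even; real systems may use round-robin or similar schemes.

```python
import random
import zlib

NODE_COUNT = 4  # hypothetical cluster size


def distribute_all(rows, node_count=NODE_COUNT):
    """Distribute All: every node receives a full copy of the table."""
    return [list(rows) for _ in range(node_count)]


def distribute_by_hash(rows, key, node_count=NODE_COUNT):
    """Distribute By Hash: the node is derived from the key, so equal keys co-locate."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        # crc32 is a stand-in for the vendor's hash function; it is stable across runs.
        bucket = zlib.crc32(str(row[key]).encode("utf-8")) % node_count
        nodes[bucket].append(row)
    return nodes


def distribute_randomly(rows, node_count=NODE_COUNT, seed=0):
    """Distribute Randomly: the node is picked without looking at the row content."""
    rng = random.Random(seed)
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[rng.randrange(node_count)].append(row)
    return nodes


# Hypothetical table: 12 orders placed by 3 customers.
orders = [{"order_id": i, "customer_id": i % 3} for i in range(12)]

replicated = distribute_all(orders)
hashed = distribute_by_hash(orders, "customer_id")
scattered = distribute_randomly(orders)

for name, nodes in (("all", replicated), ("hash", hashed), ("random", scattered)):
    print(name, [len(node) for node in nodes])
```

With only three distinct customers, hashing on `customer_id` can fill at most three of the four nodes, which already hints at why the choice of distribution key matters.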

Advantages of an MPP Database System

  • Performance
    Each node processes only its own share of the data, so spreading the load across nodes that work in parallel delivers excellent performance.
  • Scalability and Concurrency

    In principle, MPP systems scale linearly when new nodes (CPU, memory, and mass storage) are added: in the ideal case, doubling the number of nodes doubles the performance.
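The linear scaling claim can be illustrated with a deliberately simple model in Python. The table size and the per-node throughput are made-up numbers, and the model ignores coordination overhead, network traffic, and skew, all of which reduce the gain in practice.

```python
import math


def scan_seconds(total_rows, node_count, rows_per_second_per_node):
    """Elapsed time of a full scan when rows are spread evenly over the nodes.

    All nodes work in parallel, so the busiest node determines the elapsed time.
    """
    rows_on_busiest_node = math.ceil(total_rows / node_count)
    return rows_on_busiest_node / rows_per_second_per_node


TOTAL_ROWS = 1_000_000_000   # hypothetical table size
NODE_THROUGHPUT = 5_000_000  # hypothetical rows scanned per second per node

for nodes in (4, 8, 16):
    print(nodes, "nodes:", scan_seconds(TOTAL_ROWS, nodes, NODE_THROUGHPUT), "seconds")
```

Under these assumptions, the scan takes 50 seconds on 4 nodes, 25 seconds on 8 nodes, and 12.5 seconds on 16 nodes: each doubling of the node count halves the elapsed time.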

Disadvantages of an MPP Database System

  • Complexity
    Most MPP database systems come with hardware that has been specially optimized to achieve the best performance.

    Examples are Teradata's BYNET, which takes over specific tasks such as sorting and merging answer sets, and Netezza's special hardware, which restricts the data being read.

    This often makes the system complicated and expensive.
  • Distribution of Data

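    The significant advantage of MPP database systems is also their biggest disadvantage: data must be distributed evenly across nodes. Even distribution is essential, but choosing the right distribution key is up to the user. A poorly chosen key leaves some nodes with far more rows than others, and the busiest node then determines how long a query takes.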

    Modern cloud databases like Snowflake do not have this problem because they are shared-data systems where all nodes can access the common database.
  • Downtime

    Scaling the system up or down requires downtime during which data must be redistributed evenly across the old and new nodes.
  • Lack of Elasticity

    MPP database systems are less suited to fluctuating workloads than cloud databases because they lack elasticity.

    MPP database systems can scale, but this can take weeks because hardware has to be added or data has to be restructured. Snowflake, for example, can scale in real time without any downtime. In that sense, Snowflake is a true cloud database.

    Many manufacturers now offer their databases in the cloud, but essential features such as elasticity are missing. I don’t consider them cloud databases, but it’s a matter of definition.

    Teradata is also available in the cloud. But what remains of Teradata if there is no BYNET anymore? What would Netezza be without its dedicated hardware? I think running your database on somebody else’s computer is not sufficient.
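Two of the disadvantages listed above, the dependence on a good distribution key and the redistribution needed when the system is resized, can be made tangible with a short Python sketch. All names and numbers are assumptions for illustration: `zlib.crc32` stands in for a vendor's hash function, and rows are placed with a naive "hash modulo node count" rule. Real systems may use bucket maps or similar techniques that move fewer rows on a resize.

```python
import zlib
from collections import Counter


def node_for(key, node_count):
    """Map a distribution key to a node; crc32 stands in for the vendor's hash."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def skew_factor(keys, node_count):
    """Rows on the fullest node divided by the average rows per node (1.0 is perfectly even)."""
    rows_per_node = Counter(node_for(key, node_count) for key in keys)
    average = len(keys) / node_count
    return max(rows_per_node.values()) / average


def fraction_moved(keys, old_node_count, new_node_count):
    """Share of rows whose node changes when the cluster is resized."""
    moved = sum(
        1
        for key in keys
        if node_for(key, old_node_count) != node_for(key, new_node_count)
    )
    return moved / len(keys)


# Hypothetical table of 100,000 orders with two candidate distribution keys.
order_ids = list(range(100_000))                   # unique value per row
country_codes = ["US"] * 90_000 + ["AT"] * 10_000  # two distinct values, 90% "US"

print("skew on order_id:    ", skew_factor(order_ids, 8))
print("skew on country_code:", skew_factor(country_codes, 8))
print("rows moved 8 -> 12:  ", fraction_moved(order_ids, 8, 12))
```

With the country code as the key, all 90,000 "US" rows share one node, so that node carries at least 7.2 times the average load on 8 nodes and the other nodes mostly wait for it. The last line shows why resizing is expensive under this naive placement rule: a row stays where it is only if its hash yields the same node for both cluster sizes, and with an evenly spread hash that is the case for only about one third of the rows when going from 8 to 12 nodes.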
