Teradata vs. Redshift: A Comparison of Join Strategies and Architecture

Teradata and Redshift have similar architectures and distribute data in similar ways: each Teradata AMP stores a portion of a table’s rows, while Redshift uses slices. The two systems differ notably in how data is stored on the file system. Teradata can store a table in columnar format, and this choice is made per table. However, the primary advantage lies in …

Understanding Teradata Join Estimation: Heuristics and Importance of Statistics Collection

What is Teradata Join Estimation? This article demonstrates how Teradata join estimation works when no statistics are available. It presents the heuristics used to estimate row counts and emphasizes the importance of collecting statistics on all join columns. Teradata Join Estimation Heuristics: The worst-case scenario is a join of two tables without any collected statistics. We …

Designing Small Reference Tables for Teradata: Storing All Rows on One AMP for More Efficient Queries

When designing tables for Teradata, it is important to distribute the rows evenly across all AMPs in the system. For instance, for a table with 100,000 rows on a 100-AMP system, the objective would be to place roughly 1,000 rows on each AMP. I agree with this design guideline for many tables in a Teradata system. Nevertheless, a specific …
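The even-distribution target mentioned above (100,000 rows on 100 AMPs, about 1,000 rows per AMP) can be illustrated with a small simulation. This is only a sketch: the function `amp_for_key`, the use of MD5, and the modulo step are assumptions made for illustration and are not Teradata's actual row-hash algorithm or hash map.

```python
import hashlib
from collections import Counter

NUM_AMPS = 100      # size of the example system from the text
NUM_ROWS = 100_000  # number of rows in the example table


def amp_for_key(key: int, num_amps: int = NUM_AMPS) -> int:
    """Map a key to an AMP number using a stand-in hash.

    Assumption: MD5 plus modulo is used only to mimic a well-mixing hash;
    it is NOT how Teradata computes row hashes or assigns hash buckets.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


# Count how many rows land on each AMP when the keys are unique.
rows_per_amp = Counter(amp_for_key(k) for k in range(NUM_ROWS))

average = NUM_ROWS / NUM_AMPS
print(f"target rows per AMP: {average:.0f}")
print(f"fewest rows on an AMP: {min(rows_per_amp.values())}")
print(f"most rows on an AMP:   {max(rows_per_amp.values())}")
```

With unique keys, the per-AMP counts should cluster around the target; a key with many duplicate values would instead pile its rows onto a single AMP.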

VantageCloud Lake: Turbocharge Your Data Warehousing with Teradata’s Innovative Solution

Introduction: Parallel database architectures have advanced significantly over the past four decades, moving from shared-memory to shared-disk and, finally, to the more efficient shared-nothing architecture. Databases designed specifically for cloud environments combine elements of shared-disk and shared-nothing architectures. Teradata is a powerful and scalable relational database management system designed to …

Improving Query Performance with Teradata Statistics Extrapolation and Object Use Counts (OUC)

Teradata introduced several new features, including one that caught our attention: object use counts (OUC). This feature improves the calculation of extrapolated statistics, which can improve query performance significantly. Before version 13.10, changes made by DML statements were not logged, and the optimizer relied solely on dynamic AMP sampling, leading to incorrect estimates for skewed tables. Additionally, …

Optimizing Teradata Queries: From No Index to Hashed NUSI

The initial situation without any index: In this blog post, I will demonstrate how to optimize a query using Teradata’s tools. We will begin with the following test scenario: The data is evenly distributed. To demonstrate the selectivity of the query for the indexes we will define and test later, I assigned a significant portion of rows the same …

Understanding Teradata Hash Collisions – A Case Study

To explain the issue of Teradata hash collisions, I will briefly describe how rows are assigned to AMPs. If you are unfamiliar with the Teradata architecture or need a refresher, I suggest first reading an introductory article on it. As you know, a hashing algorithm distributes a table’s rows to the AMPs. The Foundations: The hashing algorithm accepts one or …

DWHPro

Expert network for enterprise data platforms. Senior consultants, project teams built for your challenge — across Teradata, Snowflake, Databricks, and more.

📍Vienna, Austria & Jacksonville, Florida
