Architecture Archives

Data Engineering Is Not Dying. But Management Is Doing Its Best to Kill It.

04/12/2026 by DWH Pro Admin

Companies are laying off experienced architects and doubling down on offshore coding teams — at the exact moment AI is automating the implementation work those teams do. The math says AI-augmented senior teams are cheaper and better than offshoring. So why is management choosing the most expensive option and calling it efficiency?

Why Z-Ordering Fails on Skewed Data — and Liquid Clustering Does Not

03/29/2026 by Roland Wenzlofsky

Z-ordering and Liquid Clustering both aim to improve Databricks query performance through data skipping. But when your data is skewed, one of them quietly becomes useless. A visual explanation of why — and how the Hilbert curve changes everything.

Microsoft Fabric vs. Databricks: What Enterprise Teams Actually Need to Know

04/11/2026 (first published 03/08/2026) by DWH Pro Admin

When Microsoft launched Fabric, it promised to unify analytics under one roof. Databricks, meanwhile, has been building its lakehouse platform for years. Both aim to be your central data platform — but they make fundamentally different bets on how much complexity to show you. Having worked across Teradata, Snowflake, and Databricks environments in European banking …

You Migrated From Teradata to Spark and Threw Away the One Thing That Made It Fast

04/11/2026 (first published 03/08/2026) by Roland Wenzlofsky

If you have spent any amount of time working with Teradata, you know that the Primary Index is one of the most important design decisions you make. It determines how data is distributed across AMPs and whether your joins are fast or slow. Choosing the wrong Primary Index is one of the most common causes …

The 15-Year Detour: How the Data Industry Spent Billions Reinventing SQL

04/12/2026 (first published 03/04/2026) by Roland Wenzlofsky

Somewhere around 2020, the data world quietly arrived at a conclusion that Teradata engineers could have told you in 1984: SQL on a massively parallel architecture is a pretty good way to process large volumes of data. The path to get there was anything but quiet. It involved billions in capital, an entire generation of …

The Medallion Architecture Is Not New. We Just Called It Something Else.

03/09/2026 (first published 02/22/2026) by Roland Wenzlofsky

Why data warehouse professionals have been doing “bronze, silver, gold” for over 20 years. If you have been working in data warehousing for any length of time, the first time you heard about the “medallion architecture” you probably had one reaction: We already do this. You were right. The medallion architecture, popularized by Databricks as …

Teradata’s Semicolon Optimization vs. Snowflake’s Architecture — Two Worlds, One Goal

03/09/2026 (first published 10/16/2025) by Roland Wenzlofsky

1. The Forgotten Performance Trick: A Semicolon That Saves Time. For decades, Teradata developers have quietly used one of the smallest but most powerful performance optimizations in BTEQ: a semicolon at the start of a line. This isn’t just a style choice. It tells Teradata to combine all statements into one multi-statement request, parsed and executed as …

Teradata Hybrid: Bridge or Destination?

03/09/2026 (first published 09/30/2025) by Roland Wenzlofsky

For many years, Teradata was the undisputed leader in large-scale data warehousing. Banks, insurers, and telcos built their most critical systems on it. Today, the market is very different. Cloud-native databases such as Snowflake, BigQuery, and Databricks have set new standards in elasticity and simplicity. The question is: what role will Teradata play in this …

Improving Stored Procedure Performance with Teradata MAPS Architecture

03/08/2026 (first published 05/05/2023) by Roland Wenzlofsky

Occasionally, a stored procedure needs a cursor to implement specific functionality. I recently encountered a stored procedure that contained a loop with multiple INSERT statements executed through a cursor. The cursor was designed to process only a limited number of rows. Despite this, the stored procedure took up to …

Introduction to Apache Spark: A Powerful Solution for Big Data Processing and Analytics

03/08/2026 (first published 05/05/2023) by Roland Wenzlofsky

Introduction: Processing and analyzing large volumes of data quickly and efficiently is essential in today’s data-driven world. Apache Spark, an open-source big data processing engine, is a leading solution for handling massive datasets and offers a fast and flexible alternative to traditional data processing frameworks like Hadoop’s MapReduce. This article introduces Apache Spark, explores its …