Understanding the Relationship Between RDBMS and Hadoop: A Map Reduce Example

MapReduce is a core feature of Hadoop, and a number of database vendors are now integrating it into their RDBMS products. To show how a conventional RDBMS relates to Hadoop, consider the following example.

The example is a SQL aggregation statement that joins two tables. We will walk through how the MapReduce algorithm executes the same logic in Hadoop.

[highlight] SELECT
Segment,
SUM(balance)
FROM
Client t01
INNER JOIN
ClientType t02
ON
t01.clienttype_id = t02.clienttype_id
GROUP BY Segment;[/highlight]

The above SQL statement summarizes the client balances for each client segment (private, business, etc.).

Let’s see how this can be implemented with a MapReduce algorithm.

This SQL aggregation statement cannot be executed in a single pass; it requires more than one MapReduce task.

The JOIN step runs first. The mappers emit the rows of both tables keyed by clienttype_id, and the data is sorted by that key, so all rows with the same clienttype_id arrive at the same reducer. Each reducer can then emit the balance of every client together with the matching segment. In other words, the first MapReduce task joins the Client and ClientType tables without performing any grouping.
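The join task can be sketched in a few lines of plain Python that simulate the map, shuffle, and reduce phases in memory. This is only an illustration under stated assumptions, not Hadoop code: the function names, the client_id column, and the sample rows are made up for the example; only clienttype_id, balance, and segment come from the SQL statement.

```python
from collections import defaultdict

def map_client(row):
    # Assumed row layout: (client_id, clienttype_id, balance).
    client_id, clienttype_id, balance = row
    # Tag the value so the reducer knows which table it came from.
    return clienttype_id, ("client", balance)

def map_clienttype(row):
    # Assumed row layout: (clienttype_id, segment).
    clienttype_id, segment = row
    return clienttype_id, ("type", segment)

def shuffle(pairs):
    # Simulates the shuffle/sort phase: group values by key, keys sorted.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_join(clienttype_id, values):
    # Inner join: pair every balance with the segment of the same key.
    segments = [v for tag, v in values if tag == "type"]
    balances = [v for tag, v in values if tag == "client"]
    return [(segment, balance) for segment in segments for balance in balances]

def join_job(clients, client_types):
    mapped = [map_client(r) for r in clients]
    mapped += [map_clienttype(r) for r in client_types]
    joined = []
    for key, values in shuffle(mapped).items():
        joined.extend(reduce_join(key, values))
    return joined

# Made-up sample data; clienttype_id 30 has no match and is dropped.
clients = [(1, 10, 100.0), (2, 10, 50.0), (3, 20, 70.0), (4, 30, 5.0)]
client_types = [(10, "private"), (20, "business")]

joined = join_job(clients, client_types)
print(joined)
```

The output is a list of (segment, balance) pairs, one per joined row. No totals have been computed yet; that is left to the second task.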

[Figure: join task]

The second MapReduce task sorts the joined records by segment, so that all balances belonging to one segment are consolidated in a single reducer. That reducer sums them, which corresponds to the GROUP BY in the SQL statement.
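The grouping task can be sketched the same way, again as an in-memory Python simulation rather than real Hadoop code. The function names and the sample (segment, balance) records are assumptions for illustration; they stand in for the output of the join task.

```python
from collections import defaultdict

def map_segment(record):
    # Each joined record is assumed to be a (segment, balance) pair;
    # the segment becomes the key for the shuffle.
    segment, balance = record
    return segment, balance

def shuffle(pairs):
    # Simulates the shuffle/sort phase: group values by key, keys sorted.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_sum(segment, balances):
    # Equivalent of SUM(balance) ... GROUP BY Segment for one segment.
    return segment, sum(balances)

def group_job(joined_records):
    mapped = [map_segment(r) for r in joined_records]
    return [reduce_sum(key, values) for key, values in shuffle(mapped).items()]

# Made-up sample input, shaped like the output of a join task.
joined_records = [("private", 100.0), ("private", 50.0), ("business", 70.0)]

totals = group_job(joined_records)
print(totals)
```

Each entry in totals is one row of the SQL result: a segment and the sum of its balances.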

[Figure: grouping task]

This example showed how RDBMS functionality can be implemented with MapReduce. It also helps to build a clearer understanding of Teradata's capability to delegate joining and aggregation functionality to MapReduce.
