Tag Archive

Tag Archives for " fault tolerance "
2

The Teradata Node Review

What is a Teradata Node?

Teradata Nodes are Linux systems (several nodes are packed together into one  Teradata cabinets) with several physical multicore CPUs and plenty of memory. On top of the Linux operating system, the parallel database extension software (PDE) is executed.

On each node, the primary processes of a Teradata Systems are being performed (see as well our article about the Teradata high-level architecture):

– The Parsing Engines
– The AMPs
– Two redundant BYNETs for the communication between AMPs and Parsing Engines.

As we already know, there is a lot of parallelisms built in within one node as workload is distributed evenly across all AMPs.

Scalability is one of the main benefits of the Teradata architecture; multiple nodes can be interconnected to an even bigger system.

In theory, doubling the number of nodes would cause a doubling in performance. In real life, this is a fairy tale.
You will spot the problem with this theory of linear scalability quickly if you think about the fact that this would require perfect parallelism in your workload.

But as all of us know from practice, we always are fighting with a skewed (wrong distributed) workload. We could have hundreds of nodes; they will not help us in performance if the workload of our SQL statement ends up on one AMP. Keep this in mind, when adding nodes to your system to avoid disappointment.

Further, from a fault tolerance point of view, we are limited regarding the number of nodes. Such architectures are not fault tolerant. A Teradata system scaling up to thousand of nodes is not possible. Concepts like hot standby nodes may relieve the situation a bit but are a costly method of buying some fault tolerance.

In the terminology of parallel systems, a single node is called symmetric multiprocessing node, any system containing at least two nodes is named a massive parallel system (MPP).

While the communication network (BYNET) within one node is realized as a piece of software, the network between nodes obviously has to be implemented in hardware. Still, the purpose is the same: Allow AMPs and Parsing Engines to communicate with each other, even across different nodes.

For performance and fault tolerance reasons there are always two BYNETs available.

As long as both networks are operating without errors, they are used simultaneously to increase flow rate. In case one of the networks fails there is still a backup available, and the systems continue its operation. Only the failure of both networks would make the Teradata inoperative.

While some years ago, the BYNET was one of the big advantages of Teradata as it takes over the tasks of sorting and merging data, relieving the CPU, today with the availability of multi-core processors, this benefit probably is not significant anymore. The change may is related to the switch from the proprietary BYNET to InfiniBand as the new backbone for data transmission.

Introduction to Parallel Database Architectures

The History of Parallel Database Architectures

Parallel computing is not a new invention, being available now for more than 30 years. Parallel database architectures followed soon.

Everything started with the shared memory architecture, which is still widely used this day in servers:

shared memory.

The shared memory architecture is a multiprocessor system with a common memory, a single operating system and a shared disk or storage area network for storing the data. Processes are scattered across the CPU’s.

One significant disadvantage of the shared memory architecture is that it is not scaling very well and therefore, even with modern multi-core processors, you will not find any setup containing more than a few hundred of processors.

Current databases are making use of other two architectures, namely the shared disk and the shared nothing architecture (Teradata is based on the shared nothing architecture):

shared nothing

Both architectures overcome the problem of limited scaling.

In the shared disk architecture, the processors communicate over a network, but still, only a shared disk or storage area network exists for storing the data. In a shared disk architecture, therefore, you can find two networks: the network for inter-processor communication and the storage system for communication with the disk.

The shared nothing architecture uses local disks for each processor. Only the network for processor communication exists.

Databases based on any of this architecture are built to execute queries in parallel, but in the shared nothing architecture, the data is additionally distributed across the disks.

Everybody familiar with Teradata will easily spot the relationship between the shared nothing architecture and the AMPs (processors), drives and the BYNET  (processor communication network).

One important goal for any shared nothing architecture based database system is the even distribution of data across the available disks. We all know how Teradata is doing this, namely by hashing the primary index columns and storing the rows on the related AMPs.

A column-store database would instead  distribute different columns across the disks. Still, the goal is the same: even data distribution to optimally exploit parallelism.

Parallel databases and fault tolerance or why Hadoop rules

Parallel databases are nowadays state of the art. But there is one issue which can’t be handled in a good way by parallel databases (based on any of the described architectures): Fault tolerance.

All architectures behave okay when there are a limited amount of processors available. The statistical probability of processor failure is small in this case.

The situation would look quite different in the event of a system with thousands of processors. Chance of a processor failure would be near to 100%.

Parallel databases based on one of the three described architectures fundamentally are not fault tolerant, that’s  why Teradata, for example, offers Hot Standby Nodes doing nothing else than replicating the data across a network. I would consider this a very costly approach, but it is a usual method (not only for Teradata).

If we are comparing parallel databases with the Map Reduce Framework (please take a look at my related article for further details), we will quickly spot the parallel file system (HDFS) and fault handling as a good base for fault tolerance.

It is easy to recognize that long running batch operations fit much better Map Reduce than a parallel database system like Teradata.

I guess now it becomes clearer why all big database vendors start playing in the big data arena.  Big data needs a different architecture.

Currently, the trend is “Hadoop over SQL” to make the transition as smooth as possible as SQL is the primary interaction language for most database systems.

We will see what the next years will bring. Teradata’s shot is SQL-H, other database vendors develop their integrations. None of them is currently really convincing me.

>