What is a Teradata Node?

Teradata nodes are Linux servers, housed together in a cabinet, each with multiple physical multicore CPUs and ample memory. Each node runs the Parallel Database Extensions (PDE) software on top of the Linux operating system.

Each node runs the main processes of a Teradata system (see our article about the Teradata high-level architecture):

– The Parsing Engines
– The AMPs
– Two redundant BYNETs for communication between the AMPs and the Parsing Engines

Parallelism within a node is achieved by distributing the workload evenly across all AMPs.
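The following Python sketch illustrates the idea of spreading rows evenly across AMPs by hashing a key. It is only a toy model: the function names are hypothetical, and the MD5-plus-modulo mapping is a stand-in assumption, not Teradata's actual row hash or hash map.

```python
import hashlib
from collections import Counter


def amp_for_key(key, num_amps):
    """Map a key to an AMP number with a stand-in hash (NOT Teradata's row hash)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def rows_per_amp(keys, num_amps):
    """Count how many rows land on each AMP."""
    counts = Counter(amp_for_key(k, num_amps) for k in keys)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# 100,000 distinct keys spread over 8 hypothetical AMPs
distribution = rows_per_amp(range(100_000), num_amps=8)
print(distribution)
```

With many distinct keys, every AMP should receive roughly the same number of rows, which is the precondition for all AMPs finishing their share of the work at about the same time.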

The Teradata architecture scales well: many nodes can be connected to form one large system.

The idea that doubling the number of nodes doubles performance is often called linear scalability. In practice, however, this is largely a myth: it holds only if your workload parallelizes perfectly.

Anyone with Teradata experience knows the problem of skewed workloads. Adding nodes does not help if the work of an SQL statement stays on a single AMP. Keep this in mind when expanding your system to avoid disappointment.
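The effect of skew on scalability can be shown with a small Python model. It assumes, as a simplification, that a parallel step is finished only when the busiest AMP is finished, so the elapsed time equals the largest per-AMP workload. All function names and numbers are illustrative.

```python
def elapsed_time(work_per_amp):
    """A parallel step ends when the busiest AMP ends (simplifying assumption)."""
    return max(work_per_amp)


def even_workload(total_work, num_amps):
    """Perfect distribution: every AMP gets the same share."""
    return [total_work / num_amps] * num_amps


def skewed_workload(total_work, num_amps):
    """Worst-case skew: all work sits on a single AMP."""
    return [total_work] + [0.0] * (num_amps - 1)


def speedup_from_doubling(workload_fn, total_work, num_amps):
    """How much faster the step runs after doubling the number of AMPs."""
    before = elapsed_time(workload_fn(total_work, num_amps))
    after = elapsed_time(workload_fn(total_work, num_amps * 2))
    return before / after


print(speedup_from_doubling(even_workload, 1000.0, 8))    # 2.0
print(speedup_from_doubling(skewed_workload, 1000.0, 8))  # 1.0
```

In this model, doubling the AMPs halves the elapsed time only for the evenly distributed workload. The fully skewed workload runs exactly as long as before.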

In addition, fault tolerance is limited, and this kind of architecture is not very resilient. Scaling a Teradata system to thousands of nodes is not feasible. Hot standby nodes provide some fault tolerance, but they are expensive.

In parallel-systems terminology, a single node is a symmetric multiprocessing (SMP) node. A system of at least two nodes is a massively parallel processing (MPP) system.

The BYNET is the communication network between AMPs and Parsing Engines. Within a single node it is implemented in software; between nodes it is implemented in hardware, so that AMPs and Parsing Engines on different nodes can communicate.

Two BYNETs are always present, for both performance and fault tolerance.

As long as both networks work without errors, they are used concurrently to maximize throughput. If one network fails, the other takes over and operation continues. Teradata becomes inoperative only if both networks fail.
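The behavior of the two networks can be sketched as a toy Python model. The class, the network names `bynet0` and `bynet1`, and the round-robin choice are assumptions made purely for illustration; they say nothing about how the BYNET actually balances traffic.

```python
class DualNetwork:
    """Toy model of two redundant networks (hypothetical, not a Teradata API)."""

    def __init__(self):
        self.healthy = {"bynet0": True, "bynet1": True}
        self._next = 0  # round-robin counter

    def fail(self, name):
        """Mark one network as failed."""
        self.healthy[name] = False

    def send(self, message):
        """Return the name of the network that carries the message."""
        available = [name for name, ok in self.healthy.items() if ok]
        if not available:
            raise RuntimeError("both networks are down; the system cannot operate")
        chosen = available[self._next % len(available)]
        self._next += 1
        return chosen


net = DualNetwork()
print([net.send(i) for i in range(4)])  # both networks carry traffic
net.fail("bynet0")
print([net.send(i) for i in range(4)])  # only bynet1 remains in use
```

While both networks are healthy, messages alternate between them. After one failure, all traffic moves to the surviving network, and only a second failure stops the model.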

Years ago, the BYNET gave Teradata a notable advantage because it could sort and merge data itself, reducing the CPU workload. With today's multicore processors, this benefit probably matters less. The shift from BYNET to InfiniBand as the primary data transmission backbone may also have contributed to this.

  • Hi Falcon. Thanks. I added the link.

    Regarding your question: Yes, each AMP has its own working memory. The so-called FSG-cache of each AMP is used to hold the data blocks read from disk.

  • Roland – Thanks for an informative article. I have 2 comments/questions:

    1. Can you also include the link in the article to ‘Teradata high-level architecture’ that you have referred to in your second paragraph?

    2. How is TD’s RAM organized? Does each AMP have its own memory in addition to dedicated disk space? If not, can TD still be considered a shared-nothing architecture RDBMS?
