Architecture Archives - DWHPRO

Archive

Category Archives for "Architecture"
2

Teradata Volatile Tables – Expectation vs. Reality

What is a Teradata Volatile Table?

Volatile tables are very similar to derived tables. In contrast to these, their lifetime is not limited to the duration of a query, but they exist until the end of the session in which they were created.

Volatile tables are created with the following syntax:

CREATE VOLATILE TABLE Customer
(
   CustomerId INTEGER NOT NULL,
   LastName VARCHAR(255)
) PRIMARY INDEX (CustomerId) ON COMMIT PRESERVE ROWS;

A comparison between Volatile Tables, Global Temporary Tables, and Derived Tables can be found here:

Comparison of temporary table types

Teradata Volatile Table Features

What kind of space does a Volatile Table use?

Volatile Tables are materialized in the spool space of the user.

Who can access a Volatile Table?

Volatile tables can only be accessed in the user's session in which they were materialized.

Is there a DBC table where I can see which Volatile Tables I have materialized?

No, but with the command HELP VOLATILE TABLE, all materialized tables are listed.

Where is the DDL of a volatile table stored if not in the Data Dictionary (DBC)?

The table definition is kept in the AMP cache.

How long is a volatile table available?

Volatile tables are automatically removed when a session is terminated or a DROP TABLE statement is executed.

What happens if Teradata is restarted while I have materialized a volatile table?

Unfortunately, the table is lost along with its contents.

Can an ALTER TABLE be executed on a volatile table?

This is not possible.

Can I create a secondary index on a volatile table?

This is only possible when the table is created. Therefore, you cannot first create the table and then execute a CREATE INDEX statement.

Can I create a Join Index or Hash Index on a volatile table?

No, that's not possible.

Who is the owner of a Volatile Table?

A Volatile Table belongs neither to a database nor to a user, but to the session of a user.

Can I create the same volatile table in different sessions?

Yes, but the content is session-specific, of course.

Can I use FALLBACK protection?

Yes, but this is not really useful because volatile sessions do not survive a system restart. It only makes sense to create a volatile table while an AMP is down.

Can a volatile table be used for performance tuning?

Volatile tables offer several possibilities for this. E.g. no DBC table is needed and therefore there can be no performance degradation if the competing workload on one of these tables holds e.g. a wide lock.

Can I create statistics on a volatile table?

Statistics are allowed, but when using the statement DIAGNOSTIC HELPSTAT ON FOR SESSION you have to be aware that they are still proposed in the EXPLAIN plan. This is because they are not available to the optimizer in a DBC table.

Can a Volatile Table be created as a PPI table?

Yes, this is possible.

Teradata Volatile Table Restrictions

  • No referential integrity can be defined
  • No CHECK constraints are allowed
  • DEFAULT values are not allowed
  • No TITLE can be defined for columns
  • No Join Index or Hash Index is allowed
  • Indexes can't be named
  • Permanent Journals can't be used
  • Column Partitioning is not allowed
  • No privileges can be set with a GRANT or REVOKE. That wouldn't make any sense either

Teradata Hot Standby Node explained in 2 Minutes

A hot standby node is a member of a group of nodes (clique) and there are no virtual processes running on it (AMPs, Parsing Engines).

The hot standby node waits for its mission, which takes place when a node crashes out of its clique.

Then all the AMPs of the crashed node are immediately migrated to the hot standby node. This happens practically without any loss of performance:

Node crashes, migration of AMPs takes place

The ingenious thing about the design is the following: As soon as the crashed node is online again, it becomes the new hot standby node:

Recovered Node becomes the new HSN

1

Doing Teradata Locking the Right Way

What is locking?

Locking of database objects is a must in RDBMS to avoid that multiple users can change and access the same data at the same time in a way that is causing data inconsistencies.

In other words, whenever a user is accessing or changing data, the affected objects are locked to ensure integrity.

For each access to data, Teradata decides how granular a lock must be and for what type of activity other users are locked.

The granularity of locking can also be increased during execution (e.g. from row lock to table lock).

Which Objects can be locked?

  • Database Locks: All Objects in a database are locked
  • Table Locks: All table rows are locked
  • View Locks: All table accesses by a view are locked
  • Partition Locks: All rows of a partition or several partitions are locked
  • Partition Range Locks: A range of a table's partitions are locked
  • Rowhash Locks: One ore more rows of a table are locked

Rowhash Locking Details

If a RowHash is locked this affects one or more Rows because all Rows with the same Primary Index Value have the same RowHash.

A RowHash lock is used in the following cases:

  • An unpartitioned table is queried via the primary index.
  • An UPDATE or DELETE statement is made by using a primary index or unique secondary index (the rowhash is locked in the USI subtable and in the base table).

For example, take the following UPDATE statement:

UPDATE Customer SET Gender = 'm';

Without using a WHERE condition, a write lock to the table is required.

If we change the query and use a WHERE condition on the primary index (CustomerId), a rowhash lock is used:

UPDATE Customer SET Gender = 'm' WHERE CustomerId = 1;

Rowhash Locking for Row-Partitioned Tables

If rowhash locking is performed on a row-partitioned table, the partition (or a range of partitions) can be locked at the same time, i.e. it can be locked more specifically.

In total, the RowHash is then locked within one or more partitions. This means that other requests can access rows in the table that are either in different partitions or have a different rowhash.

Teradata Locking Types

What is the Exclusive Lock?

Locks are only applied to databases or tables.
They are the strictest locks because all requests of all other users are queued if an object is exclusively locked. Exclusive locks are rarely used, usually when databases are restructured.

What is a Write Lock?

Write locks allow the owner of the lock to change data, all other users only reading with a so-called “dirty read”, because the data can be inconsistent. As long as there is a write lock, no new write or read locks are allowed.

What is a Read Lock?

Whenever a user holds a read lock on an object, no write locks or exclusive locks can be obtained by any other users. However, unlimited read locks (and access locks) can be held by other users on the same object at the same time.

What is an Access Lock?

Access locks allow a user to read “inconsistent” data. Access locks work even if another user holds already a write lock on the object. Access Locks are not allowed if there is an exclusive lock on the object.

How Teradata Locks are selected?

Teradata automatically takes care of which lock is best suited for a particular situation. But you as a user can influence it. Here is an example:

SELECT * FROM Customer;

The above query normally causes a read lock.

But if the consistency of the data is not important, you can downgrade the lock to an access lock:

LOCKING Customer FOR ACCESS
SELECT * FROM Customer;

The LOCKING FOR ACCESS modifier is particularly interesting because the data can be accessed even when the table is being written.

Analogous to the previous example you can upgrade the lock from a read lock to a write lock:

LOCKING Customer FOR WRITE
SELECT * FROM Customer;

Any user who needs a read lock, a write lock or exclusive lock on the customer table must now wait until your query is finished.

Locking Modifiers

LOCK DATABASE/TABLE/VIEW/ROW
FOR ACCESS/EXCLUSIVE/READ/WRITE
[{NOWAIT}

NOWAIT is optional and means that the SQL statement will be aborted if the desired lock is not available.

Locking Summary

  • Exclusive Lock: No concurrent access allowed
  • Write Lock: No concurrent read, write, and exclusive locks allowed
  • Read Lock: No concurrent write and exclusive locks allowed
  • Access Lock: No concurrent exclusive locks allowed
teradata locking
Teradata locking queues

See also:

Locking – The official Teradata documentation

6

Teradata Architecture Is So Famous, But Why?

Before we go into Teradata architecture in more detail, we need to talk about how a computer is built. After all, this is also the basis of a Teradata system.

Teradata architecture – why does everyone copy it?

Teradata, as one of the pioneers in data warehousing, was and is a role model for many subsequent database systems in terms of architecture.

Even if Teradata has come into the years, the developers of the system have already considered many details from the beginning that still make Teradata capable of competing today.

If we look at various modern database systems today, such as Redshift from Amazon (or Netezza), for example, we can recognize many things that were used by Teradata for the first time.

Teradata was designed from the beginning for parallelism in every smallest detail and can therefore still be found today among the top RDBMS.

single computer
Single Computer

Data is permanently stored on mass memory and is loaded into the main memory for processing by the CPU.

It is important to understand that accessing the mass storage device is many times slower than accessing the main memory. Accessing the main memory is much slower than accessing data already in one of the CPU caches.

Before data can be processed by the CPU, it must be loaded into the main memory (if it is not already there).

The Teradata architecture can now be easily imagined as a number of individual computers that can communicate with each other:

teradata architecture
Teradata System

Teradata Data Distribution

To split the workload, Teradata uses a hashing algorithm that distributes the rows of each table evenly among the so-called AMPs (we will talk later in this article about exactly what an AMP is and what its tasks are. For now it's sufficient to know that AMPs are doing the main work).

teradata hashing
Data Distribution by Hashing

The Parsing Engine

An important part of the Teradata architecture is the Parsing Engine (PE).

The Parsing engine receives a request (e.g. an SQL statement) and generates an execution plan for all AMPS that are required to complete the request. Ideally, the plan is structured so that all AMPs start and finish their tasks at the same time. This ensures optimal parallel utilization of the system.

parsing engine
The Parsing Engine controls the AMPs

As you can see in the figure above, between the AMPs and the parsing engine is the BYNET, which represents the communication network over which both the data and instructions are exchanged. We talk about the BYNET in detail later in this article.

The Parsing Engine has the following main tasks:

  • Logging on and Logging Off Sessions
  • The parsing of requests (syntax check, checking authorizations)
  • Preparation and optimization of the execution plan
  • The Parsing Engines uses statistics to build an optimized plan.
  • Controlling the AMPs by Instructions
  • Communication with the client software
  • EBCDIC to ASCII conversion in both directions
  • Transfers of the result of a request to the client tool

Each Teradata System can use multiple parsing engines.

The number of parsing engines can be increased by the system as needed because each parsing engine can only process a limited number of sessions.

The Teradata AMP

AMPs are the real workers in a Teradata System who execute the instructions they receive from the Parsing Engine (the Execution Plan).

AMPs are independent units that have their own part of the main memory and mass storage allocated to them.

The allocation is exclusive, i.e. no AMP has access to the resources of another AMP.

These are the main tasks of an AMP:

  • Storing and retrieving of rows
  • Sorting of rows
  • Aggregation of rows
  • Joining
  • Locking of tables and rows
  • Conversion ASCII to EBCDIC (if the client is a mainframe)
  • Management of its assigned space
  • Sending of rows to the Parsing Engine or other AMPs (via the BYNET)

The Teradata Node

Parsing engines and AMPs are processed and run on a node. A node is usually a Linux machine equipped with multiple physical CPUs.

Each node can run hundreds of AMPs. Each AMP has its own portion of the main memory and its own portion of mass memory (called virtual disk).

Teradata Node
The Teradata Node

Nodes are connected to a disk array, and each AMP is assigned a part of it as a logical disk. Nowadays, SSDs are used and management is done by the Teradata Intelligent Memory system. But the principle is the same.

Disk array
Node with Disk Array managed by Teradata Intelligent Memory

Massive Parallel Processing

A Teradata system can consist of a large number of nodes. These, in turn, are connected to each other via BYNET.

However, this is a physical network, while the BYNET within a node connects the AMPs with the parsing Engine and with each other, is implemented in software:

Two Nodes combined with Hardware BYNET. Within each Node BYNET is Software
See how Hashing is done on Teradata

Another view on the Teradata Design by TutorialsPoint

What I Wish Everyone Knew About The Parallel Teradata Design

Teradata Query Parallelism

If we run a query on a Teradata system, it runs in parallel in every step, no matter if it is a join, sorting, aggregation step, etc.

The great advantage is that this applies to each step of the query process. That's what makes Teradata unique. The big advantage Teradata has over other database systems is that its high level of parallelism was part of the Teradata architecture from the start many years ago (when most of its components were implemented in hardware instead of software)

In the course of time, even new features have been made possible in order to further increase parallelism. However, much would not have been possible without the architecture underlying Teradata.

Parallel Execution across the AMPs

AMPs are independent processes that perform a wide range of activities independently of each other. Each SQL query is first broken down into subtasks assigned to the AMPs. Each AMP completes its tasks and delivers a partial result. When all AMPs are ready, the overall result is returned.

Parallelism at AMP level is one of the main reasons for a Teradata System offering tremendous performance when used the right way.

AMPs are not specialized but capable of performing any kind of task.

Today there are hundreds of AMPs on each Teradata System available.

The Tasks of a Teradata AMP

  • Reading of Rows
  • Writing of Rows
  • Row Locking
  • Sorting
  • Aggregating
  • Index creation and maintenance
  • Maintaining the transaction log
  • Backup and Recovery
  • Bulk and Transactional Loading

teradata AMP

The Components of a Teradata AMP

Each AMP has its own exclusive resources:

  • Logical storage unit
  • Memory
  • CPU

Teradata AMP

Since Teradata is a shared-nothing architecture, all resources of an AMP are only assigned to this AMP. An extension of the system (by adding hardware) therefore goes hand in hand with the possibility of linear growth in performance.

The Teradata Primary Index – Hash Partitioning

Parallelism is achieved by hash partitioning. The existing data is evenly distributed among the existing AMPS, with the goal that each AMP must perform approximately the same.

Hash partitioning works great to distribute large amounts of data to the AMPs. The drawback is that individual AMPs can represent a bottleneck if the task distribution is not even – skewing is the result and this pressure will often occur.

Skewing is one of the main problems a performance tuner on a Teradata system has to solve.

Hash partitioning is easily achieved by defining one or more columns for each table from which a hash value is calculated. This hash value determines the AMP for each row. The columns used for hash partitioning are the so-called Primary Index:

CREATE TABLE Customer(Customer_ID BIGINT NOT NULL,Lastname VARCHAR(500),Firstname VARCHAR(500)) UNIQUE PRIMARY INDEX (Customer_Id);

Teradata Pipelining Steps

Pipelining increases the parallelism of queries even more. Pipelining makes it possible to start a step even if the predecessor step is not yet finished.

As already mentioned, a request is split into several steps. Each step represents a subtask. This can be, for example:

  • Read all rows of a table (simple step)
  • Update a subset of table rows (simple step)
  • Read two tables, redistribute them and join them (complex step)

Steps can have different complexity, and AMPs may also need to interact with each other.

In a join step (which is complex), pipelining allows starting already the join activity while rows of the participating tables are still being redistributed.

Teradata Parallel Multi-Steps

While pipelining steps are nested into each other, i.e. the input of the predecessor step is made available to the successor as early as possible, Teradata also has another level of parallelism available.

Steps that are independent of each other can be executed in parallel.

The Teradata BYNET

Teradata is a shared-nothing architecture and as we know the work is done in parallel by the AMPs.

But since there is a need for AMPs to communicate with each other, a network is needed to exchange both messages and data: The BYNET.

The Tasks of the BYNET

However, BYNET is more than just a normal network. It has functionality that has been specially developed for Teradata:

  • Message Delivery: Guarantees that messages arrive at the target AMP
  • Coordinate multiple AMPs working on the same step
  • Sorting of the final result set when sending to the client
  • Minimizes the number of AMPs needed for a step
  • Congestion control to avoid an overloaded network

Message Passing & the BYNET

To better understand the tasks of BYNET we have to introduce a new kind of virtual process: The Parsing Engine. It is responsible for creating the execution plan of a request. The BYNET is the interface between the Parsing Engine and the AMPs.

Messages can be sent from the Parsing Engine via BYNET to the AMPs, but BYNET is also responsible for the AMP to AMP communication.

BYNET can send messages to all AMPs, a group of AMPS, or even a single AMP.

Sorting the Final Answer Set

Which is unique in Teradata:

The sorting of the ResultSet is done in parallel, at each level (AMP, Node, BYNET, Parsing Engine) data is pre-sorted to avoid an expensive Sort Step at the end:

  • Each AMP locally sorts its data (this is done in parallel)
  • Each Node takes one buffer of data from all its AMPs and sorts it (buffer by buffer by AMP)
  • The BYNET passes one buffer per Node to the Parsing Engine which does the final sort.

These were the essential components of the Teradata Shared Nothing architecture. More details about how data is stored can be found here:

The ultimate Teradata Physical Storage guide – Part 1

The ultimate Teradata Physical Storage guide – Part 2

3

Teradata Surrogate Keys Guidelines

What are Teradata Surrogate Keys?

Surrogate keys are used in Teradata to map the natural keys of source systems to a unique key.

Usually, one or more natural key columns are mapped to a surrogate key that is worth an INTEGER. Often a consecutive number is generated.

For a surrogate key in Teradata, the ROW_NUMBER() function can be used, or an IDENTITY column.

The advantage of an IDENTITY column is that a new surrogate key is generated by the system as soon as a row is inserted into the key table.

However, it is important to define the IDENTITY column so that no duplicates are created.

For this, the IDENTITY column must be defined with GENERATED ALWAYS and NO CYCLE.

GENERATED always prevents the surrogate key column from being updated, NO CYCLE prevents the reuse of values – If the highest possible number for an integer data type is reached, no new surrogate keys can be generated and an error is reported.

Detailed information about IDENTITY columns is available in table DBC.IdCol

Disaster Reload of Key Tables with IDENTITY columns

If a key table that uses IDENTITY columns has to be reloaded, please pay attention to the following to avoid duplicates:

  • The key table needs to be re-created with a new START and MINVALUE numbers to exclude the numbers that were generated before to avoid the same number to be generated again.
  • The definition of the identity column needs to be changed from “GENERATED ALWAYS” to “GENERATED BY DEFAULT” to be able to reload the keys that were generated before.
  • New surrogate keys will be generated only when NULL values are passed to the identity column.

Here is an example where the highest value of the IDENTITY column is 200,000:

CREATE MULTISET TABLE TheKeys 
(
   NATURAL_KEY VARCHAR(500) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
   SURROGATE_KEY BIGINT GENERATED ALWAYS AS IDENTITY
   (START WITH 1 INCREMENT BY 1 
    MINVALUE -999999999999999 
    MAXVALUE 999999999999999 
    NO CYCLE),
)
UNIQUE PRIMARY INDEX (NATURAL_KEY);

If we need to reload the key table the following DDL is required:

CREATE MULTISET TABLE TheKeys 
(
   NATURAL_KEY VARCHAR(500) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
   SURROGATE_KEY BIGINT GENERATED BY DEFAULT AS IDENTITY
   (START WITH 1 INCREMENT BY 1 
    MINVALUE 200000 
    MAXVALUE 999999999999999 
    NO CYCLE),
)
UNIQUE PRIMARY INDEX (NATURAL_KEY);

Why should you use Surrogate Keys?

There are several reasons why it is a good idea to replace the natural keys with Teradata surrogate keys.

Different source systems deliver the same natural key for different information.

You may have more than one source system delivering the same value in the natural key for the same target table but with a different meaning.

This fact requires you to use surrogate keys because keeping the natural key from both sources would not allow you to distinguish the objects.

Different source systems deliver a different natural key for the same information.

But it can also be that the same information comes from different source systems with different natural keys. For example, the customers of a bank may be kept in different source systems. If it is necessary to represent these as a single customer in the data warehouse, surrogate keys are perfectly suited for this.

Unfortunately, many Teradata data warehouse implementations decide against surrogate keys and put the effort of integrating the information into the reports.

This approach will ultimately cause more costs than having a right surrogate key design from the begin (don't forget, such an approach requires to repeat the logic of combining object information into each and every single report).

Exchange of Source System leads to the Delivery of new Natural Keys.

If the natural keys of the source system are used, and you undergo a replacement of the original system you end up in a bad situation.

In this case, let's compare the effort caused by it.

If we have surrogate keys available, only the mapping table (Natural Keys -> Surrogate Keys) has to be adjusted, all surrogate keys have to be assigned once to the new natural keys.

If we only have the natural keys available in our data warehouse, this usually means a complete re-design. The old natural keys must be replaced everywhere by the new natural keys!

In general, you never should tightly couple your data model to the source system structures. An invoice table should be called Invoice and not something like SAP-Invoice. Data Warehousing is not the 1:1 storage of operational data.

At the begin of a project, it is always more agile to avoid surrogate keys and integrate source system tables 1:1.

However, this only shifts costs to the end of the chain. Changes that have to be made afterward are many times more expensive!

Teradata Surrogates Keys and Performance

Surrogate keys, if properly used, may increase your performance as you can replace several natural key columns by an integer column.

Often, natural key columns are character columns. From a performance point of view, this makes no difference for the hashing of the primary index. Where there is a negative effect on performance, however, is with joins, since Teradata must take character sets, etc. into account here.

Depending on the number of columns and data types of the natural key, this can have a negligible impact on performance or a noticeable effect.

1 2 3 10
>