Teradata Intelligent Memory

Roland Wenzlofsky

May 2, 2014


Many arguments are put forward in favor of in-memory databases. If memory were unlimited, holding all data in memory would give the fastest access; I think there is no doubt about this.

As we all know, memory access has always been, and will remain, far faster than access to hard disks or solid-state drives (SSDs).

While pure in-memory databases like SAP HANA offer the fastest access to the data, Teradata decided to go for the 80-20 rule. The rule states that only 20% of the data is accessed very frequently, so it is good enough (from a cost perspective) to keep just this very frequently accessed data in memory. The less frequently accessed data is made available on slower storage (in reality, the split is not exactly 80-20, but you get the idea).
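
To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that the frequently accessed 20% of the data receives about 80% of all accesses. The latency figures are illustrative placeholders, not Teradata measurements, and the function name is made up for this example.

```python
def average_access_time(hit_ratio, memory_latency_us, storage_latency_us):
    """Expected latency per access when `hit_ratio` of all accesses hit memory."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * memory_latency_us + (1.0 - hit_ratio) * storage_latency_us


# Assumed, illustrative latencies in microseconds (not measured values).
MEMORY_US = 0.1
DISK_US = 5000.0

# Everything on disk versus the hot data held in memory.
all_on_disk = average_access_time(0.0, MEMORY_US, DISK_US)
hot_in_memory = average_access_time(0.8, MEMORY_US, DISK_US)

print(f"all on disk:   {all_on_disk:.2f} us")
print(f"hot in memory: {hot_in_memory:.2f} us")
```

With these assumed numbers, the average access time drops to roughly a fifth of the disk-only value, even though only a small part of the data is held in memory.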

Teradata classifies data as cold, warm, hot, and very hot, a classification based mainly on how frequently the data is accessed. The overall architecture is named Teradata Intelligent Memory.

Teradata Intelligent Memory is available starting with Teradata 14.10 and can be used on all systems running this release.
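
The idea of a data temperature can be sketched in a few lines of Python. This is only an illustration: the four temperature names come from the classification above, but the function and the thresholds are invented for this example and say nothing about how Teradata measures temperature internally.

```python
from enum import Enum


class Temperature(Enum):
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"
    VERY_HOT = "very hot"


def classify(accesses_per_day, warm_at=1, hot_at=100, very_hot_at=10_000):
    """Map an access frequency to a temperature (thresholds are hypothetical)."""
    if accesses_per_day >= very_hot_at:
        return Temperature.VERY_HOT
    if accesses_per_day >= hot_at:
        return Temperature.HOT
    if accesses_per_day >= warm_at:
        return Temperature.WARM
    return Temperature.COLD


for count in (0, 5, 250, 50_000):
    print(count, "->", classify(count).value)
```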

To understand the concepts behind “data temperature” and the relation to the storage type, we must know how data is stored in a Teradata system.

One requirement for the Teradata AMPs is that data blocks must be available in memory before an AMP can operate on the rows. Therefore, each AMP has its own memory (the FSG cache). Starting with Teradata 14.10, AMPs have additional memory that is used for the very hot data. Data blocks moved into this memory may stay there for days.

But how does Teradata handle the “not-so-hot” data, i.e., cold, warm, and hot data?

If solid-state drives are available, they are the next choice. We all know that solid-state storage is faster than hard disks.

But even on hard drives, it matters greatly in which cylinder the data blocks are located.

Below you can see the structure of a hard drive. Each circle represents one disk cylinder.

As you may know, data blocks are stored in cylinders. A full cylinder can be read into memory in one revolution of the drive. As you will immediately recognize, an outer (red) cylinder is longer than an inner (blue) one, so Teradata can place more data blocks in it. Therefore, more blocks can be read into memory with one revolution from an outer cylinder than from an inner one.

[Figure: Teradata Intelligent Memory]
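
The geometric argument can be checked with a small Python calculation. It is a simplified sketch: it assumes blocks are written at a constant length along the circumference, and the radii and block length are made-up numbers. Real drives use zoned recording, so the exact figures differ, but the direction of the effect is the same.

```python
import math


def blocks_per_revolution(radius_mm, block_length_mm):
    """Whole blocks that fit on one circle of the given radius."""
    circumference_mm = 2 * math.pi * radius_mm
    return int(circumference_mm // block_length_mm)


# Hypothetical geometry, chosen only for illustration.
INNER_RADIUS_MM = 20.0
OUTER_RADIUS_MM = 45.0
BLOCK_LENGTH_MM = 1.0

inner = blocks_per_revolution(INNER_RADIUS_MM, BLOCK_LENGTH_MM)
outer = blocks_per_revolution(OUTER_RADIUS_MM, BLOCK_LENGTH_MM)

print(f"inner cylinder: {inner} blocks per revolution")
print(f"outer cylinder: {outer} blocks per revolution")
```

Because both circles pass under the head in the same time, the outer one delivers more than twice as many blocks per revolution in this example.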

Teradata takes advantage of this fact and puts the cold data into the inner cylinders, while Teradata stores hotter data in the outer cylinders.

There is a hierarchy in Teradata’s intelligent memory strategy from memory to SSD to disks (while considering the different cylinders’ positions).
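
The hierarchy described above can be sketched as a simple placement routine in Python. Everything here is an assumption made for illustration: the tier names, the capacities, and the greedy "hottest first" strategy are not taken from Teradata's implementation.

```python
# Fastest tier first; the last tier takes whatever is left over.
TIERS = ["memory", "ssd", "outer cylinders", "inner cylinders"]


def place_blocks(access_counts, capacities):
    """Assign the most frequently accessed blocks to the fastest tiers.

    access_counts: dict of block id -> number of accesses
    capacities:    dict of tier name -> maximum number of blocks
                   (the last tier is treated as unlimited)
    """
    # Hottest blocks first; ties are broken by block id for stable output.
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    placement = {}
    tier_index = 0
    used = 0
    for block in ranked:
        # Move on to the next slower tier once the current one is full.
        while (tier_index < len(TIERS) - 1
               and used >= capacities.get(TIERS[tier_index], 0)):
            tier_index += 1
            used = 0
        placement[block] = TIERS[tier_index]
        used += 1
    return placement


counts = {"a": 900, "b": 500, "c": 450, "d": 40, "e": 30, "f": 2, "g": 1}
limits = {"memory": 1, "ssd": 2, "outer cylinders": 2}
placement = place_blocks(counts, limits)

for block, tier in placement.items():
    print(block, "->", tier)
```

A real system would have to re-evaluate the placement continuously as access patterns change; this sketch only shows a single snapshot.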

While Teradata’s solution can never be as fast as real in-memory database solutions, it is a fair tradeoff between storage limits (which you have on in-memory databases) and performance.

  • I have some basic questions.
    If you can help, that would be really helpful.

    Question 1
    Memory used by AMP
    Spool Space
    FSG Cache
    Please confirm

    Question 2
    Is FSG cache memory the same as spool space, where data is moved when joins take place?
    Question 3
    Each AMP has its own RAM, so RAM is not FSG cache memory. Is its memory fixed in the AMP, or is it also decided by PDE, as is done for FSG?
    Question 4
    Initially, data is distributed across AMPs, so where is this data stored? Is it on the Vdisk or on the Pdisks? Is a Vdisk a combination of Pdisks?
    Question 5
    If possible, can someone please provide an image of what is inside an AMP?

  • Timm Rüger says:

    That’s more like catching up with the multi-temperature data management that DB2 10.1 already introduced in 2012.
    Now DB2 with BLU is a step further again: it offers true in-memory capability, as SAP HANA does. In-memory is more than just storing the records in RAM. It is also about applying intelligent features to reduce the data volume that has to be kept in memory or pushed through the execution pipelines of the CPUs. That’s why data skipping, columnar storage, sort-order-preserving compression algorithms like Huffman encoding (IBM calls it “actionable compression”), and SIMD processing play a critical role in true in-memory databases. Unfortunately, Teradata is not yet there.
