There are a lot of arguments brought forward for in-memory databases. If we had unlimited memory, holding all data in memory would allow the fastest access. I think there is no doubt about this.
As we all know, memory access times are, and will remain, far lower than those of hard disks or solid-state drives (SSDs).
While pure in-memory databases like SAP HANA offer the fastest access to the data, Teradata decided to go for the 80-20 rule. It states that only 20% of the data is accessed very frequently, so it is good enough (from a cost perspective) to keep just this frequently accessed data in memory. The less frequently accessed data is kept on slower storage. (In reality, the split is not exactly 80-20, but you get the idea.)
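To make the cost argument concrete, here is a rough back-of-the-envelope sketch. It assumes that the frequently accessed 20% of the data attracts about 80% of all accesses, and the latency figures are illustrative placeholders, not Teradata measurements.

```python
def effective_access_time(hit_ratio, memory_ns, disk_ns):
    """Average access time when `hit_ratio` of all accesses are served from memory."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * memory_ns + (1.0 - hit_ratio) * disk_ns


# Assumed, illustrative latencies in nanoseconds (not measured on any real system).
MEMORY_NS = 100
DISK_NS = 10_000_000

all_on_disk = effective_access_time(0.0, MEMORY_NS, DISK_NS)
mostly_in_memory = effective_access_time(0.8, MEMORY_NS, DISK_NS)

# Ratio between the two scenarios under the assumptions above.
speedup = all_on_disk / mostly_in_memory
```

Under these assumptions, serving 80% of the accesses from memory cuts the average access time to roughly a fifth, even though only a small part of the data is held in memory.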
Teradata classifies data as cold, warm, hot, or very hot. This classification is based mainly on how frequently the data is accessed. The overall architecture is named Teradata Intelligent Memory.
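The idea of a data temperature can be sketched in a few lines. The following is only a toy illustration: the function name, the percentage cut-offs, and the ranking approach are assumptions made for this example and do not describe Teradata's internal algorithm.

```python
# Hypothetical cut-offs: cumulative share (in percent) of blocks, ranked by
# access count, that belong to each temperature. Not Teradata's real values.
VERY_HOT_PCT = 5
HOT_PCT = 20
WARM_PCT = 50


def classify_by_temperature(access_counts):
    """Return {block_id: temperature} based on each block's rank by access count."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    total = len(ranked)
    labels = {}
    for position, block_id in enumerate(ranked):
        # Integer comparison avoids floating-point rounding at the boundaries.
        if position * 100 < VERY_HOT_PCT * total:
            labels[block_id] = "very hot"
        elif position * 100 < HOT_PCT * total:
            labels[block_id] = "hot"
        elif position * 100 < WARM_PCT * total:
            labels[block_id] = "warm"
        else:
            labels[block_id] = "cold"
    return labels


# Example: 20 blocks, where block 0 is accessed most often and block 19 least.
example_counts = {block_id: 20 - block_id for block_id in range(20)}
example_labels = classify_by_temperature(example_counts)
```

In this example, the single most accessed block is labeled very hot, the next three hot, the following six warm, and the remaining ten cold.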
Teradata Intelligent Memory is available starting with Teradata 14.10 and can be used on all systems running this release.
To understand the concepts behind “data temperature” and the relation to the storage type, we must know how data is stored in a Teradata system.
One requirement for the Teradata AMPs is that data blocks must be made available in memory before the AMP can operate on the rows. Therefore, each AMP has its own memory (the FSG cache). Starting with Teradata 14.10, AMPs have additional memory that is used for the very hot data. Data blocks moved into this memory may stay there for days.
But how does Teradata handle the “not-so-hot” data, i.e., cold, warm, and hot data?
If solid-state drives are available, they are the next choice. We all know that solid-state storage is faster than hard disks.
But even on hard drives, it matters greatly in which cylinder the data blocks are located.
Below you can see the structure of a hard drive. Each circle represents one disk cylinder.
As you may know, data blocks are stored in cylinders. A full cylinder can be read into memory in a single revolution of the drive. As you will immediately recognize, Teradata can place more data blocks in the outer (red) cylinder than in the inner (blue) cylinder. Therefore, more blocks can be read into memory with one revolution from the outer cylinder than from the inner cylinder.
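The geometry behind this can be checked with a small calculation. The sketch below assumes that data is written at the same linear density everywhere on the platter, so the number of blocks per revolution grows with the circumference. The radii and the block length are made-up values for illustration only.

```python
import math


def blocks_per_revolution(radius_mm, block_length_mm):
    """Whole blocks that fit on one circle of the given radius at constant density."""
    circumference_mm = 2 * math.pi * radius_mm
    return int(circumference_mm // block_length_mm)


# Assumed geometry: outer circle at 45 mm, inner circle at 20 mm,
# and a hypothetical block occupying 1 mm along the circle.
outer_blocks = blocks_per_revolution(45, 1.0)
inner_blocks = blocks_per_revolution(20, 1.0)
```

With these assumed values, one revolution delivers more than twice as many blocks from the outer circle as from the inner one, while the time for the revolution is the same in both cases.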
Teradata takes advantage of this fact and puts the cold data into the inner cylinders, while Teradata stores hotter data in the outer cylinders.
There is a hierarchy in Teradata’s intelligent memory strategy from memory to SSD to disks (while considering the different cylinders’ positions).
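The hierarchy can be summarized as a simple lookup. This is a simplified sketch, not Teradata's actual placement logic: the function name is hypothetical, and the exact assignment of hot and warm data to SSD and outer cylinders is an assumption made for illustration. Only "very hot in memory" and "cold on the inner cylinders" are taken directly from the description above.

```python
def choose_storage_tier(temperature, ssd_available):
    """Pick a storage tier for a data block based on its temperature (illustrative)."""
    if temperature == "very hot":
        return "memory"
    if temperature == "hot":
        # Assumption: hot data goes to SSD when the system has one.
        return "ssd" if ssd_available else "disk (outer cylinders)"
    if temperature == "warm":
        # Assumption: warm data is the hottest data left on spinning disks.
        return "disk (outer cylinders)"
    if temperature == "cold":
        return "disk (inner cylinders)"
    raise ValueError(f"unknown temperature: {temperature!r}")


# Example: the same hot block on a system with and without SSDs.
with_ssd = choose_storage_tier("hot", ssd_available=True)
without_ssd = choose_storage_tier("hot", ssd_available=False)
```

On a system without SSDs, the sketch falls back to the outer cylinders for hot data, which mirrors the point that cylinder position still matters on hard drives.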
While Teradata’s solution can never be as fast as real in-memory database solutions, it is a fair tradeoff between storage limits (which you have on in-memory databases) and performance.