Many arguments are made in favor of in-memory databases. In principle, if we had unlimited memory, holding all data in memory would allow the fastest access. I think there is no doubt about this.
As we all know, there has always been, and will continue to be, a huge difference in access times between memory and hard disks or solid-state drives (SSDs).
While pure in-memory databases like SAP HANA offer the fastest access to the data, Teradata decided to follow the 80-20 rule. It states that only 20% of the data is accessed very frequently, so it is good enough (from a cost perspective) to keep just this frequently accessed data in memory. The less frequently accessed data is kept on slower storage. (In reality, the split is not exactly 80-20, but you get the idea.)
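To make the cost argument concrete, here is a minimal sketch of the average access time when only part of the accesses is served from memory. The latency figures and the assumption that 80% of all accesses hit the frequently used 20% of the data are illustrative guesses, not Teradata measurements.

```python
# Illustrative latencies only (assumed, not measured on any Teradata system).
MEMORY_LATENCY_US = 0.1      # assumed: about 100 nanoseconds per access
DISK_LATENCY_US = 5_000.0    # assumed: about 5 milliseconds per access


def average_access_time_us(memory_hit_ratio: float) -> float:
    """Expected latency when a fraction of accesses is served from memory."""
    if not 0.0 <= memory_hit_ratio <= 1.0:
        raise ValueError("memory_hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - memory_hit_ratio
    return memory_hit_ratio * MEMORY_LATENCY_US + miss_ratio * DISK_LATENCY_US


# Compare all-disk, the assumed 80% hit ratio, and all-memory.
for ratio in (0.0, 0.8, 1.0):
    latency = average_access_time_us(ratio)
    print(f"{ratio:.0%} of accesses from memory: {latency:,.2f} us")
```

Under these assumed figures, serving 80% of the accesses from memory cuts the average latency to roughly one fifth of the all-disk case, while only a fraction of the data has to fit in memory. The remaining disk accesses still dominate the average, which is why a pure in-memory system stays faster.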
Teradata classifies data as cold, warm, hot, or very hot, mainly based on how frequently the data is accessed. The overall architecture is named Teradata Intelligent Memory.
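The idea of deriving a temperature from the access frequency can be sketched as follows. The thresholds and the function name `classify` are hypothetical and only illustrate the principle; the rules Teradata actually applies are not described in this article.

```python
from bisect import bisect_right

# Hypothetical thresholds in accesses per day (assumptions for illustration).
THRESHOLDS = (10, 100, 1_000)
TEMPERATURES = ("cold", "warm", "hot", "very hot")


def classify(accesses_per_day: int) -> str:
    """Map an access frequency to one of the four temperature labels."""
    # bisect_right counts how many thresholds the value has reached or passed.
    return TEMPERATURES[bisect_right(THRESHOLDS, accesses_per_day)]


for accesses in (3, 50, 400, 25_000):
    print(accesses, "->", classify(accesses))
```

A real system would measure the frequency over a sliding time window so that data can cool down again once it is no longer accessed.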
Teradata Intelligent Memory is available starting with Teradata 14.10 and can be used on all systems running this release.
To understand the concept of “data temperature” and its relation to the storage type, we have to know how data is stored in a Teradata system.
One requirement of the Teradata AMPs is that data blocks must be in memory before any operation on their rows can occur. For this reason, each AMP has always had its own memory (the FSG cache). Starting with Teradata 14.10, AMPs additionally have memory that is used for the very hot data. Data blocks moved into this memory may stay there for days.
But how does Teradata handle the “not-so-hot” data, i.e., cold, warm, and hot data?
If solid-state drives are available, they are the next choice. We all know that solid-state storage is faster than hard disks.
But even on hard drives, it makes a big difference which cylinder the data blocks are located on.
Below you can see the structure of a hard drive. Each circle represents one disk cylinder.
As you may know, data blocks are stored in cylinders. When a drive completes one revolution, a full cylinder can be read into memory. As you will immediately recognize, many more data blocks can be placed in the outer (red) cylinder than in the inner (blue) cylinder. Therefore, with one revolution, many more blocks can be read into memory from the outer cylinder than from the inner cylinder.
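A small calculation illustrates why the outer cylinder holds more blocks. This sketch assumes that blocks are written with the same density along every track, so capacity grows with the circumference. The radii and the block length are made-up values, not the geometry of any real drive.

```python
import math

# Assumed geometry for illustration; real drives differ.
INNER_RADIUS_MM = 20.0
OUTER_RADIUS_MM = 45.0
BLOCK_LENGTH_MM = 0.5   # hypothetical length one data block occupies on a track


def blocks_per_revolution(radius_mm: float) -> int:
    """Whole blocks that fit on one circular track at constant linear density."""
    circumference_mm = 2 * math.pi * radius_mm
    return int(circumference_mm // BLOCK_LENGTH_MM)


inner = blocks_per_revolution(INNER_RADIUS_MM)
outer = blocks_per_revolution(OUTER_RADIUS_MM)
print(f"inner: {inner} blocks, outer: {outer} blocks, ratio: {outer / inner:.2f}")
```

Because the rotation speed is the same for every cylinder, the number of blocks read per revolution scales roughly with the ratio of the radii, which is 2.25 for the values assumed here.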
Teradata takes advantage of this fact and puts the cold data into the inner cylinders, while hotter data will be stored in the outer cylinders.
There is a hierarchy in Teradata’s intelligent memory strategy from memory to SSD to disks (while taking into account the different positions of cylinders).
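The hierarchy can be summarized in a short sketch. The function `choose_tier` and the tier names are hypothetical. The placement of very hot data in memory and of cold data on the inner cylinders follows the description above; which temperatures end up on SSD is an assumption made here for illustration.

```python
def choose_tier(temperature: str, ssd_available: bool) -> str:
    """Pick a storage tier for a data block (illustrative rules only)."""
    if temperature == "very hot":
        return "memory"
    if temperature == "hot" and ssd_available:
        # Assumption: hot data goes to SSD when the system has one.
        return "ssd"
    if temperature in ("hot", "warm"):
        return "hdd outer cylinders"
    if temperature == "cold":
        return "hdd inner cylinders"
    raise ValueError(f"unknown temperature: {temperature!r}")


for temp in ("very hot", "hot", "warm", "cold"):
    print(f"{temp:>8}: {choose_tier(temp, ssd_available=True)}")
```

On a system without SSDs, the same function places hot data on the outer cylinders, so the ordering from fastest to slowest storage is preserved.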
While Teradata’s solution can probably never be as fast as a pure in-memory database, it is a fair tradeoff between storage limits (which in-memory databases have) and performance.