Improve Query Performance By Using The Teradata CLOB Datatype For Storing Longer Texts

Longer texts, such as descriptions of other columns, are stored in VARCHAR() columns. The maximum character capacity for storage is 64,000 for the LATIN character set and 32,000 for the UNICODE character set. When considering the impact of longer texts on query performance, it is important to note that the number of rows per data block decreases with longer text inputs, resulting in more IO-intensive full table scans.

The Teradata CLOB Datatype

CLOBs are a superior option for storing long strings as they are stored in a distinct sub-table. By separating CLOBs from the main table, queries accessing the table with a full table scan can be significantly enhanced. This is because Teradata reads fewer data blocks when the CLOB column is not selected, resulting in fewer IOs.

Below is an example table with detailed descriptions in each row’s “Desc” column.

CREATE TABLE Varchar_Table
(
ID INTEGER NOT NULL,
COl1 CHAR(01),
Desc VARCHAR(30000)
) PRIMARY INDEX (ID);

The absence of a secondary index requires copying all data blocks from the disk to the FSG cache, regardless of our interest in only the “ID” column content. Given that all rows contain extensive textual information in the “Desc” column, this results in a substantial number of IOs.

SELECT ID FROM Varchar_Table WHERE COl1 = 'X' ;

Changing the data type from VARCHAR() to CLOB can significantly reduce resource consumption by allowing more rows to be stored per data block and requiring less disk transfer.

CREATE TABLE CLOB_Table
(
ID INTEGER NOT NULL,
COl1 CHAR(01),
Desc CHARACTER LARGE OBJECT
) PRIMARY INDEX (ID);

SELECT ID FROM CLOB_Table WHERE COl1 = 'X'; -- fewer blocks have to be moved from disk to FSG cache

Selecting a CLOB column does not confer any performance advantage. Instead, both the base table and the sub-table containing the CLOB data must be read.

SELECT ID,Desc FROM CLOB_Table WHERE COl1 = 'X' ;

Related Services

⚡ Need Help Optimizing Your Data Platform?

We cut data platform costs by 30–60% without hardware changes. 25+ years of hands-on tuning experience.

Explore Our Services →

📋 Considering a Move From Teradata?

Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.

Free Migration Assessment →

3 thoughts on “Improve Query Performance by Using the Teradata CLOB Datatype for Storing Longer Texts”

I feel like I’m missing something. The text explaining they benefit of the CLOB is very clear and understandable. But none of your examples reference the CLOB table. And you talk about joining to the CLOB table but no example is provided, and was what I was looking for!!!

Roland Wenzlofsky

01/08/2018 at 6:52 pm

The CBLOB sub-table is used automatically by Teradata. It is maintained by the system and not visible to us. Whenever your SELECT column list contains the CLOB column, it is taken from the sub-table, otherwise, only the base table has to be accessed.

I hope this made it clearer
Reply

So, in this fairly specific scenario, you get ‘columnar’ by the back door. I like it! I useful little trick, will have to see how it works in practice but this is definitely worth considering. One possible issue that comes to mind is the handling of CLOBs. They are not straight forward, but if the tool you’re using to query the data is ok with them then this should be fine.
Even without that issue, this is a nice idea.

Improve Query Performance by Using the Teradata CLOB Datatype for Storing Longer Texts

The Teradata CLOB Datatype

📊 Data Platform Migration Survey

Stay Ahead in Data Warehousing

3 thoughts on “Improve Query Performance by Using the Teradata CLOB Datatype for Storing Longer Texts”

Leave a Comment Cancel reply