Improve Query Performance by Using the Teradata CLOB Datatype for Storing Longer Texts

Roland Wenzlofsky

April 23, 2023

minutes reading time

Longer texts, such as descriptions of other columns, are stored in VARCHAR() columns. The maximum character capacity for storage is 64000 for the LATIN character set and 32000 for the UNICODE character set. When considering the impact of longer texts on query performance, it is important to note that the number of rows per data block decreases with longer text inputs, resulting in more IO-intensive full table scans.

The Teradata CLOB Datatype

CLOBs are a superior option for storing long strings as they are stored in a distinct sub-table. By separating CLOBs from the main table, queries accessing the table with a full table scan can be significantly enhanced. This is because Teradata reads fewer data blocks when the CLOB column is not selected, resulting in fewer IOs.

Below is an example table with detailed descriptions in each row’s “Desc” column.

CREATE TABLE Varchar_Table
(
ID INTEGER NOT NULL,
COl1 CHAR(01),
Desc VARCHAR(30000)
) PRIMARY INDEX (ID);

The absence of a secondary index requires copying all data blocks from the disk to the FSG cache, regardless of our interest in only the “ID” column content. Given that all rows contain extensive textual information in the “Desc” column, this results in a substantial amount of IOs.

SELECT ID FROM Varchar_Table WHERE COl1 = 'X' ;

Changing the data type to CLOB from VARCHAR() can significantly reduce resource consumption by allowing more rows to be stored per data block and requiring less disk transfer.

CREATE TABLE CLOB_Table
(
ID INTEGER NOT NULL,
COl1 CHAR(01),
Desc CHARACTER LARGE OBJECT
) PRIMARY INDEX (ID);

SELECT ID FROM Varchar_Table WHERE COl1 = 'X'; -- fewer blocks have to be moved from disk to FSG cache

Selecting a CLOB column does not confer any performance advantage. Instead, the base table and the sub-table containing the CLOB data must be read.

SELECT ID,Desc FROM Varchar_Table WHERE COl1 = 'X' ;

Take a brief survey and receive the book for free Get instant access

Share0

Tweet0

Share0

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

I feel like I’m missing something. The text explaining they benefit of the CLOB is very clear and understandable. But none of your examples reference the CLOB table. And you talk about joining to the CLOB table but no example is provided, and was what I was looking for!!!

Roland Wenzlofsky says:
at
The CBLOB sub-table is used automatically by Teradata. It is maintained by the system and not visible to us. Whenever your SELECT column list contains the CLOB column, it is taken from the sub-table, otherwise, only the base table has to be accessed.
I hope this made it clearer
Reply

So, in this fairly specific scenario, you get ‘columnar’ by the back door. I like it! I useful little trick, will have to see how it works in practice but this is definitely worth considering. One possible issue that comes to mind is the handling of CLOBs. They are not straight forward, but if the tool you’re using to query the data is ok with them then this should be fine.
Even without that issue, this is a nice idea.

Improve Query Performance by Using the Teradata CLOB Datatype for Storing Longer Texts

The Teradata CLOB Datatype

Roland Wenzlofsky

Fast multi-file export of Teradata query results using only Teradata SQL Assistant

The Teradata AMP Worker Task

Boost Your Teradata Performance – The Critical Role of NOT NULL Declarations

Optimizing Teradata SQL Queries by Avoiding Full Table Scans and Utilizing Secondary Indexes