Longer texts, such as descriptions of other columns, are stored in VARCHAR() columns. The maximum character capacity for storage is 64000 for the LATIN character set and 32000 for the UNICODE character set. When considering the impact of longer texts on query performance, it is important to note that the number of rows per data block decreases with longer text inputs, resulting in more IO-intensive full table scans.
The Teradata CLOB Datatype
CLOBs are a superior option for storing long strings as they are stored in a distinct sub-table. By separating CLOBs from the main table, queries accessing the table with a full table scan can be significantly enhanced. This is because Teradata reads fewer data blocks when the CLOB column is not selected, resulting in fewer IOs.
Below is an example table with detailed descriptions in each row’s “Desc” column.
CREATE TABLE Varchar_Table ( ID INTEGER NOT NULL, COl1 CHAR(01), Desc VARCHAR(30000) ) PRIMARY INDEX (ID);
The absence of a secondary index requires copying all data blocks from the disk to the FSG cache, regardless of our interest in only the “ID” column content. Given that all rows contain extensive textual information in the “Desc” column, this results in a substantial amount of IOs.
SELECT ID FROM Varchar_Table WHERE COl1 = 'X' ;
Changing the data type to CLOB from VARCHAR() can significantly reduce resource consumption by allowing more rows to be stored per data block and requiring less disk transfer.
CREATE TABLE CLOB_Table ( ID INTEGER NOT NULL, COl1 CHAR(01), Desc CHARACTER LARGE OBJECT ) PRIMARY INDEX (ID);
SELECT ID FROM Varchar_Table WHERE COl1 = 'X'; -- fewer blocks have to be moved from disk to FSG cache
Selecting a CLOB column does not confer any performance advantage. Instead, the base table and the sub-table containing the CLOB data must be read.
SELECT ID,Desc FROM Varchar_Table WHERE COl1 = 'X' ;