Optimizing Queries with LIKE Operator in Teradata

Roland Wenzlofsky

May 2, 2023

minutes reading time


Queries using the LIKE operator typically result in a full table scan.

When the LIKE operator matches from the left, the Teradata Optimizer can swiftly search the statistic histograms for demographic data metrics, including row count.

Regrettably, the Teradata Optimizer cannot utilize statistics for a LIKE match, specifically LIKE ‘%ABC%’. Accordingly, it will anticipate a weak selectivity and consistently opt for the full table scan as the least costly access path. The full table scan can be implemented on the base table, a join index, or a NUSI.

However, accessing indexes necessitates further qualifying against other columns.

The Teradata Optimizer may perform a full table scan on a smaller NUSI or a Join Index if there are many distinct values.

However, as it is not guaranteed, we have a technique to compel a comprehensive table scan of a compact table.

Assuming the table we are querying with the LIKE operator appears as follows:

CREATE SET TABLE Table1
(
PK INTEGER NOT NULL
colA VARCHAR(100),
colB VARCHAR(100),
colC VARCHAR(100),
...
colZ VARCHAR(100)
) UNIQUE PRIMARY INDEX (PK)

Here is a typical query:

SELECT * FROM Table1 WHERE colA like '%ABC%';

To prevent a complete table scan on the large table, we generate a secondary table that solely includes the essential columns necessary for the LIKE comparison (in this instance, “colA” is sufficient) along with the distinct primary key column “PK”:

CREATE SET TABLE Helper
(
PK INTEGER NOT NULL,
colA  VARCHAR(100)
) UNIQUE PRIMARY INDEX (PK);
INSERT INTO Helper SELECT PK, colA FROM Table1;
COLLECT STATISTICS ON Helper COLUMN(PK);

We have now converted the initial SQL query:

SELECT Table1.*
FROM
Table1 t01
INNER JOIN
Helper t02
ON
t01.PK = t02.PK
WHERE Helper.colA like '%ABC%';

Teradata will perform a complete table scan on the smaller “Helper” table and execute an AMP-local merge join based on row keys with the larger “Table1” table.

The benefit is only applicable if scanning the large table is more costly than scanning the small table and executing a consecutive AMP-local merge join.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>