Tuning the Teradata LIKE Operator
Usually, queries which are using the LIKE operator are causing a full table scan.
If the LIKE operator matches from the left, the Teradata Optimizer can quickly search in the statistic histograms to get demographic data metrics such as the number of rows, etc.
Unfortunately, the Teradata Optimizer has no way to use the statistics for a LIKE match such as LIKE ‘%ABC%’. In this case, it will expect a pessimistic selectivity and always choose the full table scan as the least expensive access path. The full table scan can be done on the base table, a join index, or a NUSI.
Still, in the absence of additional qualifiers against other columns, no indexed access is possible.
As mentioned above, the Teradata Optimizer may choose to do the full table scan on a smaller NUSI (if the number of distinct values is high) or a Join Index.
But as this is not ensured, we can use a trick to enforce the full table scan on a smaller table.
Let’s assume that the big table we are accessing with the LIKE operator looks like this:
Below is a typical query against Table1:
SELECT * FROM Table1 WHERE colA like ‘%ABC%’;
To avoid the full table scan of the big table, we create a second table, only containing the minimum set of columns required for the LIKE match (in our example we only need “colA”) plus the unique primary key column “PK”:
CREATE SET TABLE Helper
PK INTEGER NOT NULL,
) UNIQUE PRIMARY INDEX (PK);
INSERT INTO Helper SELECT PK, colA FROM Table1;
COLLECT STATISTICS ON Helper COLUMN(PK);
Finally, we transform the original SQL statement:
t01.PK = t02.PK
WHERE Helper.colA like ‘%ABC%’;
Teradata will now do a full table scan on the smaller table “Helper”, and an AMP-local rowkey based merge join with the big table “Table1”.
Of course, the benefit is only given if scanning the big table is much more expensive than scanning the small table and doing a consecutive AMP-local merge join.