fbpx

Teradata TO_CHAR

By sp3006

November 7, 2019


The Teradata TO_CHAR Function

The Teradata TO_CHAR function can be used to:

  • Convert numeric expressions to character strings
  • Convert dates, times, interval data types, or timestamps to character strings

Here is an example of a numeric expression argument. It can be any of these data types: BYTEINT, SMALLINT, INTEGER, BIGINT, DECIMAL/NUMERIC, FLOAT/REAL/DOUBLE PRECISION, or NUMBER.

The second argument is a format string (More about format strings you can find in the Teradata documentation). It has to be CHAR or VARCHAR.

SELECT TO_CHAR(123456789,'999999999');

Hint: If the query returns only ‘#’ characters, this means that the conversion failed:

SELECT TO_CHAR(123456789,'99');
--> ###

Here is an example of a date expression argument. The first argument has to be one of the following data types: DATE, TIME, TIME WITH TIME ZONE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, or an Interval data type. The second argument has to be CHAR or VARCHAR:

SELECT TO_CHAR(DATE'2019-11-09', 'DD-MON-YYYY' );
--> '09-NOV-2019'

Teradata TO_CHAR and Statistic Estimations

Of course, we are all excited when Teradata provides new features with every new release that makes our lives easier. However, it is essential not to forget that functions always risk that the optimizer has no statistics.

Here is an example that shows precisely such a case. First, we will apply TO_CHAR on a date and take a look at the execution plan:

SELECT Col_1, Col_2 FROM 
DBNAME.TableNane 
WHERE TO_CHAR(Date_Col, 'mm-dd-yyyy') ='12-31-9999'
ORDER by 1;

COLLECT STATS COLUMN TO_CHAR(ENT_END_TS, 'mm-dd-yyyy') AS COL_DATE_UDF 
ON DBNAME.Tablename;

The below excerpt from the execution plan illustrates that in step 2, the optimizer has “No Confidence” despite having the statistics collected on the column used in the WHERE condition. The estimated number of rows is 1, which is wrong (the correct result is two rows):

Step 2:

Next we do an all-amp operations RETRIEVE step from dbname.tablename in view viewed.view_nm by way of an all row scan with a condition of "(TD_SYSFNLIB.To_Char(CAST((date_col in view viewing.viewnam.ENT_END_TD) as TIMESTAMP(6) WITH TIME ZONE) AT TIME ZONE INTERVAL '0:00' HOUR TO MINUTE, 'mm-dd-yyyy')) ='12-31-9999'") into Spool 1 (group_amps). which is built locally on the amps. Then we do a SORT to order Spool 1 by the sort key in spool field1. The Size of spool 1 is estimated with no confidence to be 1 row.

Let’s rewrite the query, and use a simple CAST:

SELECT Col_1, Col_2 FROM
DBNAME.TableNane WHERE CAST(CAST(ENT_END_TD as DATE FORMAT 'MM-DD-YYYY') as CHAR(10)) ='12-31-9999' ORDER by 1;

COLLECT STATS COLUMN CAST(CAST(ENT_END_TD as DATE FORMAT 'MM-DD-YYYY') as CHAR(10)) as COL_DATE_UDF on DBNAME.Tablename;
Next we do an all-amp operations RETRIEVE step from dbname.tablename in view viewed.view_nm by way of an all row scan with a condition of "(TD_SYSFNLIB.To_Char(CAST((date_col in view viewing.viewnam.ENT_END_TD) as TIMESTAMP(6) WITH TIME ZONE) AT TIME ZONE INTERVAL '0:00' HOUR TO MINUTE, 'mm-dd-yyyy')) ='12-31-9999'") into Spool 1 (group_amps). which is built locally on the amps. Then we do a SORT to order Spool 1 by the sort key in spool field1. The Size of spool 1 is estimated with High Confidence to be 2 rows.

As shown above, the optimizer can rate this step as “high confidence” because it has the statistics at its disposal. The TO_CHAR function from our first query has made it impossible to use statistics.

Conclusion

User-Defined Functions (UDF), such as the Teradata TO_CHAR function, should only be used if we cannot achieve the same result with “simple” means. Functions usually mean (with some exceptions like UPPERCASE) that existing statistics cannot be used. Therefore the optimizer has to rely on heuristics, which are typically the cause for bad execution plans.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
Buy the Book Teradata Query Performance Tuning

sp3006

  • Avatar
    Dieter Nöth says:

    This is a fine example of why end-users should be trained to write good SQL 🙂

    Instead of an explicit CAST/TO_CHAR one could simply use
    WHERE ENT_END_TD = DATE ’9999-12-31′

    This automatically casts the Timestamp to a Date.

    And changing the WHERE into a sargable condition would be even better:

    WHERE ENT_END_TD BETWEEN CAST(DATE ‘9999-12-31’ AS TIMESTAMP)
    AND TIMESTAMP ‘9999-12-31 23:59:59.999999’

    If it wasn’t for 9999-12-31 I would prefer:

    WHERE ENT_END_TD >= CAST(DATE ‘2016-01-08’ AS TIMESTAMP)
    AND ENT_END_TD < CAST(DATE '2016-01-08' +1 AS TIMESTAMP)

    • Thank you, Dieter, agree with you, thank you again.

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >