Forum

Teradata TO_CHAR  

  RSS

sp3006
(@sp3006)
New Member
Joined: 5 years ago
Posts: 2
08/01/2016 4:08 pm  

The Teradata TO_CHAR Function

The Teradata TO_CHAR function can be used to:

  • Convert numeric expressions to character strings
  • Convert dates, times, interval data types, or timestamps to character strings

Here is an example of a numeric expression argument. It can be any of these data types: BYTEINT, SMALLINT, INTEGER, BIGINT, DECIMAL/NUMERIC, FLOAT/REAL/DOUBLE PRECISION, or NUMBER

The second argument is a format string (More about format strings you can find in the Teradata documentation). It has to be CHAR or VARCHAR.

SELECT TO_CHAR(123456789,'999999999');

Hint: If the query returns only '#' characters, this means that the conversion failed:

SELECT TO_CHAR(123456789,'99');
--> ###

Here is an example of a date expression argument. The first argument has to be one of the following data types: DATE, TIME, TIME WITH TIME ZONE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, or an Interval data type. The second argument has to be CHAR or VARCHAR:

SELECT TO_CHAR(DATE'2019-11-09', 'DD-MON-YYYY' );
--> '09-NOV-2019'

Teradata TO_CHAR and Statistic Estimations

Of course, we are all excited when Teradata provides new features with every new release that makes our lives easier. However, it is important not to forget that the use of functions always has the risk that the optimizer has no statistics available.

Here is an example that shows exactly such a case. First, we will apply TO_CHAR on a date and take a look at the execution plan:

SELECT Col_1, Col_2 FROM 
DBNAME.TableNane 
WHERE TO_CHAR(Date_Col, 'mm-dd-yyyy') ='12-31-9999'
ORDER by 1;

COLLECT STATS COLUMN TO_CHAR(ENT_END_TS, 'mm-dd-yyyy') AS COL_DATE_UDF 
ON DBNAME.Tablename;

The below excerpt from the execution plan illustrates that in step 2 the optimizer has "No Confidence" despite having the statistics collected on the column used in the WHERE condition. The estimated number of rows is 1, which is wrong (the correct result is 2 rows):

Step 2:

Next we do an all-amp operations RETRIEVE step from dbname.tablename in view viewed.view_nm by way of an all row scan with a condition of "(TD_SYSFNLIB.To_Char(CAST((date_col in view viewing.viewnam.ENT_END_TD) as TIMESTAMP(6) WITH TIME ZONE) AT TIME ZONE INTERVAL '0:00' HOUR TO MINUTE, 'mm-dd-yyyy')) ='12-31-9999'") into Spool 1 (group_amps). which is built locally on the amps. Then we do a SORT to order Spool 1 by the sort key in spool field1. The Size of spool 1 is estimated with no confidence to be 1 row.

Let's rewrite the query, and use a simple CAST:

SELECT Col_1, Col_2 FROM
DBNAME.TableNane WHERE CAST(CAST(ENT_END_TD as DATE FORMAT 'MM-DD-YYYY') as CHAR(10)) ='12-31-9999' ORDER by 1;

COLLECT STATS COLUMN CAST(CAST(ENT_END_TD as DATE FORMAT 'MM-DD-YYYY') as CHAR(10)) as COL_DATE_UDF on DBNAME.Tablename;

Next we do an all-amp operations RETRIEVE step from dbname.tablename in view viewed.view_nm by way of an all row scan with a condition of "(TD_SYSFNLIB.To_Char(CAST((date_col in view viewing.viewnam.ENT_END_TD) as TIMESTAMP(6) WITH TIME ZONE) AT TIME ZONE INTERVAL '0:00' HOUR TO MINUTE, 'mm-dd-yyyy')) ='12-31-9999'") into Spool 1 (group_amps). which is built locally on the amps. Then we do a SORT to order Spool 1 by the sort key in spool field1. The Size of spool 1 is estimated with High Confidence to be 2 rows.

As we can see, the optimizer can now rate this step as "high confidence" because it has the statistics at its disposal. The TO_CHAR function from our first query has made it impossible to use statistics.

Conclusion

User-Defined Functions (UDF), such as the Teradata TO_CHAR function, should only be used if we cannot achieve the same result with "simple" means. Functions usually mean (with some exceptions like UPPERCASE) that existing statistics cannot be used!

This topic was modified 3 weeks ago by DWH Pro Admin
This topic was modified 2 weeks ago 29 times by DWH Pro Admin
This topic was modified 5 days ago by DWH Pro Admin

Quote
Dieter Nöth
 Dieter Nöth
(@Dieter Nöth)
Guest
Joined: 4 years ago
Posts: 2
08/01/2016 10:39 pm  

This is a fine example why end users should be trained to write good SQL 🙂

Instead of an explicit CAST/TO_CHAR one could simply use
WHERE ENT_END_TD = DATE ’9999-12-31'

This automatically casts the Timestamp to a Date.

And changing the WHERE into a sargable condition would be even better:

WHERE ENT_END_TD BETWEEN CAST(DATE '9999-12-31' AS TIMESTAMP)
AND TIMESTAMP '9999-12-31 23:59:59.999999'

If it wasn't for 9999-12-31 I would prefer:

WHERE ENT_END_TD >= CAST(DATE '2016-01-08' AS TIMESTAMP)
AND ENT_END_TD < CAST(DATE '2016-01-08' +1 AS TIMESTAMP)

This post was modified 6 days ago by DWH Pro Admin

ReplyQuote
sp3006
(@sp3006)
New Member
Joined: 5 years ago
Posts: 2
08/01/2016 10:56 pm  

Thank you Dieter, agree with you, thank you again.


ReplyQuote
DWH Pro Admin
(@oshun123)
Member Admin
Joined: 5 years ago
Posts: 81
09/01/2016 9:58 am  

Thank you for this great post!

Roland


ReplyQuote
Share:
>

Please Login or Register