Teradata strtok_split_to_table, CSVLD, Unpack: Column Splitting in Teradata

Splitting column content can be a challenging task. Teradata provides multiple methods to achieve this, each with unique pros and cons. In this article, we will examine these techniques in detail. To begin, we will generate a table containing sufficient random data to assess every option’s efficiency and resource utilization.

CREATE TABLE DWHPRO.Split
(
    PK INTEGER,
    ToBeSplit   VARCHAR(200)
) PRIMARY INDEX (PK);
INSERT INTO DWHPRO.Split
SELECT 
	ROW_NUMBER() OVER (ORDER  BY 1) AS PK,
	'Row' || TRIM(PK) || 'Col1,' || 'Row' || TRIM(PK) || 'Col2,' || 'Row' || TRIM(PK) || 'Col3'
FROM 
(
	SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
	UNION ALL SELECT * FROM SYS_CALENDAR.CALENDAR
) t01;

The Teradata “CSVLD” Function

This function requires all defined columns to be present. An error message will be shown if any columns cannot be created.


SET QUERY_BAND = 'Method=CSVLDFunc;' FOR SESSION;
SELECT * FROM TABLE (CSVLD(Split.ToBeSplit, ',', '')
RETURNS (col1 varchar(200), col2 varchar(200), col3 varchar(200))) as T1;

The Teradata “Unpack” Function

In contrast to the previously discussed function, the Unpack function also allows you to omit rows that cannot be split into all defined columns. The parameter “IgnoreInvalid” is used for this purpose:


SET QUERY_BAND = 'Method=UnpackFunc;' FOR SESSION;
SELECT 
	PK,
	CAST(Col1 AS VARCHAR(255)),
	CAST(Col2 AS VARCHAR(255)),
	CAST(Col3 AS VARCHAR(255))
FROM Unpack (
  ON Split
  USING
  TargetColumn ('ToBeSplit')
  OutputColumns ('Col1', 'Col2', 'Col3')
  OutputDataTypes ('varchar', 'varchar', 'varchar')
  Delimiter (',') 
  Regex ('(.*)')
  RegexSet (1)
  IgnoreInvalid ('true')
) AS t01 
;

The Teradata “strtok_split_to_table” Function

This function splits data into columns and assigns each column a number for identification. Additionally, it creates a new row for each column.

SET QUERY_BAND = 'Method=strtok_split_to_tab;' FOR SESSION;
WITH T1 as
( 
 SELECT d.* FROM TABLE (strtok_split_to_table( Split.PK, Split.ToBeSplit,  ',')
RETURNS (PK integer, tokennum integer, Col varchar(100)) ) as d
)
SELECT T1.* FROM t1
;
SELECT QUERYBAND,Spoolusage,TotalIOCount,AMPCPUTIme 
FROM DBC.DBQLOGTBL
WHERE QUERYBAND LIKE '%Method%'

Performance Considerations

Only the Teradata functions “CSVLD” and “Unpack” produce one output row for each input row, making them directly comparable. In contrast, “strtok_split_to_table” multiplies the number of rows and requires additional spool space.

Based on our empirical measurements (each query was executed three times), the Unpack method necessitates approximately 43% less spool space and 59% fewer I/Os.

QuerybandSpoolUsageTotalIOCountAMPCPUTime
=S> Method=CSVLDFunc;110 559 232332,001,89
=S> Method=CSVLDFunc;110 559 232332,001,89
=S> Method=CSVLDFunc;110 559 232332,001,90
=S> Method=strtok_split_to_tab;245 719 040608,004,71
=S> Method=strtok_split_to_tab;245 719 040608,004,72
=S> Method=strtok_split_to_tab;245 719 040608,004,98
=S> Method=UnpackFunc;63 406 080136,001,90
=S> Method=UnpackFunc;63 406 080136,001,79
=S> Method=UnpackFunc;63 406 080136,001,87

Discover this fascinating article on Stack Overflow.

https://stackoverflow.com/questions/68931466/teradata-splitting-field-into-several-rows-using-strtok-split-to-table-for-b

Related Services

⚡ Need Help Optimizing Your Data Platform?

We cut data platform costs by 30–60% without hardware changes. 25+ years of hands-on tuning experience.

Explore Our Services →

📋 Considering a Move From Teradata?

Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.

Free Migration Assessment →

📊 Data Platform Migration Survey

Help us map where the industry is heading. Results are public — see what others chose.

1. What is your current data platform?

2. Where are you migrating to (or evaluating)?

Migrating FROM
Migrating TO

Thanks for voting! Share this with your network.

Follow me on LinkedIn for daily insights on data warehousing and platform migrations.

Stay Ahead in Data Warehousing

Get expert insights on Teradata, Snowflake, BigQuery, Databricks, Microsoft Fabric, and modern data architecture — delivered to your inbox.

Leave a Comment

DWHPro

Expert network for enterprise data platforms. Senior consultants, project teams built for your challenge — across Teradata, Snowflake, Databricks, and more.

📍Vienna, Austria & Jacksonville, Florida

Quick Links
Services Team Teradata Book Blog Contact Us
Connect
LinkedIn → [email protected]
Newsletter

Join 4,000+ data professionals.
Weekly insights on Teradata, Snowflake & data architecture.