Learn Essential Teradata SQL Statements And Functions For Data Warehousing And Analytics

Teradata is an RDBMS that provides high performance and scalability for data warehousing and analytics. Familiarity with its SQL dialect, including standard SQL and specific extensions, is crucial. This article outlines the essential SQL statements and functions for beginners and experienced users.

Teradata SQL Statements

1. CREATE TABLE

The CREATE TABLE statement creates a new table in Teradata. The syntax follows standard SQL and adds Teradata-specific options, such as the SET or MULTISET table kind and a PRIMARY INDEX clause that controls how rows are distributed across AMPs:

CREATE TABLE table_name (
column_name data_type,
…
);
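As a minimal sketch of those Teradata-specific options, the statement below creates a MULTISET table (duplicate rows allowed) with an explicit primary index. The table name orders and all of its columns are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table; names and data types are assumptions for illustration.
CREATE MULTISET TABLE orders (
    order_id    INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    order_date  DATE FORMAT 'YYYY-MM-DD',
    amount      DECIMAL(12,2),
    status      VARCHAR(20)
)
-- Rows are hashed on order_id to decide which AMP stores them.
PRIMARY INDEX (order_id);
```

A column with many distinct values usually makes a better primary index than a low-cardinality column, because it spreads rows more evenly.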

2. INSERT INTO

The INSERT INTO statement inserts new rows into a table. A VALUES clause inserts a single row; to insert multiple rows in one statement, use INSERT ... SELECT:

INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
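The two forms might look like this in practice. This is a sketch that assumes a hypothetical orders table and a hypothetical orders_staging table with the same columns.

```sql
-- Single row via VALUES (hypothetical table and values).
INSERT INTO orders (order_id, customer_id, order_date, amount, status)
VALUES (1001, 42, DATE '2024-01-15', 199.99, 'NEW');

-- Many rows at once via INSERT ... SELECT from a hypothetical staging table.
INSERT INTO orders (order_id, customer_id, order_date, amount, status)
SELECT order_id, customer_id, order_date, amount, status
FROM orders_staging
WHERE order_date >= DATE '2024-01-01';
```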

3. SELECT

The SELECT statement is used to retrieve data from one or more tables. You can specify the columns you want to retrieve, use aliases, and apply various conditions to filter the data:

SELECT column1, column2, …
FROM table_name
WHERE condition;
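A concrete query with column and table aliases could be sketched as follows, again assuming the hypothetical orders table.

```sql
-- Hypothetical table and columns; shows aliases and a compound filter.
SELECT o.order_id,
       o.order_date,
       o.amount AS order_amount
FROM orders AS o
WHERE o.status = 'SHIPPED'
  AND o.order_date >= DATE '2024-01-01'
ORDER BY o.order_date;
```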

4. UPDATE

The UPDATE statement is used to modify existing rows in a table. You can update one or more columns and use a WHERE clause to apply conditions:

UPDATE table_name
SET column1 = value1, column2 = value2, …
WHERE condition;
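For instance, the following sketch changes two columns of one row in a hypothetical orders table. Omitting the WHERE clause would update every row in the table.

```sql
-- Hypothetical table; updates two columns of a single order.
UPDATE orders
SET status = 'CANCELLED',
    amount = 0
WHERE order_id = 1001;
```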

5. DELETE

The DELETE statement is used to remove rows from a table. You can use a WHERE clause to apply conditions and delete specific rows:

DELETE FROM table_name
WHERE condition;
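A filtered delete might look like the sketch below, which assumes the hypothetical orders table. Without a WHERE clause, the statement removes all rows.

```sql
-- Hypothetical table; removes only old cancelled orders.
DELETE FROM orders
WHERE status = 'CANCELLED'
  AND order_date < DATE '2023-01-01';
```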

Teradata SQL Functions

Teradata SQL offers a variety of functions that allow you to perform calculations, manipulate strings, and work with dates and times. Here are some essential functions you should know:

1. Aggregate Functions

Aggregate functions perform calculations on a set of values, typically used with the GROUP BY clause. Some common aggregate functions include:

COUNT(): Counts rows. COUNT(*) counts every row in the group, while COUNT(column_name) counts only rows where that column is not NULL.
SUM(): Calculates the sum of values for a given column.
AVG(): Calculates the average of values for a given column.
MIN(): Finds the minimum value of a column.
MAX(): Finds the maximum value of a column.
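The aggregates above can be combined in a single grouped query. This sketch assumes a hypothetical orders table with customer_id, amount, and order_date columns.

```sql
-- Hypothetical table; one summary row per customer.
SELECT customer_id,
       COUNT(*)        AS order_count,
       SUM(amount)     AS total_amount,
       AVG(amount)     AS avg_amount,
       MIN(order_date) AS first_order,
       MAX(order_date) AS last_order
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 1;   -- keep only customers with more than one order
```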

2. String Functions

String functions help you manipulate and work with text data. Some important string functions include:

SUBSTRING(): Extracts a substring from a string.
TRIM(): Removes leading and/or trailing characters from a string.
UPPER(): Converts a string to uppercase.
LOWER(): Converts a string to lowercase.
POSITION(): Finds the position of a substring within a string.
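The string functions above use ANSI-style keyword syntax in Teradata. A small sketch on literal values, with the expected result shown in each comment:

```sql
-- Teradata allows SELECT without a FROM clause for constant expressions.
SELECT SUBSTRING('Teradata' FROM 1 FOR 4) AS first_four,  -- 'Tera'
       TRIM(BOTH ' ' FROM '  data  ')     AS trimmed,     -- 'data'
       UPPER('teradata')                  AS upper_name,  -- 'TERADATA'
       LOWER('TERADATA')                  AS lower_name,  -- 'teradata'
       POSITION('data' IN 'Teradata')     AS data_pos;    -- 5
```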

3. Date and Time Functions

Date and time functions allow you to work with date and time values, perform calculations, and format them as needed. Some key functions are:

CURRENT_DATE: Returns the current date.
CURRENT_TIME: Returns the current time.
CURRENT_TIMESTAMP: Returns the current timestamp.
EXTRACT(): Extracts a specific part of a date or time value, such as the year, month, or day.
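ADD_MONTHS(): Adds or subtracts a number of months to a date or timestamp. Teradata has no DATEADD() function; for other units, use interval arithmetic such as date_column + INTERVAL '7' DAY.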
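A short sketch of these functions follows. The date arithmetic here uses ADD_MONTHS() and INTERVAL literals, which Teradata supports; the cutoff date is an arbitrary value chosen for illustration.

```sql
SELECT CURRENT_DATE                     AS today,
       CURRENT_TIMESTAMP                AS now_ts,
       EXTRACT(YEAR FROM CURRENT_DATE)  AS this_year,
       ADD_MONTHS(CURRENT_DATE, 3)      AS three_months_ahead,
       CURRENT_DATE + INTERVAL '7' DAY  AS one_week_ahead,
       -- Subtracting two DATE values yields a number of days.
       CURRENT_DATE - DATE '2024-01-01' AS days_since_cutoff;
```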

4. Numeric Functions

Numeric functions help you perform various mathematical operations and calculations on numeric data. Some commonly used numeric functions include:

ABS(): Returns the absolute value of a number.
ROUND(): Rounds a number to a specified number of decimal places.
CEILING(): Returns the smallest integer value greater than or equal to a given number.
FLOOR(): Returns the largest integer value less than or equal to a given number.
POWER(): Raises a number to a specified power.
SQRT(): Returns the square root of a number.
MOD: Returns the remainder of a division operation. In Teradata, MOD is written as an operator (value1 MOD value2) rather than as a function call.
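The sketch below applies each numeric function to a literal. Comments give the mathematical result; the displayed format depends on the result data type and the client tool. Note that the remainder is computed with the MOD operator.

```sql
SELECT ABS(-7.5)         AS abs_val,    -- 7.5
       ROUND(3.14159, 2) AS rounded,    -- 3.14
       CEILING(4.2)      AS ceil_val,   -- 5
       FLOOR(4.8)        AS floor_val,  -- 4
       POWER(2, 10)      AS pow_val,    -- 1024
       SQRT(144)         AS sqrt_val,   -- 12
       17 MOD 5          AS remainder;  -- 2
```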

5. Conversion Functions

Conversion functions allow you to convert data from one data type to another, which can be useful when working with different data types or formatting data for output. Some important conversion functions are:

CAST(): Converts a value from one data type to another.
TO_CHAR(): Converts a date, time, or numeric value to a string.
TO_DATE(): Converts a string to a date value.
TO_NUMBER(): Converts a string to a numeric value.
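A sketch of the conversion functions on literal values follows. The format strings are illustrative; which format elements are accepted should be checked against the documentation for your release.

```sql
SELECT CAST('2024-03-15' AS DATE FORMAT 'YYYY-MM-DD') AS cast_date,
       CAST(123.456 AS DECIMAL(6,1))                  AS cast_decimal,
       TO_CHAR(DATE '2024-03-15', 'YYYY-MM-DD')       AS date_text,
       TO_DATE('15/03/2024', 'DD/MM/YYYY')            AS parsed_date,
       TO_NUMBER('1234.50')                           AS parsed_number;
```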

6. Window Functions

Window functions enable you to perform calculations across a set of rows related to the current row, without needing a GROUP BY clause. Some useful window functions include:

ROW_NUMBER(): Assigns a unique sequential number to each row within its partition.
RANK(): Assigns a rank to each row based on the specified order. Rows with equal values share the same rank, and the ranks that follow are skipped, leaving gaps.
DENSE_RANK(): Works like RANK(), but with no gaps in the ranking values after ties.
NTILE(): Divides the ordered rows into a specified number of groups and assigns each row its group number.
LEAD(): Accesses data from a subsequent row in the result set.
LAG(): Accesses data from a previous row in the result set.
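The sketch below puts several of these functions side by side on a hypothetical orders table and then uses Teradata's QUALIFY clause to filter on a window function result. Availability of some functions, LAG() and LEAD() in particular, depends on the Teradata release, so treat this as an illustration rather than a query guaranteed to run everywhere.

```sql
-- Hypothetical table; ranks each customer's orders by amount.
SELECT customer_id,
       order_id,
       amount,
       ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rn,
       RANK()       OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk,
       DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS dense_rnk,
       -- Amount of the same customer's previous order by date.
       LAG(amount)  OVER (PARTITION BY customer_id ORDER BY order_date)  AS prev_amount
FROM orders
QUALIFY rn <= 3;   -- keep each customer's three largest orders
```

QUALIFY plays the same role for window functions that HAVING plays for aggregates, which avoids wrapping the query in a derived table.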

7. Teradata-Specific Functions

Teradata provides functions and table operators that are specific to its platform or less common elsewhere. Some carry a “TD_” prefix and live in the TD_SYSFNLIB system database; others use standard names. Here are some worth knowing:

TD_SYSFNLIB.TD_UNPIVOT: A table operator that transforms columns into rows, effectively rotating your data. It is useful for normalizing wide tables or creating more manageable views of your data.
ST_Geometry methods: Teradata processes geospatial data through the ST_Geometry data type rather than a TD_-prefixed function. Methods such as ST_Distance() and ST_Within() calculate the distance between two geometries or determine whether a point lies within a specified area.
WIDTH_BUCKET(): Assigns a value to one of a specified number of equal-width intervals or “buckets” between a lower and an upper bound. It is written without a TD_ prefix and can help you create histograms or analyze the distribution of data values.
TD_NORMALIZE_OVERLAP(), TD_NORMALIZE_MEET(), and TD_NORMALIZE_OVERLAP_MEET(): Table functions that combine rows whose PERIOD values overlap or meet. “Normalize” here means coalescing time periods, not relational schema normalization.
COUNT(DISTINCT column_name): Returns the number of distinct values in a column, which helps you understand data distribution and choose indexes. The CARDINALITY() function is unrelated: it returns the number of elements in an ARRAY value.
FIRST_VALUE(): Retrieves the first value in an ordered set of rows. It operates as a window function, is written without a TD_ prefix, and can be used to compare the first value in a group to other values in the same group.
LAST_VALUE(): Similar to FIRST_VALUE(), but retrieves the last value in an ordered set of rows. It can be used to compare the last value in a group to other values in the same group.
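As a hedged sketch of the unpivot operation, the query below turns three monthly columns into rows. The table store_sales_wide and its columns jan_sales, feb_sales, and mar_sales are hypothetical, and the USING arguments follow the TD_UNPIVOT syntax as I understand it; verify them against the documentation for your release before relying on this.

```sql
-- Hypothetical wide table: store_id, jan_sales, feb_sales, mar_sales.
SELECT *
FROM TD_UNPIVOT(
    ON (SELECT * FROM store_sales_wide)
    USING
        VALUE_COLUMNS('monthly_sales')      -- output column holding the values
        UNPIVOT_COLUMN('sales_month')       -- output column naming the source
        COLUMN_LIST('jan_sales', 'feb_sales', 'mar_sales')
        COLUMN_ALIAS_LIST('jan', 'feb', 'mar')
) unpivoted;
```

Each input row should produce one output row per listed column, with the remaining columns (here store_id) carried through unchanged.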

By incorporating these Teradata-specific functions into your SQL queries, you can unlock even more powerful data manipulation and analysis capabilities on the Teradata platform. As always, practice is key to mastering these functions and maximizing your Teradata SQL skills.

8. Teradata Hashing Functions

Teradata hashes each row's primary index value to decide which AMP stores the row, so rows spread evenly across the system when the primary index values are well distributed. This mechanism is integral to the Teradata architecture and its parallelism. The following functions expose it:

HASHROW(): The HASHROW function calculates the row hash of a given input value or set of input values. It returns a 32-bit value that determines the row’s location within the system. The input values are typically the primary index columns of a table. Example:

SELECT HASHROW(column_name) FROM table_name;

HASHBUCKET(): The HASHBUCKET function returns the hash bucket number for a given row hash. This function is useful in understanding the data distribution across the system. It takes the row hash (from HASHROW) as input and returns the bucket number. Example:

SELECT HASHBUCKET(HASHROW(column_name)) FROM table_name;

HASHAMP(): The HASHAMP function returns the number of the AMP (Access Module Processor) that owns a given hash bucket. It helps in determining which AMP is responsible for storing and processing a specific row. Example:

SELECT HASHAMP(HASHBUCKET(HASHROW(column_name))) FROM table_name;

These Teradata hashing functions are critical in managing data distribution and ensuring the system operates efficiently. Understanding how these functions work can help you optimize your Teradata environment and improve query performance.
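A common way to apply the three functions together is to count rows per AMP for an existing or candidate primary index column. The sketch below assumes a hypothetical orders table and tests customer_id as the index candidate.

```sql
-- Hypothetical table and column; shows how rows would spread across AMPs.
SELECT HASHAMP(HASHBUCKET(HASHROW(customer_id))) AS amp_no,
       COUNT(*)                                  AS row_count
FROM orders
GROUP BY 1
ORDER BY 2 DESC;   -- most heavily loaded AMPs first
```

Roughly equal counts suggest an even distribution, while a few AMPs with much higher counts point to skew on that column.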
