This post aims to compile all crucial aspects to be considered while constructing a Teradata Data Warehouse, including the ETL process and SQL queries.
This list is just the beginning, and I anticipate receiving valuable feedback from my readers to expand it in the future. Initially, I have provided a few concepts, but I intend to include more categories and topics in the coming days and weeks. Furthermore, you can contribute your ideas at the end of this blog post, which I will continuously incorporate into the article, along with your name if desired.
The Teradata Physical Data Model
- Are all Primary Key columns defined as NOT NULL?
- Does the Primary Index distribute the rows equally across all AMPs?
- Are the Primary Index column values stable, or do they change frequently?
- Do columns with the same content have the same data type across all tables?
- Are SET tables only used for exceptional cases, otherwise always MULTISET tables?
- Are tables from which a range of values is often queried row-level partitioned?
- Do tables that are often joined together have the same primary index?
- Is the primary index of a table often used in a WHERE condition?
- Was Multi-Value compression performed on all tables?
- Are all row partitions maintained (to cover new values)?
- Is the correct CHARACTER SET used (LATIN vs. UNICODE)?
- Is the DATABLOCKSIZE appropriate for the workload (Tactical vs. Strategic)?
- Are tables created without the FALLBACK option? (Raluca Topor)
- Was a 1-AMP design chosen for tiny tables (1-AMP Hashmap)? (Peter Nagy)
- Was columnar considered for wide datamarts with narrow queries? (Peter Nagy)
- Was NOT NULL unique key applied on parent tables to enable join elimination? (alternative to soft RI) (Peter Nagy)
- Was primary AMP design considered for tables that are frequently reloaded and yet still joined? (Peter Nagy)
- Is adequate sample percentage used during collect stats for big tables? (Peter Nagy)
- Are statistics copied from table to table whenever it’s a replica instead of recollected? (Peter Nagy)
- Is a value-ordered NUSI considered for range query columns on tables where row partitioning is not an option? (Peter Nagy)
The Teradata Statistics
- Are full statistics collected on all small tables (fewer table rows than AMPs)?
- Are statistics collected on all non-indexed columns that are used in WHERE conditions?
- Are full statistics collected on all skewed columns?
- Do you use DELETE instead of DROP/CREATE to preserve statistics histories?
- Have you collected statistics on all join columns?
- Have you collected statistics on all Primary Indexes?
- Have you collected statistics on all NUSIs?
- Have you collected statistics on all partition columns (individually and in combination)?
- Are you using Create Table Table1 as Table2 with data and stats to keep the stats? (Atharva Joshi)
Teradata SQL
- Do you use UNION only if you need to remove duplicates, otherwise, UNION ALL?
- Do you only select the columns you really need?
- Do you use outer joins only when it is really necessary?
- If possible, do you use WHERE conditions to enable partition elimination?
- Do you use multi-statement requests where possible?
- Do you use CLOB instead of large VARCHAR for rarely requested columns?
- Are you careful not to use functions on join columns?
- Are you using a Fast Path INSERT/SELECT?
Please Send Me Your Own Tuning Ideas
Please add this tuning idea!
Your email address will not be utilized for marketing or stored.