Teradata Identity Columns
What are Teradata Identity Columns?
The identity column feature generates a unique number for each row inserted into a table. Identity columns can be used for both transactional and bulk inserts.
By using Identity Columns, we can reduce costs because column uniqueness is ensured without the overhead of a unique constraint.
Furthermore, no error-prone generation of unique values at the application level is required.
If the Identity Column is used as the Primary Index of a table, it ensures an even distribution of rows across all AMPs.
When a table with an Identity Column is bulk-loaded for the first time, there is some overhead, because each vproc has to reserve a range of numbers from DBC.IdCol and add it to its local cache. This overhead occurs only at the start, as subsequent numbers are served from the cache.
Only when a vproc runs out of numbers does it reserve another range.
Because vprocs work in parallel and independently of each other, the numbers generated and inserted into a table do not reflect the chronological order of the inserts, and there can be gaps.
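The range reservation described above can be illustrated with a small toy model. This is only a sketch, not Teradata code: the class names IdColCounter and Vproc, the function simulate_load, the round-robin row distribution, and the tiny batch size are assumptions made for illustration.

```python
class IdColCounter:
    """Toy stand-in for the shared counter kept in DBC.IdCol (hypothetical model)."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.next_start = 1
        self.accesses = 0  # how often the shared counter was touched

    def reserve(self):
        """Hand out the next unused range of batch_size numbers."""
        self.accesses += 1
        start = self.next_start
        self.next_start += self.batch_size
        return iter(range(start, start + self.batch_size))


class Vproc:
    """Toy vproc that serves numbers from its local cache."""

    def __init__(self, counter):
        self.counter = counter
        self.cache = iter(())  # empty until the first reservation

    def next_id(self):
        try:
            return next(self.cache)
        except StopIteration:
            # Cache exhausted: reserve a new range from the shared counter.
            self.cache = self.counter.reserve()
            return next(self.cache)


def simulate_load(num_vprocs, batch_size, num_rows):
    """Distribute rows round-robin over the vprocs (an assumption of this sketch)."""
    counter = IdColCounter(batch_size)
    vprocs = [Vproc(counter) for _ in range(num_vprocs)]
    ids = [vprocs[row % num_vprocs].next_id() for row in range(num_rows)]
    return ids, counter.accesses


ids, accesses = simulate_load(num_vprocs=3, batch_size=5, num_rows=9)
print(ids)       # [1, 6, 11, 2, 7, 12, 3, 8, 13]
print(accesses)  # 3
```

In this model, every value is unique, but the values do not follow the insert order, and the numbers 4, 5, 9, and 10 remain unused when the load ends.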
Bulk insert performance can be improved by increasing the DBSControl setting IdColBatchSize, which reduces the number of times the DBC.IdCol table has to be accessed to get a new range of numbers.
However, a larger batch size increases the likelihood of larger gaps in the numbers after a system restart and between loads, when not all reserved numbers have been consumed.
The default batch size for number ranges is 100,000.
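The trade-off between fewer DBC.IdCol accesses and larger gaps can be made concrete with some simple arithmetic. The following is a simplified sketch: the function names and the figure of 250,000 rows per vproc are hypothetical, and the model assumes that numbers left in a vproc's cache at the end of a load are lost.

```python
import math


def reservations_needed(rows_per_vproc, batch_size):
    """Number of times one vproc has to reserve a range during a load."""
    return math.ceil(rows_per_vproc / batch_size)


def unused_numbers(rows_per_vproc, batch_size):
    """Numbers reserved by one vproc but not consumed by the load."""
    reserved = reservations_needed(rows_per_vproc, batch_size) * batch_size
    return reserved - rows_per_vproc


# Hypothetical load of 250,000 rows per vproc with three batch sizes.
for batch_size in (10_000, 100_000, 1_000_000):
    print(
        batch_size,
        reservations_needed(250_000, batch_size),
        unused_numbers(250_000, batch_size),
    )
# 10000 25 0
# 100000 3 50000
# 1000000 1 750000
```

Under these assumptions, the largest batch size needs only a single reservation per vproc, but it can leave a gap of up to 750,000 numbers per vproc.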
Useful Identity Column Options