What are Surrogate Keys?
Surrogate keys are a concept used in data warehousing, to base the primary keys of a table on generated and unique artificial values (usually integer values) instead of using the columns from the source system to identify the rows in a table unambiguously (in database terminology this is called a natural key).
Why should you use Surrogate Keys?
There are several reasons why it is a good idea to replace the natural keys with Teradata surrogate keys.
At first, you may have more than one source system delivering the same value in the natural key for the same target table. This fact forces you to use surrogate keys because keeping the natural key from both sources would not allow you to distinguish the objects.
Likewise, you may have the opposite case:
You may have two source systems delivering the same object, but with different natural keys.
Imagine, for example, a company where potential clients are initially recorded in a database for possible customers, being moved to the production billing system, as soon as they signed a contract with this company.
Both operational systems may have their natural keys to identify the same client. Typically, in a data warehouse, you want to keep track of the complete customer life cycle. Therefore you need one key to identifying this client and don’t care about the operational system where this client was handled.
Unfortunately, many Teradata data warehouse implementations decide against surrogate keys and put the effort of integrating the information into the reports.
This approach will ultimately cause more costs than having a right surrogate key design from the begin (don’t forget, such an approach requires to repeat the logic of combining object information into each and every single report).
Even worse, if the natural key names are taken from the source system, and you undergo a replacement of the origin system. You never should tightly couple your data model to the system of origin structures. An invoice table should be called Invoice and not something like SAP-Invoice.
Initially it is always faster to take a pass on surrogate keys and integrate source system tables 1:1, but believe me, you just move the costs to a later stage in the project. Be prepared for a costly redesign.
Apart from making you independent from source system influences, surrogate keys, if properly used, may increase your performance as you can replace several natural key columns by an integer column, which can save space and decrease disk IOs.