Welcome to the second part of the Teradata performance optimization series. Today we will analyze one of the leading causes of performance issues on a Teradata Data Warehouse system (although most of this probably holds for any data warehouse).
Most likely, a good data model is the foundation for preventing many future performance issues early in a Data Warehouse project.
I dare to argue: from a technical point of view, most projects fail or end up in poor shape because clients with a limited budget save money at this point. Creating a good data model does not immediately deliver visible results, but it does immediately cause costs.
Good data modelers are rare. It is much cheaper to engage a developer, trained in quickly moving data into the database, than to pay a data modeler to paint “boxes and lines” …
This approach may be tempting because the first results are available soon and, additionally, developers are the cheapest people in the data warehouse job hierarchy. In a short time you can show results to your customer, who will be happy to assume the money is not wasted…
Some years ago, when the worldwide economic crisis hit daily rates, the data warehousing industry needed a marketing name for such a “reduced quality” approach.
Prototyping was the new buzzword.
Although prototyping is a good idea in principle, most prototypes end up as the final solution. Prototypes were initially meant to be a communication link with the customer and a base for further analysis and requirement specification, but they often became the final solution and a dead end at once.
I suppose the worst decision that can be made during the modeling process is to adopt the source definitions from the operational systems directly.
Integrating source systems 1:1 leads to operational data stores but has nothing to do with data warehousing. Combine this with doing without surrogate keys, and you are in trouble as soon as a source system is replaced by another one, which will happen sooner or later. At that point, a cost-intensive redesign will be waiting for you.
Project costs have just been shifted to a later date, and they are probably many times higher now than the initial costs an experienced data modeler would have caused.
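To make the surrogate key argument more concrete, here is a minimal sketch in Python. It is only an illustration: on a real Teradata system this would be a key-mapping table maintained by the load process, and all names used here (`SurrogateKeyMap`, `key_for`, `add_alias`, the `legacy_crm` and `new_crm` systems, and the sample identifiers) are hypothetical. The idea is that the warehouse owns its keys, so a replaced source system only touches the mapping and not the tables that reference the key.

```python
from itertools import count


class SurrogateKeyMap:
    """Maps (source_system, natural_key) pairs to warehouse-owned surrogate keys."""

    def __init__(self):
        self._next_key = count(1)  # simple ascending key generator
        self._keys = {}            # (source_system, natural_key) -> surrogate key

    def key_for(self, source_system, natural_key):
        """Return the surrogate key for a source identifier, creating one if needed."""
        source_id = (source_system, natural_key)
        if source_id not in self._keys:
            self._keys[source_id] = next(self._next_key)
        return self._keys[source_id]

    def add_alias(self, old_source_id, new_source_id):
        """Attach an identifier from a replacement system to an existing surrogate key."""
        self._keys[new_source_id] = self._keys[old_source_id]


# Hypothetical example: a customer arrives from the legacy CRM.
key_map = SurrogateKeyMap()
customer_key = key_map.key_for("legacy_crm", "C-1001")

# Fact rows reference only the surrogate key, never the source identifier.
sales_facts = [{"customer_key": customer_key, "amount": 250}]

# The CRM is replaced; the same customer now has a different natural key.
key_map.add_alias(("legacy_crm", "C-1001"), ("new_crm", "0009-A"))
new_key = key_map.key_for("new_crm", "0009-A")
```

In this sketch the fact rows stay untouched when the source system changes, because `new_key` resolves to the same surrogate key as before. Without such a layer, every table keyed on the old source identifier would have to be reworked.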
Unfortunately, this is the time we live in, and we have to come to terms with this situation somehow.
Personally, I decided to opt out of this undesirable development of “low quality” data warehousing. You can do as I did: don’t compete against the crowd, fighting for projects in an environment of decreasing daily rates, but wait for the wrecked projects to come to you.
The next part of this series will go into more detail and discuss the options we have for fixing a broken data model.