All the relational algebra operations, such as Select, Intersection, Product, Union, Difference, Project, Join, Division, Merge, etc., can also be performed on the Relational Database Model. Operations on the Relational Database Model are supported by conditional expressions, key attributes, pre-defined constraints, and so on. An example would be selecting the information of customers who have been living in a given city for more than 20 years.
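To make this concrete, such a selection could look roughly like the following SQL; this is only a sketch, and the Customer table with its City and Years_In_City columns is hypothetical:

```sql
-- Hypothetical example: relational selection of customers
-- who have been living in a given city for more than 20 years.
SELECT Customer_Id,
       Last_Name,
       City
FROM   Customer
WHERE  City = 'Vienna'
  AND  Years_In_City > 20;
```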
Constraints: Constraints are logical rules used to ensure data consistency and to prevent certain unacceptable operations on the data.
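For illustration only, here is a minimal sketch of such constraints in SQL, using hypothetical Account and Customer tables:

```sql
-- Hypothetical example: constraints guarding data consistency.
CREATE TABLE Account
(
    Account_Id  INTEGER       NOT NULL PRIMARY KEY,                       -- key constraint
    Customer_Id INTEGER       NOT NULL REFERENCES Customer (Customer_Id), -- referential integrity
    Balance     DECIMAL(15,2) NOT NULL,
    Currency    CHAR(3)       NOT NULL,
    CONSTRAINT chk_balance CHECK (Balance >= 0)                           -- business rule
);
```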
Relational Model: Summary
This data model is used by practically any business that sells products and services, including banks, insurance companies, credit card companies, and brokerages.
A comprehensive implementation plan is drawn up by the design architects, covering steps such as:
Mapping the source data columns to the destination data columns (see the sketch below)
Converting the logical model into a physical data model using modeling tools
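Such a column mapping typically ends up as a simple load statement. The following sketch assumes hypothetical staging (Stg_Customer) and target (Dim_Customer) tables; names and transformations are invented for illustration:

```sql
-- Hypothetical source-to-destination column mapping:
-- staging table Stg_Customer feeding the target table Dim_Customer.
INSERT INTO Dim_Customer
    (Source_System, Source_Customer_No, First_Name, Last_Name, City)
SELECT
    'CRM_A',                    -- constant: the delivering source system
    s.cust_no,                  -- source column -> Source_Customer_No
    TRIM(s.fname),              -- source column -> First_Name
    TRIM(s.lname),              -- source column -> Last_Name
    COALESCE(s.city, 'UNKNOWN') -- source column -> City
FROM Stg_Customer s;
```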
Cost-cutting unfortunately leads to the unpleasant situation that many clients save in the wrong places, starting Data Warehouse projects where they should be finishing them.
As a Teradata Performance Optimization Analyst, you will probably be confronted with scenarios where there is no maintained data model at all, business specifications are unclear, and mappings are unwieldy and incomplete. You can consider yourself lucky for every piece of information you can squeeze out of your customer’s working environment.
In my experience, most of the time you are not put in a position of real responsibility for performance. This task is unfortunately often deemed an exclusively technical one.
However, although many performance problems can be fixed with purely technical expertise, you will probably end up with a lot of workarounds that do not correct the root causes of your performance problems.
Even though such a purely technical approach may be applied for a very long time, adding up the costs at the end of the day will leave you in shock. Unfortunately, this head-in-the-sand policy is daily routine in many companies.
As a performance specialist, you have to be business analyst, data modeler, and developer at once. I think this is a very important insight: performance optimization often evolves into fixing a broken data model.
It is your task to make up for past failures in all these areas of expertise. Try to contact the people who were previously involved in, or responsible for, these roles at the client.
Sadly, you will often be confronted with uncooperative behaviour, as people tend to defend their own little areas of expertise. At the end of the day, you are questioning their past work results. You are on a very delicate mission, and strong management support will probably make your life easier.
Fixing a poorly designed data model is like time travel. You have to go back to the start of the project, get to know the original business requirements, and question how and why they were transformed into the existing data model. You have to develop a habit of questioning all past decisions.
Most of the time it was a lack of budget, the wrong assignment of people to roles, a lack of time, or simply missing experience that turned the project into a big mess and a failed state. Still, in my opinion, the most significant cause is over-specialization of project members. Over-specialization leads to an evaporation of responsibility: everybody shifts problems back and forth, and a lot of resources are wasted finding the responsible person before a problem is finally solved.
One approach I prefer, when I know that only a redesign of the data model can truly solve the performance issues, is to create a small prototype demonstrating the improvements.
I would take a manageable subject area and redesign the whole chain. Using such a prototype as a communication tool can make the difference between getting the chance to fix the performance problem and just getting an answer like “would be nice, but we don’t have the budget”. The more tangible your approach, the better.
As always, success has to be measurable. Often, reporting sits at the end of the chain, and making some reports perform better on top of your prototype is a good starting point for showing your expertise in Teradata performance optimization.
I hope the main message of this article is clear:
These days, being just a highly specialized developer is not enough. As a performance specialist, you have to understand the entire data warehouse life cycle.
Welcome to the second part of the Teradata performance optimization series. Today we will analyze one of the leading causes of performance issues on a Teradata Data Warehouse system (although most of this is probably valid for any Data Warehouse).
A good data model, put in place early in a Data Warehouse project, is most likely the foundation for preventing many future performance issues.
I dare to argue that, from a technical point of view, most projects fail or are in poor shape because clients with limited budgets save money at exactly this point: creating a good data model does not immediately deliver visible results, but it immediately causes costs.
Good data modelers are rare. It is much cheaper to engage a developer, trained in quickly moving data into the database, than to pay a data modeler to paint “boxes and lines” …
This approach may be tempting because first results are available quickly and, additionally, developers are the cheapest people in the data warehouse job hierarchy. Within a short time you can show results to your customer, who will be happy, assuming his money is not being wasted…
Some years ago, when the worldwide economic crisis hit daily rates, the data warehousing industry needed a marketing name for such a “reduced quality” approach.
Prototyping was the new buzzword.
Although a good idea in principle, most prototypes end up as the final solution. While prototypes were initially intended as a communication link with the customer and a basis for further analysis and requirements specification, they often end up being the final solution and a dead end at once.
I suppose the worst decision that can be made during the modeling process is to directly take over the source definitions from the operational systems.
Integrating source systems 1:1 leads to an operational data store but has nothing to do with data warehousing. Combine this with giving up surrogate keys, and you can be sure that sooner or later a source system will be replaced by another one. At that point, a cost-intensive redesign will be waiting for you.
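To make the surrogate key argument concrete, here is a minimal, hypothetical sketch (ANSI-style syntax; all names are invented): the warehouse generates its own key, and the source system’s natural key is kept only as an attribute, so replacing a source system means remapping natural keys rather than redesigning every dependent fact table.

```sql
-- Hypothetical dimension table using a warehouse-generated surrogate key.
-- Fact tables reference Customer_Key only; the natural key delivered by the
-- current source system is kept as an attribute and can be remapped if the
-- source system is ever replaced.
CREATE TABLE Dim_Customer
(
    Customer_Key       INTEGER GENERATED ALWAYS AS IDENTITY NOT NULL PRIMARY KEY,
    Source_System      VARCHAR(20) NOT NULL,  -- which operational system delivered the row
    Source_Customer_No VARCHAR(50) NOT NULL,  -- the natural key used by that source system
    First_Name         VARCHAR(100),
    Last_Name          VARCHAR(100),
    City               VARCHAR(100),
    UNIQUE (Source_System, Source_Customer_No)
);
```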
Project costs have merely been shifted to a later date, and they are probably many times higher now than the initial costs an experienced data modeler would have caused.
Unfortunately, these are the times we live in, and we somehow have to come to terms with this situation.
Personally, I decided to opt out of this undesirable trend towards “low quality” data warehousing. You can do it like I did: don’t compete against the crowd, fighting for projects in an environment of decreasing daily rates, but wait for the wrecked projects to come to you.
The next part of this series will go into more detail and discuss the possibilities we have for fixing a broken data model.