Teradata Performance Optimization – Part 1 (Introduction)
Teradata Performance optimization can be a very time consuming task with sometimes unpredictable outcome. Before starting any optimization activities, it makes sense to get a clear view about the areas which could be subject to optimization.
Being the creative head in several performance optimization projects during the last decade, I was able to gather a lot of experience and knowledge. One point I always strongly emphasize is the importance of a clear picture about the expected results.
Not defining clear goals makes it difficult or even impossible to agree upon achievements with all participants.
In my professional career I was always confronted with statements like “this query is too slow”, “this report does not perform well” .
Believe me, this is no sound basis at all to set up a serious performance optimization strategy. It always will lead to misunderstandings and endless discussions. It is your task as a performance analyst to define feasible measures and convince your client to rely on them and stay with them as the project evolves.
But what kind of measures do we have? For your client, probably there exists only one practical measure: run time.
I have to admit that this is what really counts at the end of the day.
Still, my experience is, that run times are by far not enough. Keep in mind that there may be several projects on the same Teradata System and run times may fluctuate heavily depending on the workload. Even worse, considering the comprehensive workload management possibilities on a Teradata System, your workload could even be blocked on purpose i.e not running at all. Imagine the disaster if you improved, such as, a report and while you present your achievements to the customer, the reporting SQL gets blocked. I assume this would be quite embarrassing, right?
Even if it is much more difficult to sell to the client, I prefer to measure performance improvements based on absolute measures.
My favorites are CPU usage and disk accesses. Maybe bottleneck-related measures make more sense, utility slots . You have to find out what works best in each situation.
Although run times have highest priority at the end of the day, you will need absolute measures as an argument to continue your improvement efforts. Don't forget, in the Teradata arena there are always people which would like to see you fail. If you can't prove your improvements, you are an easy victim. Again: use absolute measures and let run times come afterwards.
Before starting any activities: Get the big picture!
Start working on the areas which you assume to bring the most improvement, if feasible from a cost point of view. Don't let yourself be steered by the client, based on exclamations like: “This query is slow, you should take a look at it”.
Although probably uttered with good intentions, such suggestions are not helpful. Get a clear picture on your own. Don't let yourself being mislead by assumptions others make.
Analyze each area which could be optimized, create a priority list, line out the related costs and resources needed. If no cost or resource restrictions apply, go for the highest priority change, otherwise decide accordingly.
Conclusion: Never ever start any optimization task without a clear picture about client's expectations and/or an agreed method how to measure improvements, otherwise you will end up in discussion or a never ending loop of improvement attempts.
In the second part of the performance optimization series, I will take a closer look about possible areas of improvement on a typical Teradata Data Warehouse.