fbpx

Teradata Sample Statistics

By Roland Wenzlofsky

December 16, 2019


Introduction to Teradata Sample Statistics

In this article, I will show you when, how, and why you can/should use Teradata Sample Statistics.

What are Teradata Sample Statistics?

Sample statistics take only a percentage of rows into account when producing estimates.

What are the Advantages of Sample Statistics?

Sample Statistics are used for performance tuning. Collecting full statistics requires that a full table scan be performed on the table. Then the rows must be sorted to count the number of different values.
Sample statistics can prevent this.

When is it worth using Sample Statistics?

If the tables are large and the resource consumption causes performance problems by collecting the statistics.
The rows should be evenly distributed across all AMPs.
Especially suitable for this are e.g., indexed columns. Below is a

When shouldn’t you consider Sample Statistics?

Full statistics should always be collected on small tables. Sample statistics are not a general substitute for full statistics! The figure shows a skewed table that is not suitable for sample statistics:teradata-sample-statistics-skew

How is the Size of the Sample determined?

With SAMPLE or SYSTEM SAMPLE, the sample size is determined by the optimizer. The first time full statistics are collected, then the optimizer switches to Sample Statistics and reduces the sample size until it detects that Skew is a problem.
If you want to determine the size of the sample yourself, use COLLECT STATISTICS USING SAMPLE n PERCENT.

Do I have to specify the desired Collection Method when the Statistics are collected again?

The simple COLLECT STATISTICS without specifying the method again is sufficient. Teradata remembers the last used method (full or sample) and the sample size for the sample statistics.

When using sample statistics, the columns must have as many different values as possible. For example, a UPI fulfills this requirement, but the NUPI can also be used if it has as many distinct values as possible.

Unusable for sample statistics are skewed columns

The Syntax for Sample Statistics

Use the sample size determined by Teradata:

COLLECT STATISTICS USING SYSTEM SAMPLE COLUMN (CustomerId) ON Customer;

Use a sample size of 5% of the rows of the table customer

COLLECT STATISTICS USING SYSTEM 5 PERCENT COLUMN (CustomerId) ON Customer;

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
Buy the Book Teradata Query Performance Tuning

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>