Teradata Statistics MaxValueLength
What Is MaxValueLength Used For?
Sometimes it is necessary to collect statistics on character columns that are very wide, such as VARCHAR(1000).
Until Teradata Release 13.10, only the first 16 bytes of the column was used to create the statistics histograms, the rest were simply ignored and truncated.
For example, the value “012345678901234567890” was considered equivalent to the value “0123456789012345”, which resulted in both generating identical statistical information.
In the end, this led to incorrect statistics, and when problems with the statistics occurred, it was extremely difficult to find the cause in this fact.
How MaxValueLength Improves Teradata Statistics
To alleviate this problem, Teradata has introduced a new feature that allows the user to specify how many bytes or characters should be taken into account when creating statistics histograms: MaxValueLength
Specifies the maximum size for histogram values such as MinValue, ModeValue, MaxValue, etc.
For statistics on CHARACTER and VARCHAR columns, the number of characters is specified, for all other data types the number of bytes.
The number of bytes or characters considered can be increased with the following option when collecting statistics. For example, if you want 100 bytes to be included (assuming CustomerId is an INTEGER), you only need to execute the following command:
COLLECT STATISTICS USING MAXVALUELENGTH 100 COLUMN (CustomerId) ON Customer;