Teradata Spool Space Introduction
What is Teradata Spool Space?
Teradata Spool Space is a workspace that Teradata uses to store temporary and final results sets. Spool Space is only used temporarily and released when it is no longer needed or automatically when the session ends. There are four types of Spool Space, which we will explain at the end of this article.
How can we increase the spool space in Teradata?
It is always essential to have a spool reserve. Options to make more Teradata Spool Space available to the system are compressing common values, removing fallback protection, and removing indexes. Finally, there is always the possibility of a system upgrade.
What should I do if my system no longer has enough space?
Here the simple rule applies: it is better to be out of spool than out of permanent space because if there is not enough permanent space available, it will typically be a DML statement, which leads to a rollback. SELECT statements mainly use Spoolspace, and rollbacks are relatively rare.
How is Teradata Spool Space assigned to a User?
The spool space can be assigned when we create a user:
CREATE USER DWHPRO_ADMIN FROM DWHPRO AS
PERMANENT = 800000000 BYTES,
SPOOL = 200000000 BYTES,
PASSWORD = ******;
The assigned spool space represents the upper limit for this user. The spool space is distributed evenly among all AMPs.
I get the Error Message “No more spool space in user…” although there should be enough space available. Why?
Since Teradata divides the total spool space by the number of AMPs, each AMP has only a fraction of the space available.
If a query is skewed, it can happen that an AMP does not have enough space available, and Teradata aborts the request.
As a general rule, the more AMPs a system has, the greater the risk.
How is Spool Space managed internally by Teradata?
It is essential to understand that Teradata never simultaneously holds data and spool space on one cylinder. Therefore, cylinders are released immediately as soon as Teradata no longer needs them for spool space.
How can I find out how much Spool Space my User has?
SELECT * from DBC.DATABASES
Why is there even a Teradata Spool Space Limit?
This limit is needed to cancel requests with too high a consumption of resources. Even if high spool space usage does not always mean a bad query, the correlation between bad queries and high spool space usage is high.
What “no more spool space” Scenarios are there?
One reason might be that there are no more free cylinders on the Teradata system. We must prevent the situation where all cylinders are full at all costs. Usually, the DBA creates a separate database on each Teradata system to reserve or block space to avoid the system running out of free cylinders.
“No more spool space” can also be caused by too many parallel sessions of the same user.
Finally, “no more spool space” can also be triggered by skewed queries or volatile tables.
What must we consider for NOPI Teradata tables?
NOPI tables are ideal for improving loading times with bulk load utilities. But the transactional INSERT INTO statement can create skewed tables as rows AMP locally, which can cause a no more spool space problem. We can avoid skew by redistribution of the rows with HASH BY RANDOM. The second Explain Plan in the figure below distributes the rows equally to all AMPs (“redistributed randomly to all AMPs”) in Step 1:
Why do I have a “no more spool space” problem after a Teradata upgrade, although the query worked before?
Such an issue can happen if the number of AMPs is higher than before after the upgrade, but the disk space is not scaled proportionally. Then there is less disk space available for each AMP. The following query can check the number of AMP:
SELECT MAX(VPROC) FROM DBC.TableSizeV;
What are Phantom Spool and Leftover Spool?
For Phantom Spool, the table DBC.DatabaseSpace shows that the spool is in use, but Teradata occupies no space on the disk.
The leftover spool is the Teradata spool space that remains after a request has ended.
How can Phantom Spool Space be released?
The easiest way (without using the console) is to execute the following procedure:
call SYSLIB.FixCurrentSpace(‘S’, ‘Y’, ‘N’, null, cnt, err);
We can use this procedure to solve phantom spool problems and correct inconsistencies in the DBC.DATABASESPACE table.
4 Types of Teradata Spool Space
It will be released as soon as it is no longer needed. Intermediate Spool Space is required by derived subqueries, for example. The execution plan provides information about when spool space is released.
The Output Spool Space holds the result of an SQL request.
Teradata uses volatile spool space for volatile tables.
Persistent Spool Space survives a system restart or the crash of a node. We can define the critical workload per session or the whole system.
Solve The Teradata Spool Space Problem – Failure 2646 No more spool space
As already mentioned, “no more spool space” can be triggered by different scenarios. Here are some ideas on how to avoid this problem:
Solve The Teradata Spool Space Problem – Failure 2646 No more spool space
Ensure up-to-date statistics
Current statistics are essential so that the optimizer does not copy large amounts of data between the AMPs.
Choose a good Primary Index.
Choose a primary index for your tables that distributes the rows evenly across all AMPs.
Prevent “no more spool space” on the system level
Prevent the system from getting into a situation where spool space is no longer available (e.g., through a spool reserve database).
Release Spool Space as soon as possible
Drop Volatile Tables immediately when they are no longer needed.
Often you rely on this to happen automatically at the end of the session.
Not dropping the volatile tables immediately unnecessarily blocks spool space (and cylinders that Teradata could use as permanent space).
Use Multivalue Compression
Multivalue compression allows the optimizer to keep data blocks compressed in the main memory. Therefore it is a suitable tool to reduce the required spool space. It makes sense, especially for large tables with few different values or wide columns.
Use the appropriate data types
Data types should be chosen to cover all domain values but do not require unnecessary space. If a BYTEINT is sufficient, then, e.g., we should define no INTEGER column because this needs four times as much space in the spool space (if the column was not compressed). Instead of a CHAR column, a VARCHAR column should be used to save spool space (again, this is relevant if the column was not compressed).
Primary Index Access instead of Full Table Scans
The workload should be designed so that primary index accesses dominate. Above all, UPI or USI access ensures that no spool space is required. Often the PDM is the best tool to prevent spool space problems. Row Partitioning helps to reduce full table scans to specific partitions.
Use the DATABLOCKSIZE parameter to create smaller data blocks
If our workload is primarily tactical, we can use smaller data blocks to prevent the spool needed.
Use columnar table design
If the access pattern is suitable columnar tables can reduce the needed spool space enormously because only needed columns have to be spooled. However, columnar tables must be analyzed precisely and adapted to the workload. We can also achieve a similar result by reducing the number of columns in a table (vertical partitioning).
If you run out of permanent space, we describe here which new features in Teradata help to avoid acute space problems (e.g., caused by skew):
Here is a link to the Teradata Spool space documentation: