Zero-Copy Clones: A Space- and Time-Saving Solution for Database Management with Snowflake

What Are Zero-Copy Clones?

Most of us are used to creating a backup copy of a table before running a risky operation.

For small tables this is not a problem: the copy is created quickly, and any mistake can be fixed by reverting to the backup table.

For a very large table, however, making the copy can take a long time, and the required storage doubles. Around the go-live of a new application in particular, this usually means a considerable cost in time and space.

Unfortunately, this approach is unavoidable in Teradata.

Other database systems handle this more efficiently. Snowflake, for example, avoids both problems.

What is the basis for zero-copy clones in Snowflake?

Snowflake exploits the immutability of files in Amazon S3 (or the equivalent object storage on other cloud platforms). Data is stored in micro-partitions, which are S3 files. Since these files cannot be changed in place, every DML statement that changes data has to write a new file, which replaces the old one.

Snowflake does not delete the original file immediately but keeps it for some time (the Time Travel retention period, whose maximum depends on which edition of the database is used).
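The behavior described above can be sketched with a small toy model. This is only an illustration of the idea, not Snowflake's implementation: the names MicroPartition, Table, active, and retained are hypothetical, and the retention list simply stands in for files that are kept for a while after being replaced.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical toy model, not Snowflake internals.
_file_ids = count(1)  # every newly written file gets a fresh id


@dataclass(frozen=True)
class MicroPartition:
    """An immutable file: once written, its rows never change."""
    file_id: int
    rows: tuple


def new_partition(rows):
    return MicroPartition(next(_file_ids), tuple(rows))


@dataclass
class Table:
    active: list = field(default_factory=list)    # files the table points to now
    retained: list = field(default_factory=list)  # replaced files, kept for a while

    def insert(self, rows):
        self.active.append(new_partition(rows))

    def update(self, predicate, change):
        """Rewrite every file that contains a matching row as a new file."""
        for i, part in enumerate(self.active):
            if any(predicate(r) for r in part.rows):
                rewritten = new_partition(
                    change(r) if predicate(r) else r for r in part.rows
                )
                self.retained.append(part)  # the old file is kept, not deleted
                self.active[i] = rewritten


customer = Table()
customer.insert([("alice", 10), ("bob", 20)])  # written as file 1
customer.insert([("carol", 30)])               # written as file 2
original_first = customer.active[0]

# Changing one row replaces the whole file that contains it (new file 3).
customer.update(lambda r: r[0] == "bob", lambda r: (r[0], 25))
```

In this sketch the update touches a single row, yet the whole file holding that row is rewritten, while the untouched file and the retained old version stay exactly as they were.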

How do zero-copy clones work in Snowflake?

Snowflake takes advantage of the fact that a data change always replaces entire S3 files.

When a zero-copy clone of a table is created, the clone initially uses no additional storage because it shares the original table’s existing micro-partitions. Only pointers are set for the cloned table, pointing to the existing table’s micro-partitions.

Data in the clone can then be inserted, deleted, or updated independently of the original table. Each change to the clone creates a new micro-partition owned by the clone.

Likewise, later changes to the original table are not reflected in the clone.
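To make the pointer idea more concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not how Snowflake is implemented: the Table class, its clone method, and the files_in_use helper are hypothetical names, and a partition is represented simply as a (file_id, rows) pair.

```python
from itertools import count

# Hypothetical toy model, not Snowflake internals.
_file_ids = count(1)  # every newly written file gets a fresh id


class Table:
    """A table is just a list of pointers to immutable (file_id, rows) files."""

    def __init__(self, partitions=None):
        self.partitions = list(partitions or [])

    def insert(self, rows):
        self.partitions.append((next(_file_ids), tuple(rows)))

    def delete(self, predicate):
        """Replace each affected file with a newly written one."""
        result = []
        for file_id, part_rows in self.partitions:
            kept = tuple(r for r in part_rows if not predicate(r))
            if len(kept) == len(part_rows):
                result.append((file_id, part_rows))     # untouched: same file
            elif kept:
                result.append((next(_file_ids), kept))  # rewritten as a new file
        self.partitions = result

    def clone(self):
        """Copy only the pointers; no rows are duplicated."""
        return Table(self.partitions)

    def rows(self):
        return [r for _, part_rows in self.partitions for r in part_rows]


def files_in_use(*tables):
    """Distinct files referenced by the given tables."""
    return {file_id for t in tables for file_id, _ in t.partitions}


customer = Table()
customer.insert(["alice", "bob"])  # file 1
customer.insert(["carol"])         # file 2

customer_copy = customer.clone()
shared_after_clone = files_in_use(customer, customer_copy)  # still two files

customer_copy.delete(lambda r: r == "bob")  # the clone rewrites file 1 as file 3
customer.insert(["dave"])                   # the original adds file 4
```

Right after cloning, both tables reference the same two files, so no extra data is stored. Once the clone deletes a row, only the affected file is rewritten for the clone; the untouched file is still shared, and the row added to the original later never appears in the clone.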

The following syntax is available in Snowflake:

CREATE TABLE CustomerCopy CLONE Customer;

What is the advantage of zero-copy clones over the traditional method of copying tables?

No additional space is required up front, and cloning is fast.
Snowflake claims that a one-terabyte table can be cloned in 7 minutes on a small warehouse. I think Teradata can compete with this on speed, even though it creates a deep copy of the table.

In Snowflake, you can even create a backup of an entire database in a short time by cloning the whole database (CREATE DATABASE ... CLONE ...).

Is zero-copy cloning a replacement for a database backup?

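Cloning is no replacement for a disaster recovery solution with a backup that is stored redundantly. A clone shares its micro-partitions with the original table, so it offers no protection if those underlying files are lost.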

As demonstrated, creating zero-copy clones (also known as shallow copies of tables) simplifies many processes.

To my knowledge, no parallel shared-nothing system such as Teradata provides this capability.

Would it be feasible to introduce this feature in Teradata? I am not sure. One prerequisite would be that the source table and its clone have identical primary indexes. Beyond that, Teradata’s internal structures, such as the Cylinder Index and Master Index, are not suited to creating shallow copies.

Are there any other MPP systems that support zero-copy cloning besides Snowflake? If so, please share by leaving a comment below.
