1. The Forgotten Performance Trick: A Semicolon That Saves Time

For decades, Teradata developers have quietly used one of the smallest but most powerful performance optimizations in BTEQ:
a semicolon at the start of a line.

INSERT INTO SALES_EU   SELECT * FROM SALES_SRC WHERE REGION='EU'
;INSERT INTO SALES_US  SELECT * FROM SALES_SRC WHERE REGION='US'
;INSERT INTO SALES_APAC SELECT * FROM SALES_SRC WHERE REGION='APAC';

This isn’t just a style choice.
It tells Teradata to combine all statements into one multi-statement request, parsed and executed as a single transaction.

The result?
Less journaling, fewer commits, and dramatically faster loads into empty tables.


2. Why It Works: The Transient Journal and Empty-Table Optimization

To understand why that semicolon speeds things up, you need to understand how Teradata protects data integrity.

Every change to a table normally writes entries to the Transient Journal (TJ) — “before” and “after” images of every modified row.
That’s how Teradata can roll back a failed transaction.

But there’s one important exception.

If the target table is empty at the start of an INSERT … SELECT, Teradata knows it can safely skip per-row journaling:
a rollback can simply discard every data block and return the table to empty, because there are no pre-existing rows to restore.

  • No per-row logging
  • Less I/O
  • Huge performance gain

Now, here’s where the semicolon comes in.

If you run several INSERTs separately:

INSERT INTO SALES SELECT * FROM SALES_Q1;   -- fast (empty table)
INSERT INTO SALES SELECT * FROM SALES_Q2;   -- slow (table now not empty)

Only the first insert skips journaling — because the table becomes non-empty after it.
The next insert logs every new row.

But if you run the same logic as a multi-statement request:

INSERT INTO SALES SELECT * FROM SALES_Q1
;INSERT INTO SALES SELECT * FROM SALES_Q2
;INSERT INTO SALES SELECT * FROM SALES_Q3;

Then Teradata parses and executes all three statements as one request, and therefore as one transaction.
If anything fails, the whole transaction rolls back and the table returns to empty, so no statement needs per-row before-images.
Teradata can therefore apply the empty-table optimization to every insert in the request, not just the first.

Effect: All inserts run at full bulk-load speed.
One transaction, one commit, no per-row journaling.


3. Performance in Practice

  • Loads into empty tables can be 2–3× faster when using multi-statement requests.
  • Once the target already contains data, this benefit disappears — Teradata must log each inserted row.
  • The optimization is especially effective for ETL batches that truncate and reload data daily.
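A minimal BTEQ sketch of that daily truncate-and-reload pattern (the table names SALES and SALES_Q1–Q3 are illustrative):

```sql
-- Illustrative BTEQ fragment of a daily truncate-and-reload job.
-- DELETE ... ALL empties the target quickly; the semicolon-prefixed
-- lines then form ONE multi-statement request, so every INSERT sees
-- an empty table and skips per-row Transient Journal logging.
DELETE FROM SALES ALL;

INSERT INTO SALES SELECT * FROM SALES_Q1
;INSERT INTO SALES SELECT * FROM SALES_Q2
;INSERT INTO SALES SELECT * FROM SALES_Q3;
```

Note that only the final line ends with a semicolon: in BTEQ, a line ending in a semicolon submits the request, while a leading semicolon continues it.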

This trick is why legacy Teradata ETL scripts often used semicolon-prefixed lines for bulk loads.
It wasn’t a stylistic quirk — it was deliberate performance tuning.


4. Snowflake: No Semicolon Needed — Ever

Snowflake also lets you execute multiple statements in a script, but its semicolons serve only as statement separators, not performance optimizers.

Why?
Because the Snowflake engine operates fundamentally differently.

🚫 No Transient Journal

Snowflake doesn’t log each row’s before/after image.
It uses multi-version concurrency control (MVCC) — every DML change creates new immutable micro-partitions.

Rollback is instant: Snowflake simply discards the new partitions.

⚙️ No Per-Row Commits

Committing is a metadata pointer swap, not a disk write.
So there’s no “commit overhead” to optimize.

🧠 No Shared Parse or Multi-Statement Request

Each statement is optimized independently, and there’s no equivalent of Teradata’s single-request block.
Even if you send multiple statements together, Snowflake treats them separately.


5. So Why Have Semicolons in Snowflake at All?

In Snowflake, semicolons are purely syntactic convenience:

CREATE TABLE SALES_EU AS SELECT * FROM SRC WHERE REGION='EU';
CREATE TABLE SALES_US AS SELECT * FROM SRC WHERE REGION='US';

or

BEGIN;
INSERT INTO STAGING SELECT * FROM RAW;
INSERT INTO FINAL SELECT * FROM STAGING;
COMMIT;

They separate statements — but don’t affect how Snowflake optimizes, logs, or commits.
There’s no measurable performance difference between running statements separately or in one script.


6. Different Systems, Different Physics

| Aspect | Teradata | Snowflake |
|---|---|---|
| Transaction model | Transient Journal (row-level undo/redo) | Multi-version concurrency control (micro-partitions) |
| Commit behavior | Physical I/O, journaling, checkpoints | Metadata-only pointer swap |
| Empty-table optimization | Skips journaling, very fast | Always fast; journaling doesn't exist |
| Multi-statement request | Real performance feature | Syntactic only |
| Focus of transaction tuning | Minimizing journaling | Unnecessary by design |

Teradata had to optimize around transaction logging.
Snowflake’s design simply eliminates transaction logging.


7. Performance Comparison

  • Teradata’s semicolon optimization is a genuine performance gain — but limited to empty-table loads.
  • Snowflake achieves similar or better performance by architecture, not by syntax.
  • Multi-Table Inserts (INSERT ALL/FIRST) in Snowflake offer the equivalent benefit of one scan, many targets — something Teradata can’t do natively.
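For instance, the three regional loads from the earlier Teradata example could be written as one Snowflake multi-table insert, so the source is scanned only once (table names are illustrative):

```sql
-- Illustrative Snowflake conditional multi-table insert:
-- one scan of SALES_SRC, rows routed to each target by condition.
INSERT ALL
  WHEN REGION = 'EU'   THEN INTO SALES_EU
  WHEN REGION = 'US'   THEN INTO SALES_US
  WHEN REGION = 'APAC' THEN INTO SALES_APAC
SELECT * FROM SALES_SRC;
```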

8. Practical Takeaways for Migration Engineers

If you’re migrating Teradata ETL jobs to Snowflake:

  1. Remove semicolon batching logic — it has no impact in Snowflake.
  2. Replace multiple inserts with a single INSERT ALL if the goal was to write to several targets.
  3. Stop worrying about the Transient Journal — it doesn’t exist.
  4. Focus on scan efficiency — minimize how many times large tables are read, not how commits are handled.
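As a sketch of point 4, conditional aggregation lets one scan replace several separate queries (the table and column names here are hypothetical):

```sql
-- Instead of scanning a large table once per metric:
--   SELECT COUNT(*) FROM SALES WHERE REGION = 'EU';
--   SELECT COUNT(*) FROM SALES WHERE REGION = 'US';
-- read it once and compute all metrics in a single pass.
SELECT
  COUNT_IF(REGION = 'EU') AS eu_rows,
  COUNT_IF(REGION = 'US') AS us_rows,
  SUM(AMOUNT)             AS total_amount
FROM SALES;
```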

In Teradata, you optimize around physical limitations.
In Snowflake, you optimize around logical efficiency.


9. Final Thought

The semicolon optimization in Teradata was brilliant — a low-level hack that made massive databases perform like finely tuned engines.

But Snowflake plays a different game.

It doesn’t need hacks.
It’s built to scale effortlessly, commit instantly, and recover instantly.

Teradata engineers learned to tame physics.
Snowflake engineers changed the physics.

