Teradata Historisation Part 2/5: Downleveling cases

0
86

teradata historisation

Part 2 of the Teradata historisation merge series discusses the dehistorisation scenarios. Assume that the task is presented to you as bringing table contents into a “stable” or “final” state or as copying the contents into a “present-only” environment while you merge. For an understanding of the table design and content, please refer to part 1 of this article series. Historisation down leveling cases are all of the following:

(1-0-0, 0-1-0, 1-1-0, 0-2-0, 2-2-0, 1-2-0, 0-2-1, 1-2-1, 2-2-1)

 

Case 1-0->0 We downlevel table A to no historisation, merge it with unhistorised table B and insert the result into a table without historisation. In concrete SQL terms:

— CASE1-0->0

SEL COALESCE(B.ENT_ID, A.ENT_ID) AS ENT_ID, A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE, B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE

FROM

(

— dehistorise A to most recent per ENT

sel A.ENT_ID, A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE

FROM TABLE_A1 A

WHERE (A.ENT_ID, A.CDD) IN

(SELECT ENT_ID, MAX(CDD) AS CDD FROM TABLE_A1 GROUP BY 1 )

) A

FULL OUTER JOIN

TABLE_B0 B

ON A.ENT_ID = B.ENT_ID;

 

Case 0-1->0 We downlevel table B this time, all other elements of the task are similar to the previous case.

 

Case 1-1->0 This case of a straight dehistorisation requires to merely select the most recent entries from both tables A and B, strip them of their history fields and insert the result into the target table.

 

Case 0-2->0 As we dehistorise table B, we only select to most new and valid layer from table B per key, leave out the history attributes and merge with table A.

 

Case 2-2->0 As the goal is again to dehistorise; we just have to adapt the SQL code so as to take the high historisation levels on both sides into account. All other assumptions being equal, the SQL looks like this:

— CASE 2-2-> 0

SEL COALESCE(B.ENT_ID, A.ENT_ID) AS ENT_ID, A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE, B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE

FROM

(

— dehistorise A to most recent per ENT

sel A.ENT_ID, A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE

FROM TABLE_A2 A

WHERE A.VED>CURRENT_TIMESTAMP

AND (A.ENT_ID, A.RSD) IN

(SELECT ENT_ID, MAX(RSD) AS RSD FROM TABLE_A2 WHERE VED>CURRENT_TIMESTAMP GROUP BY 1)

) A

FULL OUTER JOIN

(

— dehistorise B to most recent per ENT

sel B.ENT_ID, B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE

FROM TABLE_B2 B

WHERE B.VED>CURRENT_TIMESTAMP

AND (B.ENT_ID, B.RSD) IN

(SELECT ENT_ID, MAX(RSD) AS RSD FROM TABLE_B2 WHERE VED>CURRENT_TIMESTAMP GROUP BY 1)

) B

ON A.ENT_ID = B.ENT_ID;

 

Case 1-2->0 Here, we need to match the two divergently historised tables together correctly to dehistorise the result in the end. To perform this, we have to make the intermediary assumptions that the valid history is to be reduced to the most recent entry per key only and that one historical state of the entity is to be considered the “solid” one in a target without history. Typically, this state is the most recent entry.

— CASE 1-2-> 0

SEL COALESCE(B.ENT_ID, A.ENT_ID) AS ENT_ID, A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE, B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE

FROM

(

— dehistorise A to most recent per ENT

sel A.ENT_ID, A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE

FROM TABLE_A1 A

WHERE (A.ENT_ID, A.CDD) IN

(SELECT ENT_ID, MAX(CDD) AS CDD FROM TABLE_A1 GROUP BY 1 )

) A

FULL OUTER JOIN

(

— dehistorise B to most recent per ENT

sel B.ENT_ID, B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE

FROM TABLE_B2 B

WHERE B.VED>CURRENT_TIMESTAMP

AND (B.ENT_ID, B.RSD) IN

(SELECT ENT_ID, MAX(RSD) AS RSD FROM TABLE_B2 WHERE VED>CURRENT_TIMESTAMP GROUP BY 1)

) B

ON A.ENT_ID = B.ENT_ID;

 

Case 0-2->1 Here we perform a double-step of shifting to a different historisation level while we integrate an unhistorised table. Again, there is room for interpretation on the unhistorised A table and the treatment of entities not present in B. In any case, we treat merged histories with missing B entries as if the history of this combined result started at the merge date. Derived dates based on specific knowledge about the genesis of the source tables are possible as well.

There are two SQL versions based on how we interpret the non-historised part:

— CASE 0-2-> 1, FIRST Interpretation: A Table shows alltime valid history

SELECT

COALESCE(B.ENT_ID, A.ENT_ID) AS ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE, B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE, COALESCE(B.RSD,CURRENT_DATE), COALESCE(B.RED, DATE’3000-12-31′)

FROM

TABLE_B2 B

FULL OUTER JOIN

TABLE_A0 A

ON B.ENT_ID = A.ENT_ID

WHERE

(B.VED>CURRENT_TIMESTAMP

OR B.VED IS NULL )

;

 

— CASE 0-2-> 1, Second Interpretation: A Table shows most recent history only

SELECT

COALESCE(B.ENT_ID, A.ENT_ID) AS ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE, B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE, COALESCE(B.RSD,CURRENT_DATE), COALESCE(B.RED, DATE’3000-12-31′)

FROM

TABLE_B2 B

FULL OUTER JOIN

TABLE_A0 A

ON B.ENT_ID = A.ENT_ID

AND B.RED>DATE

WHERE

(B.VED>CURRENT_TIMESTAMP

OR B.VED IS NULL )

;

 

Case 1-2->1 This is the first instance that demands a more sophisticated treatment because two different history streams have to be aligned such that every time a change occurred either for table A or table B, a new resulting history row emerges. We execute this by first summarizing all distinct RSD points in time from both tables. Then we rejoin both tables to this timeline. Finally, we summarize the constellations along the timeline to create a new RSD and RED history stream.

— CASE 1-2->1

SELECT ENT_ID, THESTARTDT, (CASE WHEN THEENDDT>CURRENT_DATE THEN DATE’3000-12-31′ ELSE THEENDDT END) AS THEENDDT,

C_STATUS ,D_LEVEL_CD , E_VALUE,F_STATUS, G_LEVEL_CD, H_VALUE

FROM

(

SELECT

ENT_ID, C_STATUS ,D_LEVEL_CD , E_VALUE,F_STATUS, G_LEVEL_CD, H_VALUE, MIN(THEDATE) AS THESTARTDT, MAX(CASE WHEN THEEND<THEDATE THEN RED ELSE THEEND END) AS THEENDDT

FROM

(

SELECT U.ENT_ID, U.RSD AS THEDATE,

MAX(U.RSD) OVER (PARTITION BY U. ENT_ID ORDER BY U. ENT_ID, U.RSD ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING ) -1 as THEEND,

A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE, A.RSD, A.RED,B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE

FROM

(select ENT_ID, RSD

FROM

(— Convert A history to RSD, RED

sel A.ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE,

MIN(CDD) AS RSD,

MAX(CDD) AS RED

FROM TABLE_A1 A

GROUP BY 1,2,3,4 ) A

GROUP BY 1,2

UNION

select ENT_ID, RED

FROM

(— Convert A history to RSD, RED

sel A.ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE,

MIN(CDD) AS RSD,

MAX(CDD) AS RED

FROM TABLE_A1 A

GROUP BY 1,2,3,4 ) A

GROUP BY 1,2

UNION ALL

select ENT_ID, RSD

FROM

TABLE_B2 B

WHERE B.VED>CURRENT_TIMESTAMP

group by 1,2

UNION

select ENT_ID, RED

FROM

TABLE_B2 B

WHERE B.VED>CURRENT_TIMESTAMP

group by 1,2

) U

LEFT OUTER JOIN

(— Convert A history to RSD, RED

SEL A.ENT_ID, C_STATUS ,D_LEVEL_CD , E_VALUE, A.RSD, (CASE WHEN A.RED= AM.THEMAXDT THEN DATE’3000-12-31′ ELSE A.RED END) AS RED

FROM

(sel A.ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE,

MIN(CDD) AS RSD,

MAX(CDD) AS RED

FROM TABLE_A1 A

GROUP BY 1,2,3,4 ) A

JOIN

(Select A.ENT_ID, MAX(CDD) AS THEMAXDT

FROM TABLE_A1 A

GROUP BY 1 ) AM

ON A.ENT_ID = AM.ENT_ID

) A

ON U.ENT_ID=A.ENT_ID

AND U.RSD BETWEEN A.RSD AND A.RED

LEFT OUTER JOIN

TABLE_B2 B

ON U.ENT_ID=B.ENT_ID

AND U.RSD BETWEEN B.RSD AND B.RED

AND B.VED>CURRENT_TIMESTAMP

) B

GROUP BY ENT_ID, C_STATUS ,D_LEVEL_CD , E_VALUE,F_STATUS, G_LEVEL_CD, H_VALUE

–HAVING THESTARTDT <=CURRENT_DATE

) F

order by 1,2,3;

 

Case 2-2->1 Practically speaking, this case is just a variation of the 1-2->1 case with both sides now having a validity layer that we flatten away by whittling down the list of rows to using the valid time layer only.

— CASE 2-2->1

SELECT ENT_ID, THESTARTDT, (CASE WHEN THEENDDT>CURRENT_DATE THEN DATE’3000-12-31′ ELSE THEENDDT END) AS THEENDDT,

C_STATUS ,D_LEVEL_CD , E_VALUE,F_STATUS, G_LEVEL_CD, H_VALUE

FROM

(

SELECT

ENT_ID, C_STATUS ,D_LEVEL_CD , E_VALUE,F_STATUS, G_LEVEL_CD, H_VALUE, MIN(THEDATE) AS THESTARTDT, MAX(CASE WHEN THEEND<THEDATE THEN RED ELSE THEEND END) AS THEENDDT

FROM

(

SELECT U.ENT_ID, U.RSD AS THEDATE,

MAX(U.RSD) OVER (PARTITION BY U. ENT_ID ORDER BY U. ENT_ID, U.RSD ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING ) -1 as THEEND,

A.C_STATUS ,A.D_LEVEL_CD , A.E_VALUE, A.RSD, A.RED,B.F_STATUS, B.G_LEVEL_CD, B.H_VALUE

FROM

(select ENT_ID, RSD

FROM

(— Convert A history to RSD, RED

sel A.ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE,

MIN(RSD) AS RSD,

MAX(RED) AS RED

FROM TABLE_A2 A

WHERE A.VED>CURRENT_TIMESTAMP

GROUP BY 1,2,3,4 ) A

GROUP BY 1,2

UNION

select ENT_ID, RED

FROM

(— Convert A history to RSD, RED

sel A.ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE,

MIN(RSD) AS RSD,

MAX(RED) AS RED

FROM TABLE_A2 A

WHERE A.VED>CURRENT_TIMESTAMP

GROUP BY 1,2,3,4 ) A

GROUP BY 1,2

UNION ALL

select ENT_ID, RSD

FROM

TABLE_B2 B

WHERE B.VED>CURRENT_TIMESTAMP

group by 1,2

UNION

select ENT_ID, RED

FROM

TABLE_B2 B

WHERE B.VED>CURRENT_TIMESTAMP

group by 1,2

) U

LEFT OUTER JOIN

(— Convert A history to RSD, RED

SEL A.ENT_ID, C_STATUS ,D_LEVEL_CD , E_VALUE, A.RSD, (CASE WHEN A.RED= AM.THEMAXDT THEN DATE’3000-12-31′ ELSE A.RED END) AS RED

FROM

( sel A.ENT_ID, A. C_STATUS ,A.D_LEVEL_CD , A.E_VALUE,

MIN(RSD) AS RSD,

MAX(RED) AS RED

FROM TABLE_A2 A

WHERE A.VED>CURRENT_TIMESTAMP

GROUP BY 1,2,3,4 ) A

JOIN

( Select A.ENT_ID, MAX(RED) AS THEMAXDT

FROM TABLE_A2 A

WHERE A.VED>CURRENT_TIMESTAMP

GROUP BY 1 ) AM

ON A.ENT_ID = AM.ENT_ID

) A

ON U.ENT_ID=A.ENT_ID

AND U.RSD BETWEEN A.RSD AND A.RED

LEFT OUTER JOIN

TABLE_B2 B

ON U.ENT_ID=B.ENT_ID

AND U.RSD BETWEEN B.RSD AND B.RED

AND B.VED>CURRENT_TIMESTAMP

) B

GROUP BY ENT_ID, C_STATUS ,D_LEVEL_CD , E_VALUE,F_STATUS, G_LEVEL_CD, H_VALUE

–HAVING THESTARTDT <=CURRENT_DATE

) F

order by 1,2,3;

Our Reader Score
[Total: 2    Average: 5/5]
Teradata Historisation Part 2/5: Downleveling cases written by Paul Timar on April 24, 2014 average rating 5/5 - 2 user ratings

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.