fbpx

Teradata Fastload with Python

By DWH Pro Admin

January 24, 2021


In this article, we assume that you already have basic knowledge about Python. We will show step by step how easy it is to load data using Python and the Teradata SQL Driver for Python using Fastload.

If you are using Windows, we recommend using WinPython if you don’t have Python installed yet. Because there is already Jupyter Notebook included, we use it to show how easy it is to load data into Teradata.

You can download the latest version of WinPython here:

Download latest version of WinPython

Below you can see the target table definition:

CREATE MULTISET TABLE DWHPRO.PythonFastload 
(
      PK  INTEGER,
      TheDate DATE FORMAT 'YY/MM/DD',
      TheInteger INTEGER,
      TheDecimal DECIMAL(10,2),
      TheVarchar VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC
) PRIMARY INDEX ( PK );

The Required Packages

We need the following packages to access Teradata:

  • teradatasql
  • pandas
  • numpy
  • faker

The additional packages (pandas, numpy, faker) are only used to create some fake and random data we can load:

import pandas as pd
import numpy as np
import teradatasql

import faker.providers as f
from faker import Faker
fake=Faker()
from datetime import datetime

In the next step we connect to Teradata and open a cursor:

con = teradatasql.connect ('{"host":"127.0.0.1","user":"dbc","passsword":"dbc","database":"DWHPRO"}')
cur = con.cursor ()

We create a helper function to have some random dates to insert:

def random_dates(start, end, size):
divide_by = 24 * 60 * 60 * 10**9
start_u = start.value // divide_by
end_u = end.value // divide_by
return pd.to_datetime(np.random.randint(start_u, end_u, size), unit="D").date

Function executing the Fastload

Below function uses a data frame for the random/fake data we want to load.

def fload(size_tran = 100000):
    """Uses Teradata FastLoad method to load data
    """
    try:
        sRequest = "DROP TABLE DWHPRO.PythonFastload"  
        print (sRequest)
        cur.execute (sRequest)
    except Exception as ex:
        print ("Ignoring", str (ex).split ("\n") [0])

    sRequest = """CREATE MULTISET TABLE DWHPRO.PythonFastload  ( 
     PK INTEGER
    ,TheDate DATE
    ,TheInteger INTEGER
    ,TheDecimal decimal(10,2)
    ,TheVarChar varchar(20)
    )
    """#
    print (sRequest)
    cur.execute (sRequest)

    
    sRequest = "{fn teradata_nativesql}{fn teradata_autocommit_off}"
    print (sRequest)
    cur.execute (sRequest)

    size = 100000 
    start = datetime.now()
    tran = pd.DataFrame( )
    tran["PK"] = [x+1 for x in range(size_tran)]  
    tran["TheDate"] = random_dates(start=pd.to_datetime('2021-01-01'), end=pd.to_datetime('2021-12-31'), size=size_tran)
    tran["TheInteger"] = np.random.choice([x+1 for x in range(size)] , size =size_tran)
    tran["TheDecimal"]= [ np.random.randint(10, 100000) / 100 for x in range(size_tran)] 
    tran["TheVarChar"]=   np.random.choice([fake.iban()  for _ in range(1000)] , size = size_tran)


    tran_list = tran.values.tolist()
    parameter = ",?"*len(tran_list[0]) 
    parameter = parameter[1:]

    aaoValues= tran.values.tolist()

    sInsert = "{fn teradata_try_fastload}INSERT INTO DWHPRO.PythonFastload ("+parameter+")"

    print (sInsert)
    cur.execute (sInsert, aaoValues)

    con.commit()
   
    print(f"transaction: data generated {datetime.now()-start}") 
    print(f"transaction: data inserted {datetime.now()-start}")

Finally, we call the function and execute the Fastload, close the cursor, and the connection to Teradata:

fload(size_tran = 100000)

cur.close()
con.close()

If you get an error message when playing with this code in Junyper notebook that too many load or unload tasks are active, you can easily kill these sessions with the below command:

SELECT AbortSessions (1, 'DBC', 0, 'Y', 'N');

The second parameter kills all sessions of this user. If you want to kill a specific session instead, put there the sessionid.

Buy now at Amazon
  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >