Data Modeling and the Teradata FSLDM
- What is A Data Model?
- What are the Major Types of Data Models?
- What is FS-LDM?
- How is FS-LDM Implemented?
- Dimensional Modeling (DM) is a modeling Technique in data warehousing, visually represented as a fact table surrounded by dimension tables.
- Some of the terms commonly used in this type of modeling:
- Dimension: A category of information. For example, the time dimension.
- Attribute: An individual level within a dimension. For example, Month is an attribute in the Time Dimension.
- Hierarchy: The specification of levels that represents the relationship between different attributes within a dimension. For example, one possible hierarchy in the Time dimension is Year → Quarter → Month → Day.
- Fact Table: A fact table is a table that contains the measures of interest. For example, sales amount would be such a measure.
- Lookup Table: The lookup table provides the detailed information about the attributes. For example, the lookup table for the Quarter attribute would include a list of all of the quarters available in the data warehouse.
- In designing data models for data warehouses/data marts, the most commonly used schema types are Star Schema and Snowflake Schema.
- Star Schema: In the star schema design, a single object (the fact table) sits in the middle and is radially connected to other surrounding objects (dimension lookup tables) as a star. A star schema can be simple or complex. A simple star consists of one fact table; a sophisticated star can have more than one fact table.
- Snowflake Schema: The snowflake schema is an extension of the star schema, where each point of the star explodes into more points. The main advantage of the snowflake schema is the improvement in query performance due to minimized disk storage requirements and joining smaller lookup tables. The main disadvantage of the snowflake schema is the additional maintenance efforts needed due to the increasing number of lookup tables.
Relational Data Model
- The relational data model is entirely based on the predicate logic and set theory of mathematics.
- The data is arranged in a relation which is visually represented in a two-dimensional The data is inserted into the table in the form of tuples (which are nothing but rows). A tuple is created by one or more than one attributes, which are used as basic building blocks in the formation of various expressions that are used to derive a meaningful information. There can be any number of tuples in the table, but all the tuple contain fixed and same attributes with varying values.
- The relational model is implemented in database where a table represents a relation; a row represents a tuple, an attribute is represented by a column of the table
All the relational algebra operations, such as Select, Intersection, Product, Union, Difference, Project, Join, Division, Merge, etc. can also be performed on the Relational Database Model. Operations on the Relational Database Model are facilitated with the help of different conditional expressions, various key attributes, pre-defined constraints, etc. For example selection of information of the customer, who is living in some city for more than 20 years.
- Following are few terms used in relational database model:
Candidate Key: Any field or a combination of fields that identifies a record uniquely is called a Candidate Key. The Candidate Key cannot contain a NULL value and should always provide a unique value.
Primary Key: Primary key is nothing but a candidate key that identifies a record uniquely.
Foreign Key: A Foreign key is a primary key for another table, in which it uniquely identifies a record. A Foreign Key defines the relation between two (or more) tables. A Foreign key can contain a NULL
Constraints: Constraints are logic rules that are used to ensure data consistency or avoid certainly unacceptable operations on the data.
Relational Model: Summary
- A tabular representation of data.
- Straightforward and intuitive, currently the most widely used.
- Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations.
- Two critical ICs: primary and foreign keys
- Also, we always have domain constraints.
- Powerful and natural query languages exist.
Introduction to the Teradata FSLDM
LOGICAL DATA MODEL
- A logical data model is a graphical representation of how data is organized. It specifically defines which individual data elements are stored and how they relate to each other.
- The data model ultimately defines which business questions can be answered and thus determines the business value of the entire strategic and tactical data warehouse environment.
Data Modeling Notations
Why Do a Logical Data Model
- The Logical Data Model is a representation of the things of significance to an enterprise and the relationships between those things. A Logical Data Model reflects the rules of the business.
- A business rule, in turn, indicates how a company does business and declares its operating principles and policies.
- Two examples are: “An account may belong to more than one individual, or an individual can have one or more addresses.
- The Logical Data Model is a way to show the rules of the business graphically. It can function as a means of Communication and as a data management tool.
- Also: It serves as a roadmap for achieving data integration in an organization. It is a guide for development over the long term.
- It shows interlocking parts. Understanding all of the interdependent parts makes it easier to expand the model for future enhancements.
- It forms a foundation upon which to build applications or business views.
- It allows you to recognize data redundancy and to control it. Data redundancy can lead to inaccurate and inconsistent reporting of business information.
- It serves as a starting point for developing the physical model amid physical database design.
- It aids communication between the analyst and the business user and between
- As a rigorous Technique. It imposes discipline on the warehouse development process and leads to the development of stable. Robust. Long term and reliable solutions.
- As a communication tool. It allows you to understand what you are building before you build it. Moreover, it allows you to see what you made after you built it.
What is the Teradata FSLDM?
- Teradata Financial Services Logical Data Model, also known as the Teradata FSLDM. The Teradata FSLDM is a third normal form; logical data model defined down to the specific attribute level.
- It can be used as a starting point for financial institutions and insurance companies who are interested in a rapid approach for achieving an organized and integrated view of their business data.
- The term ‘Financial Services” encompasses securities/investments, insurance, retail banking, Commercial banking, credit card and other financially related products And services.
This data model is used by any business that sells the products above and services, including banks. Insurance Companies, credit card companies, and brokerages.
What are the benefits of using the Teradata FSLDM?
- It is a flexible and dynamic model that you can use for the long term.
- This means that as your business changes (e.g. you change your organizational structure or product offerings) there is minimal impact or no impact on the data model. You have a solid database design. Moreover, it is extendable.
- It jump-starts the data modeling step in the development life cycle. It is a good starting point when developing your logical data model, and provides checkpoints to make sure you have covered the necessary areas in your scope.
- It is the assimilation of Teradata best-practices knowledge. Gained from many years of designing logical data models for financial institutions.
- It is the result of analyzing many customer-based data models that Teradata has developed for other industries.
- It has been validated at customer engagements.
- It saves time. By using the experience built in this model much less time is needed. Moreover, therefore cost less.
- It is cross-functional and integrated—has one customer view across all products and functions.
- It is a roadmap for future integration and enhancement. You can start small but grow into the model knowing that it will integrate in the future.
- It is broad in scope, robust. Flexible. Moreover, customer-oriented. The model tracks relationships with any party of interest to the financial institutions. It supports a customer management vision—an Environment that enables financial institutions to develop comprehensive knowledge about their customers and to optimize the profitability of each relationship.
- It is a data warehouse data model—ideal for use as a basis for Customer Relationship Management (CRM).
- It is third normal form. The design is pure and unconstrained by operational or physical considerations. It can be implemented as near third normal form on Teradata. A third normal form model has one fact in one place—in the right place. A third normal form model has no redundant data elements. (A model with redundant data elements can lead to misinformation.)
- It is not based on a star schema and hence is not limited to a finite collection of business questions. A third normal form model is flexible in answering unanticipated business questions. With a star schema design. You are limited as to the type of business questions you can ask.
- It accommodates changes of currency, as it is the case with the Euro.
FSLDM Implementation on Teradata
- The FS-LDM is implementation on Teradata consists of following Steps / Phases
- Implementation of the Business Rules in the Logical Model
- Mapping the Source Data columns to the Destination Data columns
- Converting the Logical Model into Physical Data Model by Modeling Tools (Erwin)
- Implementation of the Physical Model into the Database (Object Creation, PI, PPI Decisions, Compression Analysis, and Implementation)
- Development and implementation of the ETL Jobs on the Tables
- Application Development / Access of the Data
Implementation of the Business Rules in the Logical Model
- Usual Methods
- Discussions with the Data design Architects, business stakeholders and coming up with detailed Business Rules for Implementation
- Workshops for the Design Etc
- The Output is the full Business Rules for Implementation and a signoff by involved parties.
A comprehensive Plan of Implementation is Drawn-out by the Design Architects, for the implementations
Creation of Logical Data Modeling
- The Data Architect will then create the Logical Data model, using the Various LDM
- The Structures of the Tables, Relationships (Business Rules ), etc. are Logically Designed and signed off for Physical Implementation
- The Most commonly used tool for this is ER-Win
- The Logical Data Model is stored on a Common project server; Access restricted to the Data Modeling team
Mapping the Source Data columns to the Destination Data columns
- This is usually the Task of the Data Modeling Team, in consultation with the LDM, and the Business Users
- In-depth Discussions with the Business users to Know about the Various Input Sources, and source files, source Columns, etc.
- The Mapping process also looks into the Destination table structure, decides on the destination table columns Etc
- Note that the Mapping done is on the LDM, and the Details are Passed on to the ETL Team. None of the Physical Implementation is done at this Stage
Converting the Logical Model into Physical Data Model by Modeling Tools
- The Logical Data Model is then transformed into the Physical Model
- The Process usually is done using the Data Modeling Tool (such as ERWin)
- The Basic Object Definition, along with Primary, and Secondary Keys, Relationships, etc. are defined at this Stage in Physical Model
Implementation of Physical Data Model
- The DBA’s implements and converts the LDM into a Physical Model.
- This include
- Object Creation
- Object Access
- Data Integrity Process
- Primary Key, and Secondary Key Implementation
- Load Jobs Creation
- Control Tables implementation
- Compression Analysis and implementation
- ETL Jobs
- Once the Implementation of the Model is Completed the Development Team will then either Re-Create or Develop from Scratch, the ETL Jobs involved in loading the Tables
- Phase wise implementation are also looked at in a similar fashion
Application Development and Data Access
- Once the ETL jobs are developed and implemented, application and end users are given access to the data tables, as required by the application teams
- The reporting team looks at implementation of reports
- The Tactical Workload Team implements the Required TTW Workload on the Tables