Skip to content

Latest commit

 

History

History
74 lines (50 loc) · 3.01 KB

File metadata and controls

74 lines (50 loc) · 3.01 KB

ETL-Validation-with-python-scripting

Cross DB ETL Validation (Oracle to postgresql) with python scripting

Table of Contents

Overview

Design and Develop ETL pipeline from Oracle to Postgresql using informatica Powercenter.
Perform ETL validation using Python scripting.

ETL Validation using Python scripting

• Connect to Source: oracle DB from Config file
• Read Source queries and load to DataFrame
• Connect to Target: pgAdmin from config file
• Read Target Queries and load to DataFrame
• Validate data with the help of Pandas module in python
• Check for count, duplicates, null, truncation and transformation in source and target
• Write test results into a separate file

image image

Design and Development

• Import source and target details to the Designer
• Design Mapping from source to target with necessary transformation
• This is Cross DB ETL. Source is Oracle and Target is Pgadmin
• Lookup Transformation is used to fetch department name
• Filter transformation to limit the entry of data into target table
• Expression transformation used to replace null values for designation
• Sorter transformation is used to get distinct employee ids and to sort employees in Ascending order

image image

Data Validation

• Check Run properties and session logs for the detailed report
• Check Count in Source and Target tables
• Check all the transformation logic are fulfilled such as (duplicates not loaded, Null values replaced)
• Check any data is truncated
• Using python scripts automate the testing of data validation and get results into a text file.

image


🛠 Tools used

Authors

🛠 Skills

SQL, ETL, Python, Power BI...

🔗 Links

linkedin