Cross DB ETL Validation (Oracle to postgresql) with python scripting
Design and Develop ETL pipeline from Oracle to Postgresql using informatica Powercenter.
Perform ETL validation using Python scripting.
• Connect to Source: oracle DB from Config file
• Read Source queries and load to DataFrame
• Connect to Target: pgAdmin from config file
• Read Target Queries and load to DataFrame
• Validate data with the help of Pandas module in python
• Check for count, duplicates, null, truncation and transformation in source and target
• Write test results into a separate file
• Import source and target details to the Designer
• Design Mapping from source to target with necessary transformation
• This is Cross DB ETL. Source is Oracle and Target is Pgadmin
• Lookup Transformation is used to fetch department name
• Filter transformation to limit the entry of data into target table
• Expression transformation used to replace null values for designation
• Sorter transformation is used to get distinct employee ids and to sort employees in Ascending order
• Check Run properties and session logs for the detailed report
• Check Count in Source and Target tables
• Check all the transformation logic are fulfilled such as (duplicates not loaded, Null values replaced)
• Check any data is truncated
• Using python scripts automate the testing of data validation and get results into a text file.
SQL, ETL, Python, Power BI...