Great Expectations (GE) - Execute Business Rules

Description

The "GE - Execute Business Rules" Custom Step enables SAS Studio Flow users to run business rules on data using the Python Great Expectations package. It is used to measure data quality in terms of accuracy, validity, completeness, uniqueness, and timeliness. Generally, a business rule produces a single result, True if the rule "fired" successfully and False otherwise, indicating whether the data meets the expectations of the rule.
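
To make the True/False outcome concrete, here is a minimal sketch in plain Python. It does not use the Great Expectations API: the function `check_not_null` and the sample rows are hypothetical, and the returned dictionary only loosely mirrors the `success` flag that GE reports for a rule.

```python
from typing import Any, Dict, List


def check_not_null(rows: List[Dict[str, Any]], column: str) -> Dict[str, Any]:
    """Hypothetical completeness rule: every row must have a value in `column`."""
    unexpected = [row for row in rows if row.get(column) is None]
    return {
        "rule": f"{column} must not be null",
        "success": len(unexpected) == 0,  # True only when no row violates the rule
        "unexpected_count": len(unexpected),
    }


# Illustrative rows only, not taken from the sample dataset
trips = [
    {"trip_id": 1, "passenger_count": 2},
    {"trip_id": 2, "passenger_count": None},
]

complete_ids = check_not_null(trips, "trip_id")
complete_passengers = check_not_null(trips, "passenger_count")
print(complete_ids["success"], complete_passengers["success"])
```

The first rule succeeds because every row has a `trip_id`; the second fails because one row is missing `passenger_count`.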

A publicly available sample CSV file containing 10,000 rows of Yellow Taxi trip records from the NYC taxi data is provided in the sample_dataset folder for testing purposes.

This version of the Custom Step includes the following Great Expectations rules:

In addition, further information on the glossary of expectations can be found here.

Why This Custom Step?

This node was created as a demo to implement Great Expectations' data quality rules in a SAS Studio Flow and to assess how well the Custom Step framework supports GE rules. See the Great Expectations - Generate Expectation Suite node for how to generate rules in SAS Studio Flow.

The node has not been extensively tested. Feedback is welcome, both to improve it and to suggest additional capabilities.

User Interface

GE - Execute Business Rules Step

SAS Studio Flow execution of a business rule example

Create Rules Tab

Main properties of the rules generation node:

Options Section

Options for the selected Great Expectations rule. An option is displayed only if it is an argument of the chosen rule.
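
One way to picture this behaviour, as an illustration rather than a description of how the step is implemented, is to compare a list of candidate options against the arguments of the chosen rule. The rule functions, their signatures, and the `ALL_OPTIONS` list below are hypothetical stand-ins.

```python
import inspect
from typing import Callable, List, Optional


# Hypothetical stand-ins for rules; the signatures are illustrative only
def values_between(column: str, min_value: Optional[float] = None,
                   max_value: Optional[float] = None) -> None: ...


def values_not_null(column: str) -> None: ...


# Hypothetical pool of options the interface could offer
ALL_OPTIONS = ["column", "min_value", "max_value", "regex"]


def visible_options(rule: Callable[..., None]) -> List[str]:
    """Return only the options that are arguments of the given rule."""
    params = inspect.signature(rule).parameters
    return [opt for opt in ALL_OPTIONS if opt in params]


print(visible_options(values_between))
print(visible_options(values_not_null))
```

In this sketch a range rule exposes `column`, `min_value`, and `max_value`, while a not-null rule exposes only `column`.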

Output Tab

Main outputs for running a rule on data:

  • Good records table - contains records that meet the rule's criteria
  • Bad records table - contains records that fail the rule's criteria
  • Exceptions table - contains statistics on the failed records and identifies which rule fired
  • Rules report - GE's .json output parsed into an output table
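
The relationship between the good records, bad records, and exceptions outputs can be sketched in plain Python. This is an assumption-laden illustration, not the step's actual code: `split_by_rule`, the summary field names, and the sample rows are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def split_by_rule(rows: List[Row], rule_name: str,
                  passes: Callable[[Row], bool]) -> Tuple[List[Row], List[Row], Row]:
    """Split rows into good and bad records and summarise the failures."""
    good = [row for row in rows if passes(row)]
    bad = [row for row in rows if not passes(row)]
    # Hypothetical summary row, standing in for one line of an exceptions table
    exceptions = {
        "rule": rule_name,
        "records_checked": len(rows),
        "records_failed": len(bad),
        "percent_failed": 100.0 * len(bad) / len(rows) if rows else 0.0,
    }
    return good, bad, exceptions


# Illustrative rows only, not taken from the sample dataset
trips = [
    {"trip_id": 1, "fare_amount": 12.5},
    {"trip_id": 2, "fare_amount": -3.0},
    {"trip_id": 3, "fare_amount": 7.0},
    {"trip_id": 4, "fare_amount": 0.0},
]

good, bad, exceptions = split_by_rule(
    trips, "fare_amount must not be negative", lambda row: row["fare_amount"] >= 0
)
print(len(good), len(bad), exceptions["percent_failed"])
```

Every input row lands in exactly one of the two record tables, so their sizes always add up to the number of rows checked.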

About Tab

Requirements

Built and tested on SAS Viya Stable Release 2023.03.

Python great_expectations library, version 0.15.0 or later.
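
A quick way to check the version requirement from a Python session might look like the sketch below. It uses only the standard library; the helper names are hypothetical, and the version comparison is deliberately simple (it looks at the first three numeric components and ignores pre-release suffixes).

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '0.15.0' into (0, 15, 0); non-numeric suffixes are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.15.0") -> bool:
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package: str = "great_expectations") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("great_expectations is not installed")
else:
    print(found, "OK" if meets_minimum(found) else "is older than 0.15.0")
```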

Usage

Installation

Download the GE - Execute Business Rules.step file, upload it into your environment and start using it.

This Custom Step requires that Python be deployed and available in your SAS environment. The easiest way to achieve this is to enable and configure the sas-pyconfig job, which also installs the GE package, following the steps indicated in this article.

Alternatively, if Python is already available, you can install GE with pip from within a Python session. Follow the steps below to get GE into your environment:

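import os
import subprocess
import sys

# Show the working directory, which is where --target=. installs the package
os.getcwd()

# pip.main() is no longer part of pip's public API; invoke pip as a module instead
subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'great_expectations', '--target=.'])

# Extend the module search path; adjust the directory to match where the package was installed
sys.path.append('./local/bin')

sys.path

import great_expectations as ge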

A demo is provided in the example below:

Example

Change Log

Version 1.1 (11OCT2023)

  • Renamed the .step file and added the About tab.

Version 1.0 (28APR2023)

  • Initial version