Skip to content

Latest commit

 

History

History
64 lines (45 loc) · 2.81 KB

File metadata and controls

64 lines (45 loc) · 2.81 KB

ADLS File Reader

Description

The Import - ADLS File Reader provides an easy way to connect and read Parquet and Delta Lake files from Azure Data Lake Storage (ADLS) to SAS Compute or CAS.

It supports reading snappy compressed Parquet and DeltaLake file formats and allows reading from partitioned tables (hierarchical nested subdirectories structures commonly used when partitioning the datasest a very common approach when storing datasets on data lakes). Its supports expression filters push-down using any of the dataset fields which avoid reading and transferring unnecessary data between the origin and source destination (when used with partitioned fields it's known as partition pruning)

This custom step helps to work around some of the restrictions that currently exist for working with Parquet files in SAS Viya. Please check the following documentation that lists those restrictions for the latest SAS Viya release:

User Interface

  • Options tab

    Standalone mode Flow mode
  • Options tab

  • About tab

Requirements

This customs step depends on having a python environment configured with the following libraries installed:

  • pandas
  • saspy
  • azure-identity
  • pyarrow
  • adlfs

Tested on Viya version Stable 2023.03 with python environment version 3.8.13 and the libraries versions:

  • pandas == 1.5.2
  • saspy == 4.3.3
  • azure-identity == 1.12.0
  • pyarrow == 10.0.1
  • adlfs == 2023.1.0

Usage

Change Log

  • Version 2.0.1 (25APR2023)
    • Fixed issues related to reading non-partitioned parquet and deltalake files
  • Version 2.0 (20APR2023)
    • Added support for reading Delta Lake file format
    • Removed pyarrowfs-adlgen2 python dependency
    • Added adlfs python library as filesystem library implementation to access ADLS
    • Some code refactoring focusing on a object-oriented implementation for ADLSFileReader
  • Version 1.0 (02FEB2023)
    • Initial version