- Batch processing with Apache Spark and Python 3 for data exploration
- The dataset was downloaded from https://www.kaggle.com/
- Focuses on the PySpark SQL libraries, using the following imports:
- from pyspark.sql.types import BooleanType
- from pyspark.sql.functions import udf
- from pyspark.sql import functions as F
- from pyspark.sql import SparkSession
- from pyspark.sql import Window