vsvale/Apache-PySpark-by-Example

Apache PySpark by Example

Want to get up and running with Apache Spark as soon as possible? If you're well versed in Python, the Spark Python API (PySpark) is your ticket to accessing the power of this hugely popular big data platform. This practical, hands-on course helps you get comfortable with PySpark, explaining what it has to offer and how it can enhance your data science work. To begin, instructor Jonathan Fernandes digs into the Spark ecosystem, detailing its advantages over other data science platforms, APIs, and tool sets. Next, he looks at the DataFrame API and how it's the platform's answer to many big data challenges. Finally, he goes over Resilient Distributed Datasets (RDDs), the building blocks of Spark.

Author

Jonathan Fernandes

Learning Platform

LinkedIn Learning

What I covered

  • Benefits of the Apache Spark ecosystem
  • Working with the DataFrame API
  • Working with columns and rows
  • Leveraging built-in Spark functions
  • Creating your own functions in Spark
  • Working with Resilient Distributed Datasets (RDDs)
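
The topics above can be illustrated with a small sketch. It is not taken from the course material: the sample rows, the column names, and the helper names `label_length` and `run_demo` are all hypothetical, and it assumes a local PySpark installation. The UDF logic is kept in a plain Python function so it can be checked without starting Spark.

```python
def label_length(name):
    """Plain-Python logic for the UDF: classify a name by its length."""
    if name is None:
        return None
    return "long" if len(name) > 5 else "short"


def run_demo():
    """Build a tiny DataFrame and touch each topic in the list above."""
    # Imported here so label_length can be tested without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("pyspark-by-example-sketch")
        .getOrCreate()
    )

    # DataFrame API: create a DataFrame from in-memory rows (made-up data).
    df = spark.createDataFrame(
        [("Alice", 34), ("Bob", 45), ("Catherine", 29)],
        ["name", "age"],
    )

    # Columns and rows: filter rows, then add a derived column.
    result = (
        df.filter(F.col("age") >= 30)
        .withColumn("age_next_year", F.col("age") + 1)
    )

    # Built-in functions: upper-case a string column.
    result = result.withColumn("name_upper", F.upper(F.col("name")))

    # Your own functions: wrap the plain-Python function as a UDF.
    label_udf = F.udf(label_length, StringType())
    result = result.withColumn("name_label", label_udf(F.col("name")))
    result.show()

    # RDDs: drop down to the underlying RDD and sum the ages with map/reduce.
    total_age = df.rdd.map(lambda row: row["age"]).reduce(lambda a, b: a + b)
    print(total_age)

    spark.stop()
```

Calling `run_demo()` starts a single-threaded local Spark session, prints the transformed rows, and stops the session. Built-in functions such as `F.upper` are generally preferable to UDFs where one exists, since Python UDFs add serialization overhead between the JVM and the Python worker.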
