Ranking Wikipedia pages with Spark

The objective of this project is to assign a rank to each Wikipedia page using Spark. The project first preprocesses the data and then applies the PageRank algorithm. I recommend running the code on Databricks with a small sample of Wikipedia pages.

This project is developed with Python and Spark.
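
To give a rough idea of what the ranking step computes, here is a minimal sketch of PageRank in plain Python (not Spark, and not this project's actual code). The function name `page_rank`, the damping factor of 0.85, the iteration count, and the four-page link graph are all assumptions made for illustration. In a Spark version, the inner loop would typically become a join between the link and rank datasets followed by a per-page sum of contributions.

```python
from collections import defaultdict


def page_rank(links, damping=0.85, iterations=50):
    """Iteratively compute a rank for every page.

    links: dict mapping a page to the list of pages it links to.
    damping and iterations are illustrative defaults, not project settings.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution over all pages.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        contribs = defaultdict(float)
        dangling = 0.0
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among its out-links.
                share = ranks[page] / len(targets)
                for target in targets:
                    contribs[target] += share
            else:
                # Pages without out-links spread their rank over all pages.
                dangling += ranks[page]
        ranks = {
            page: (1 - damping) / n + damping * (contribs[page] + dangling / n)
            for page in pages
        }
    return ranks


# Hypothetical toy link graph standing in for preprocessed Wikipedia links.
sample_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = page_rank(sample_links)
for page, rank in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(rank, 4))
```

The ranks sum to 1, so each value can be read as the share of attention a page receives. Pages that many others link to, such as `C` in this toy graph, end up near the top.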