All required libraries are available in the Anaconda distribution of Python. The code should run on any Python 3.* version.
In this project I analyse a dataset to show how data analysis can help us extract insights from data. The project uses one of the well-known open datasets on the internet, the Airbnb dataset (the Boston Airbnb and Seattle Airbnb datasets).
Here, I try to answer four questions:
- What is the most common property type for rent in each city?
- What aspects correlate well to the listing review scores rating?
- What is the average home rental price in Seattle and Boston, and in which seasons do the prices spike?
- How can I predict listing prices in Boston and Seattle?
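As a rough illustration of the first and third questions, the sketch below counts property types and averages prices per month using only the standard library. It is not the notebook's code: the column names `property_type`, `date`, and `price`, the `$1,250.00` price format, and the sample rows are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date
from statistics import mean


def parse_price(text):
    """Turn a price string such as '$1,250.00' into a float (None if blank)."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else None


def most_common_property_type(listings):
    """Return the most frequent value of the assumed 'property_type' column."""
    counts = Counter(row["property_type"] for row in listings if row["property_type"])
    return counts.most_common(1)[0][0]


def average_price_by_month(calendar):
    """Average the assumed 'price' column per calendar month (1-12)."""
    by_month = defaultdict(list)
    for row in calendar:
        price = parse_price(row["price"])
        if price is not None:  # rows without a price are skipped
            month = date.fromisoformat(row["date"]).month
            by_month[month].append(price)
    return {month: mean(prices) for month, prices in sorted(by_month.items())}


# Made-up rows for illustration only; they are not taken from the dataset.
listings = [
    {"property_type": "Apartment"},
    {"property_type": "House"},
    {"property_type": "Apartment"},
]
calendar = [
    {"date": "2016-07-01", "price": "$200.00"},
    {"date": "2016-07-02", "price": "$1,000.00"},
    {"date": "2016-12-24", "price": ""},
    {"date": "2016-12-25", "price": "$150.00"},
]

top_type = most_common_property_type(listings)
monthly = average_price_by_month(calendar)
print(top_type)
print(monthly)
```

Running the same two functions once per city would give a side-by-side comparison; months whose average stands well above the rest would be the candidate price spikes.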
To answer these questions, the CRISP-DM process is followed.
- One notebook file `Aribnb_Data_Science.ipynb` to answer the four questions.
- Two folders: the first holds the Boston dataset and the second holds the Seattle dataset. Each folder contains three CSV files:
  - `Listings.csv` includes full descriptions and the average review score.
  - `Calendar.csv` includes the listing id and the price and availability for each day.
  - `Reviews.csv` includes a unique id for each reviewer and detailed comments.
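The layout above could be loaded with a small helper like the one below. This is a minimal sketch, not the notebook's own loading code: `load_city` is a hypothetical helper, the folder path is a placeholder, and the file names are taken as written in this README (on a case-sensitive file system they must match the files on disk exactly).

```python
import csv
from pathlib import Path

# File names as listed in this README; adjust if the files on disk differ.
FILES = ("Listings.csv", "Calendar.csv", "Reviews.csv")


def load_city(folder):
    """Read the three CSV files of one city folder into lists of dict rows."""
    tables = {}
    for name in FILES:
        with open(Path(folder) / name, newline="", encoding="utf-8") as handle:
            # Keys become 'listings', 'calendar', and 'reviews'.
            tables[Path(name).stem.lower()] = list(csv.DictReader(handle))
    return tables


# Example call (placeholder path, replace with the real folder):
# boston = load_city("path/to/boston_folder")
```

Each table is a list of dictionaries keyed by the CSV header, so every value is a string until it is converted. In a pandas-based notebook, `pandas.read_csv` would be the more usual choice.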
The main findings of the analysis are available here.
The Airbnb datasets are available on the Kaggle website: Boston Airbnb and Seattle Airbnb.