- Created a tool that estimates car prices (MAE ≈ ₦1.8M) to help buyers and sellers judge a fair price.
- Scraped over 2,000 car descriptions from cars45 and autochek.africa using Python and Playwright.
- Performed exploratory data analysis to gather insights on car prices, trends, and popularity in the region.
- Tuned Ridge, Decision Tree, and Random Forest regressors with GridSearchCV to select the best model.
- Created an external Tableau dashboard for visualization.
- Built a client-facing web application using Flask.
Because the data was scraped from multiple sources, extensive cleaning and preparation were required, including the following:
- Removed brackets and commas from all features that used them.
- Parsed price strings into numeric values.
- Standardized color names, mapping multiple color labels to a single name.
- Parsed, formatted, and standardized Brand and Model so they match across all sources.
- Removed several outliers and standardized engine capacity to milliliters, among other steps.
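
The cleaning steps above can be sketched with a few small helpers. This is only an illustration using the standard library: the function names, the input formats (`"₦ 3,500,000"`, `"2.0L"`, `"1800 cc"`), and the color alias table are assumptions, not the project's actual code or data.

```python
import re

# Hypothetical alias table: several scraped labels map to one standard color.
COLOR_ALIASES = {"navy blue": "blue", "dark blue": "blue", "grey": "gray"}


def strip_punctuation(raw: str) -> str:
    """Remove brackets and commas from a scraped field."""
    return re.sub(r"[\[\]\(\),]", "", raw).strip()


def parse_price(raw: str) -> int:
    """Turn a price string such as '₦ 3,500,000' into an integer amount."""
    cleaned = re.sub(r"[^\d.]", "", raw)  # keep digits and a decimal point
    return int(float(cleaned))            # raises ValueError if nothing is left


def normalize_color(raw: str) -> str:
    """Lowercase a color label and collapse known aliases into one name."""
    key = raw.strip().lower()
    return COLOR_ALIASES.get(key, key)


def engine_to_ml(raw: str) -> int:
    """Convert an engine capacity in liters, cc, or ml to milliliters."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(l|cc|ml)\s*", raw.lower())
    if match is None:
        raise ValueError(f"unrecognized engine capacity: {raw!r}")
    value, unit = float(match.group(1)), match.group(2)
    # 1 liter = 1000 ml; cc and ml are the same volume.
    return round(value * 1000) if unit == "l" else round(value)
```

Raising on unrecognized input, rather than guessing, makes malformed scraped rows easy to spot and review.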
The exploratory analysis investigated the factors and trends affecting car prices in the region, along with questions such as which brands and models are most popular.
Because of the large number of car brands and models, target mean encoding was used for those two features; the remaining categorical features were ordinal or one-hot encoded as appropriate.
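
As a rough illustration of target mean encoding, each category is replaced by the mean target value observed for it. The sketch below is a minimal version without smoothing; the function names and the fallback to the global mean for unseen categories are assumptions, not necessarily what the project did.

```python
from collections import defaultdict


def fit_target_means(categories, targets):
    """Compute the mean target per category, plus the overall mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    means = {category: sums[category] / counts[category] for category in sums}
    global_mean = sum(targets) / len(targets)
    return means, global_mean


def encode(categories, means, global_mean):
    """Replace each category with its mean; unseen ones get the global mean."""
    return [means.get(category, global_mean) for category in categories]


# Toy data in millions of naira, invented purely for illustration.
train_brands = ["Toyota", "Toyota", "Honda"]
train_prices = [4.0, 6.0, 3.0]
means, global_mean = fit_target_means(train_brands, train_prices)
encoded = encode(["Toyota", "Kia"], means, global_mean)
```

The means should be fitted on the training split only and then reused on validation and test data, so that target information does not leak into evaluation.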
Three different models were created and evaluated using Mean Absolute Error.
The Random Forest model far outperformed the other approaches on the test and validation sets.
- Random Forest: MAE = ₦1.8M
- Decision Tree: MAE = ₦2.0M
- Ridge Regression: MAE = ₦2.2M
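
A minimal sketch of tuning one of these models with scikit-learn's `GridSearchCV` and scoring it by Mean Absolute Error is shown below. The data is synthetic and the parameter grid is hypothetical; neither reflects the project's real features, grid, or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the cleaned, encoded car data.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(120, 3))
y = 5.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.1, size=120)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical grid; the real search space is not given in this document.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",  # scikit-learn maximizes, hence negated MAE
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the refitted best estimator on the held-out test split.
test_mae = mean_absolute_error(y_test, search.predict(X_test))
print(search.best_params_, round(test_mae, 3))
```

The same pattern applies to `Ridge` and `DecisionTreeRegressor` by swapping the estimator and its grid, which keeps the comparison between models on a single metric.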
This step involved creating a Flask web application and API hosted on a local web server. Users enter and submit details of the vehicle they want priced, and the application returns the estimated price.
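
A prediction endpoint of this kind might look roughly like the sketch below. The route name, the required fields, and the `estimate_price` placeholder are all hypothetical; in the real application the placeholder would be replaced by a call to the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical input fields; the real form fields are not listed here.
REQUIRED_FIELDS = ("brand", "model", "year", "mileage_km")
REFERENCE_YEAR = 2024  # assumption used only by the placeholder below


def estimate_price(features: dict) -> float:
    """Placeholder standing in for the trained model's prediction."""
    age = max(0, REFERENCE_YEAR - int(features["year"]))
    return max(500_000.0, 10_000_000.0 - 400_000.0 * age)


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({"error": "expected a JSON object"}), 400
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    try:
        price = estimate_price(payload)
    except (TypeError, ValueError):
        return jsonify({"error": "year must be a number"}), 400
    return jsonify({"estimated_price_ngn": price})
```

Assuming the file is saved as `app.py`, a recent Flask version can serve it locally with `flask --app app run`, after which a JSON `POST` to `/predict` returns the estimate.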