--- This repository is coded in Python 3.6 using Jupyter Notebooks ---
Customer Churn is used to describe subscribers to a service who decide to discontinue their service for a certain time frame. Churn prediction consists of detecting which customers are likely to cancel a subscription to a service based on how they use the service.
Businesses often have to invest substantial amounts attracting new clients, so every time a client leaves it represents a significant investment lost. Both time and effort then need to be channelled into replacing them. Being able to predict when a client is likely to leave and offer them incentives to stay can offer huge savings to a business.
The data consisted of 7 numerical variables, 2 categorical, 2 date and 1 boolean.
Distribution of means of all numerical variables indicated data needed to be scaled before modelling.
Android users experiencing higher surge than iPhone users
Correlation between features
-
- Logistic Regression
-
- Decision Tree
-
- Random Forest
-
- Gradient Boosting
-
- AdaBoost
- Number of estimators
- Learning Rate
- Maximum depth
The curve is obtained using the Gradient Boosted model.
- This chart shows that picking a low threshold works in the best interest as 63% of customers are churning (previously observed!). If we give all the customers a retention incentive, it would cost us around USD 200,000. However, with this model and the threshold of 0.26, the overall cost can be minimized at USD 183,720. The graph also shows, how not taking any action would be disastrous as the costs are sky-rocketing at USD 438,130.
Not taking any action on the customers, would cost the company a lot of money- around USD 438,000. With usage of the model and analysis, company can reach out to the churning customers with incentives and minimize the cost to USD 183,000.
- Also, Android users are expriencing a higher surge charge compared to iPhone users - and that is causing them to churn.
- Customers in Astapor are churning more and King's Landing are less likely to churn.
- Also, company should promote usage of luxury_car in the first 30 days as it is promoting retention.
- Finally, riding during the week and ratings provided by the driver are important factors to look for to identify churning customers.
- Some customers who receive retention incentives will still churn. Including a probability of churning despite receiving an incentive in our cost function would provide a better ROI on our retention programs.
- Actual training data and monetary cost assignments could be more complex.
- Multiple models for each type of churn could be needed.