Spark_Classification

The company is facing high churn: too many customers are canceling the service. To address this, the marketing team wants to offer promotions to customers who are likely to cancel, in the hope of retaining them. In a meeting with the marketing team, the idea came up of building a Machine Learning model that classifies each customer as likely to cancel or not. The model is trained on the company's historical data, which relates customer characteristics and contract details to past cancellations. PySpark is the recommended tool for this problem because the marketing team holds a large volume of customer data, and PySpark can distribute the processing of that data, which makes the analysis feasible at this scale.