Skip to content

Latest commit

 

History

History
18 lines (10 loc) · 717 Bytes

README.md

File metadata and controls

18 lines (10 loc) · 717 Bytes

This analysis focuses on finding the best predictive model for classifying clients who might subscribe to a term deposit. Models included are Logistic Regression, Random Forest and Neural Networks.

Data Notes:

  • clean_data.csv = data in the initial format
  • selected_data = data after 2009-04-01 ( more stable ratio of subscribers)
  • clean_data = data with capped outliers

NOTE: some categorical features were adjusted in the feature_adjustments script

-validation_set = most recent data, after 2010-09-01, used for model validation -training_set = 70% of the remaining data, before sampling -test_set = 30% of the remaining data, used for testing the models

-balanced_training_set = training set after SMOTE