This project focuses on data visualization and feature selection. For data visualization I used violin plots and swarm plots, and for feature selection I used a heatmap to find the correlations between the features. I also tried 3 different feature selection processes, each followed by random forest classification:
- Feature selection with correlation and random forest classification
- RFECV and random forest classification
- Tree-based feature selection and random forest classification

I got the best result with the first approach, feature selection with correlation and random forest classification, which reached an accuracy of 96.49%.
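
The three approaches listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the dataset (`load_breast_cancer` from scikit-learn), the 0.9 correlation threshold, the 70/30 split, the choice of the top 10 features by importance, and the helper `evaluate` are all assumptions made for the example, so the accuracies it prints will not necessarily match the figure reported above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: a stand-in dataset; the project's own data is not shown here.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)


def evaluate(columns):
    """Hypothetical helper: fit a random forest on the given columns
    and return its accuracy on the held-out test set."""
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train[columns], y_train)
    return accuracy_score(y_test, clf.predict(X_test[columns]))


# 1. Correlation-based selection: from each pair of features whose absolute
#    correlation exceeds the (assumed) threshold, drop the later one.
#    The same `corr` matrix is what a heatmap would display.
corr = X_train.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
corr_cols = [c for c in X.columns if not (upper[c] > 0.9).any()]

# 2. RFECV: recursive feature elimination, with cross-validation choosing
#    how many features to keep.
rfecv = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=42),
    step=1,
    cv=5,
    scoring="accuracy",
)
rfecv.fit(X_train, y_train)
rfecv_cols = list(X.columns[rfecv.support_])

# 3. Tree-based selection: rank features by random forest importance and
#    keep the top 10 (the cutoff is an assumption).
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
top = np.argsort(forest.feature_importances_)[::-1][:10]
tree_cols = list(X.columns[top])

results = {
    "correlation": evaluate(corr_cols),
    "rfecv": evaluate(rfecv_cols),
    "tree_based": evaluate(tree_cols),
}
for name, acc in results.items():
    print(f"{name}: {acc:.4f}")
```

Selection is fitted on the training split only, so the test accuracy is not influenced by the features chosen. The exact numbers depend on the split and the random seed.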