Evaluation of Logistic Regression, Random Forest, and Support Vector Machine Models for Predicting Stroke Risk

This repository evaluates three machine learning models - Logistic Regression, Random Forest, and Support Vector Machine (SVM) - for predicting stroke risk. The project is implemented in Python and uses several libraries for data pre-processing and performance evaluation.

Project Overview

The objective of this project is to compare the performance of Logistic Regression, Random Forest, and SVM models in predicting stroke risk. The dataset used in this study underwent extensive data pre-processing, including handling missing values, variable conversion, and data scaling. Additionally, SMOTE (Synthetic Minority Over-sampling Technique) was employed to address the imbalanced nature of the dataset.
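The pre-processing steps named above (missing values, variable conversion, scaling) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the tiny DataFrame and its column names (`age`, `bmi`, `gender`, `stroke`) are hypothetical, and median imputation is only one possible way to handle missing values.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical sample data; column names are illustrative only and are
# not taken from the repository's dataset.
df = pd.DataFrame({
    "age": [67.0, 45.0, 80.0, 33.0, 52.0],
    "bmi": [36.6, np.nan, 32.5, 24.1, np.nan],
    "gender": ["Male", "Female", "Female", "Male", "Female"],
    "stroke": [1, 0, 1, 0, 0],
})

# 1. Missing values: fill numeric gaps with the column median (an assumption;
#    the notebook may use a different strategy).
df["bmi"] = df["bmi"].fillna(df["bmi"].median())

# 2. Variable conversion: encode the categorical column as integers.
encoder = LabelEncoder()
df["gender"] = encoder.fit_transform(df["gender"])

# 3. Scaling: standardize numeric features to zero mean and unit variance.
numeric_cols = ["age", "bmi"]
scaler = StandardScaler()
df[numeric_cols] = scaler.fit_transform(df[numeric_cols])

# Separate features from the target column.
X = df.drop(columns="stroke")
y = df["stroke"]
print(X.head())
```

In a real pipeline it is usually safer to fit the scaler on the training split only and then apply it to the test split, so that no information from the test data leaks into training.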

Library Dependencies

The following Python libraries were utilized in this project:

numpy for numerical operations.
pandas for data manipulation and analysis.
seaborn and matplotlib.pyplot for data visualization.
sklearn.preprocessing for label encoding and data scaling.
imblearn.over_sampling for applying SMOTE.
sklearn.ensemble.RandomForestClassifier for implementing the Random Forest model.
sklearn.linear_model.LogisticRegression for implementing the Logistic Regression model.
sklearn.svm.SVC for implementing the Support Vector Machine (SVM) model.
sklearn.model_selection for train-test splitting and cross-validation.
sklearn.metrics for performance evaluation, including accuracy, confusion matrix, and classification report.

Please ensure that these libraries are installed in your Python environment before running the project.
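One quick way to check the environment is a short script like the one below. It is only a convenience sketch: the helper name `missing_packages` is made up for this example. Note that the import names differ from the PyPI package names in two cases: `sklearn` is installed as `scikit-learn`, and `imblearn` as `imbalanced-learn`.

```python
import importlib.util

# Import names of the libraries listed above.
REQUIRED = ["numpy", "pandas", "seaborn", "matplotlib", "sklearn", "imblearn"]


def missing_packages(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```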

Dataset

The dataset used for this project contains relevant features for predicting stroke risk. The specific dataset used is included in the repository, in the Dataset folder.

Implementation

The project code is in the Jupyter Notebook stroke_prediction.ipynb. The implementation covers data pre-processing, model training and evaluation, and performance metric calculations. The models were evaluated using accuracy, the confusion matrix, and the classification report.
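The training and evaluation loop described above might look roughly like the following. This is a sketch under stated assumptions rather than the notebook's code: it uses synthetic data from `make_classification`, and the hyperparameters (`max_iter=1000`, `n_estimators=100`, an RBF kernel, 5-fold cross-validation) are illustrative choices, not values confirmed by the source.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the pre-processed stroke data.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three models compared in this project; settings are assumptions.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validated accuracy on the training split.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")

    # Fit on the full training split and evaluate on the held-out test split.
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    results[name] = {
        "cv_accuracy": cv_scores.mean(),
        "test_accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

    print(f"== {name} ==")
    print(f"CV accuracy:   {results[name]['cv_accuracy']:.3f}")
    print(f"Test accuracy: {results[name]['test_accuracy']:.3f}")
    print(results[name]["confusion_matrix"])
    print(classification_report(y_test, y_pred))
```

On an imbalanced dataset, accuracy alone can look high even when few positive cases are detected, so the per-class recall in the classification report is worth reading alongside it.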

Contributions

Contributions to this project are welcome. If you would like to contribute, please follow the standard GitHub workflow of creating a fork, making changes in a branch, and submitting a pull request. Be sure to include a detailed description of the changes and any relevant documentation updates.

License

This project is licensed under the MIT License. You are free to modify, distribute, and use the code and resources in this repository according to the terms of the license.

Contact

For any questions or inquiries related to this project, please contact the project owner:

Name: [Fatai Azeez]
Email: [fatai.azeez28@gmail.com]

FataiAzeez/stroke_prediction_rf_svm_lr
