Data Analysis Project - Diabetes Classifier

Members

To view the project live, please click on this link here.

Description

Our project aims to identify and understand the factors that cause diabetes (our dependent variable) and to develop a sound method for predicting whether a person has diabetes based on the parameters and factors given to us. We started with data collection from a given dataset, which contains information about potential factors including age, glucose_concentration and blood_pressure.

We then pre-processed and cleaned the data to deal with outliers, so that they would not distort our analysis. Following that, we used exploratory data analysis (EDA) techniques to look for possible correlations between the factors and the diabetes classification. We did so by performing a univariate analysis of each factor. We then performed feature selection, keeping only the factors that were highly correlated with the diabetes classification.
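The cleaning and feature selection steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the IQR clipping rule, the 0.2 correlation threshold, the target column name `diabetes` and the synthetic data are all assumptions made for the example. Only the three feature names come from the dataset description.

```python
import numpy as np
import pandas as pd


def clip_outliers_iqr(df, columns, k=1.5):
    """Clip each column to [Q1 - k*IQR, Q3 + k*IQR] (one common outlier rule)."""
    out = df.copy()
    for col in columns:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return out


def select_by_correlation(df, target, threshold=0.2):
    """Return features whose absolute Pearson correlation with the target meets the threshold."""
    corr = df.corr()[target].drop(target).abs()
    return corr[corr >= threshold].sort_values(ascending=False).index.tolist()


# Synthetic stand-in data (assumption); the real project would load train.csv instead.
rng = np.random.default_rng(0)
n = 500
glucose = rng.normal(120, 30, n)
demo = pd.DataFrame({
    "age": rng.integers(21, 80, n).astype(float),
    "glucose_concentration": glucose,
    "blood_pressure": rng.normal(70, 12, n),
    # Hypothetical target column, driven mostly by glucose in this toy data.
    "diabetes": (glucose + rng.normal(0, 20, n) > 130).astype(int),
})
demo.loc[0, "blood_pressure"] = 400.0  # inject an obvious outlier

features = ["age", "glucose_concentration", "blood_pressure"]
cleaned = clip_outliers_iqr(demo, features)
selected = select_by_correlation(cleaned, target="diabetes")
print("Selected features:", selected)
```

Clipping keeps every row while limiting the influence of extreme values; dropping outlier rows instead is an equally reasonable choice, depending on how much data is available.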

Finally, we tested these factors by building three models: a Logistic Regression model, a K-Nearest Neighbours model and a Random Forest Classifier model. We tested the models against the sample data provided and used the Accuracy and F1-score metrics to evaluate them.
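A comparison of the three model types might look something like the sketch below. It uses scikit-learn with synthetic data from `make_classification` and a held-out validation split. The hyperparameters, the scaling step and the split are assumptions for illustration, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data standing in for the selected features (assumption).
X, y = make_classification(n_samples=600, n_features=6, n_informative=4, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling is added for the two models that are sensitive to feature scale.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "K-Nearest Neighbours": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_val)
    scores[name] = {
        "accuracy": accuracy_score(y_val, pred),
        "f1": f1_score(y_val, pred),
    }
    print(f"{name}: accuracy={scores[name]['accuracy']:.3f}, f1={scores[name]['f1']:.3f}")
```

Reporting F1 alongside accuracy is useful when the classes are imbalanced, since accuracy alone can look good for a model that mostly predicts the majority class.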

Files Used

Usage

Clone the repo
git clone ...
Open Diabetes Predictor.ipynb in your local Jupyter Notebook server. Note that the following Python packages need to be installed beforehand (sklearn is the import name; the package is published on PyPI as scikit-learn):

numpy
pandas
matplotlib
seaborn
sklearn
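One quick way to confirm that these packages are available before opening the notebook is a small check like the one below. This is a convenience sketch and not part of the repository; the helper name `missing_packages` is hypothetical.

```python
import importlib.util

# Import names of the packages listed above.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn"]


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```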

Original dataset can be found here.

Repository contents

Diabetes Predictor.ipynb
README.md
diabetes_predictor.py
requirements.txt
submission.csv
test.csv
train.csv
