Drug Classification with Machine Learning

This project tackles a drug classification problem with two supervised learning models: a decision tree and a k-nearest neighbors classifier. It covers the following libraries and tools:

Pandas: Used for data manipulation and analysis.
DecisionTreeClassifier: A supervised learning algorithm for classification tasks.
KNeighborsClassifier: A supervised learning algorithm that classifies a sample by majority vote among its k nearest neighbors.
OneHotEncoder: Transforms each categorical variable into binary indicator columns, one per category.
matplotlib.pyplot: A plotting interface for creating visualizations.
ColumnTransformer: Applies different preprocessing steps to different columns.
train_test_split: A function that splits the dataset into training and testing sets.
sklearn.tree: The scikit-learn module for decision tree models.
sklearn.model_selection: The scikit-learn module for splitting data and validating models; it provides train_test_split.
sklearn.preprocessing: Tools for preprocessing data (e.g., normalization, encoding).
sklearn.compose: The scikit-learn module that provides ColumnTransformer for column-wise transformations.
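
As a rough illustration of what "binary form" means for OneHotEncoder, the sketch below encodes a small column of blood pressure labels. The label values are invented for the example and are not taken from the dataset.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical blood pressure labels, invented for illustration.
bp = np.array([["HIGH"], ["LOW"], ["NORMAL"], ["LOW"]])

encoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default; convert it to a dense array.
encoded = encoder.fit_transform(bp).toarray()

print(encoder.categories_[0])  # the categories the encoder learned, in sorted order
print(encoded)                 # one row per sample, one column per category
```

Each row contains a single 1 in the column that matches its category and 0 elsewhere.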

Dataset

We use a drug classification dataset which contains information about patients and drugs prescribed to them. The dataset can be found at the following link:

Drug Classification Dataset

The attributes include age, sex, blood pressure level, and cholesterol level; the prescribed drug is the target label.
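
A minimal sketch of exploring such a table with Pandas follows. The column names and the rows are assumptions made up for illustration; the real dataset's column names and values may differ, and in practice the frame would be loaded from the dataset file rather than built by hand.

```python
import pandas as pd

# Hypothetical column names and rows, invented for illustration only.
df = pd.DataFrame({
    "Age": [23, 47, 61, 35],
    "Sex": ["F", "M", "F", "M"],
    "BP": ["HIGH", "LOW", "NORMAL", "HIGH"],
    "Cholesterol": ["HIGH", "HIGH", "NORMAL", "NORMAL"],
    "Drug": ["drugA", "drugB", "drugA", "drugB"],
})

print(df.dtypes)                  # which columns are numeric, which are categorical
print(df.isna().sum())            # missing values per column
print(df["Drug"].value_counts())  # class balance of the target

# Separate the features from the target label.
X = df.drop(columns="Drug")
y = df["Drug"]
```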

Notebook

The notebook that implements this project is available on Kaggle. It demonstrates how to load and preprocess the data and how to build machine learning models that classify drugs.

Drug Classification Notebook

Methodology

The process for this project involves the following steps:

Data Preprocessing:
- Using Pandas to explore and clean the dataset.
- Encoding categorical variables with OneHotEncoder.
- Splitting the dataset into training and testing sets using train_test_split.
Model Building:
- Training decision tree and k-nearest neighbors classifiers with scikit-learn's DecisionTreeClassifier and KNeighborsClassifier.
- Using ColumnTransformer to apply separate preprocessing steps to numerical and categorical features.
Model Evaluation:
- Assessing the models' performance with accuracy and other metrics.
- Visualizing results using matplotlib.pyplot.
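
The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the notebook's actual code: the data is synthetic, the column names, the labeling rule, the hyperparameters (`test_size`, `random_state`, `n_neighbors`), and the output filename `model_accuracy.png` are all hypothetical choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the dataset (assumption: real columns and values may differ).
df = pd.DataFrame({
    "Age": [20 + 2 * i for i in range(24)],
    "Sex": ["F", "M"] * 12,
    "BP": ["HIGH", "LOW", "NORMAL"] * 8,
    "Cholesterol": ["HIGH", "HIGH", "NORMAL", "NORMAL"] * 6,
})
# Invented labeling rule so the example has a learnable target.
df["Drug"] = ["drugA" if bp == "HIGH" else "drugB" for bp in df["BP"]]

X = df.drop(columns="Drug")
y = df["Drug"]

# Split first, so the encoder is fitted on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One-hot encode the categorical columns; pass the numeric Age column through unchanged.
categorical = ["Sex", "BP", "Cholesterol"]
preprocess = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",
)
X_train_enc = preprocess.fit_transform(X_train)
X_test_enc = preprocess.transform(X_test)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=3),
}

# Fit each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train_enc, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test_enc))
    print(f"{name}: accuracy = {scores[name]:.2f}")

# Compare the two models in a bar chart.
fig, ax = plt.subplots()
ax.bar(list(scores.keys()), list(scores.values()))
ax.set_ylabel("Test accuracy")
ax.set_ylim(0, 1)
fig.savefig("model_accuracy.png")
```

Because k-nearest neighbors relies on distances, an unscaled numeric column such as Age can dominate the one-hot columns; scaling numeric features before fitting is a common remedy, though this sketch leaves it out for brevity.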

Libraries

To run this project, the following Python libraries are required:

pandas
scikit-learn
matplotlib
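
One way to confirm the environment is ready is to import each library and print its version, as in the short check below. Note that scikit-learn is installed under the name `scikit-learn` but imported as `sklearn`; no particular versions are assumed here.

```python
import matplotlib
import pandas as pd
import sklearn

# Collect the installed version of each required library.
versions = {
    "pandas": pd.__version__,
    "scikit-learn": sklearn.__version__,
    "matplotlib": matplotlib.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```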

Repository: AhmadSaad310/ML-Drug-Classification
