Fruit Classification using PCA and Various Classifiers #875

Merged
merged 3 commits into from
Nov 10, 2024
43 changes: 43 additions & 0 deletions Prediction Models/Friut_Classification_model/Readme.md
@@ -0,0 +1,43 @@
# Fruit Classification using PCA and Various Classifiers

## Project Description

This project classifies fruit images by first reducing dimensionality with Principal Component Analysis (PCA) and then applying several machine learning classifiers:

- Support Vector Machine (SVM)
- k-Nearest Neighbors (KNN)
- Decision Tree Classifier

Reducing the feature set with PCA before classification lowers computational cost while aiming to preserve classification accuracy.

### Key Technologies Used:
- **Python**: Programming language used for implementation.
- **PCA (Principal Component Analysis)**: Projects the data onto a smaller set of components that capture most of its variance, which reduces training and prediction cost.
- **SVM**: A classifier that finds the maximum-margin hyperplane separating the classes.
- **KNN**: A simple, intuitive classifier that assigns each data point the majority class among its k nearest neighbors.
- **Decision Tree**: A model that recursively splits the dataset on feature values to reach a class prediction.

### Libraries and Frameworks:
- **NumPy & Pandas**: For data manipulation and preprocessing.
- **Scikit-learn**: For implementing PCA, SVM, KNN, and Decision Tree classifiers.
- **Matplotlib & Seaborn**: For visualizing data and results.
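
The workflow described above (PCA followed by each classifier) can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual code: it uses scikit-learn's bundled digits images as a stand-in for the fruit dataset, and the component count, train/test split, and hyperparameters are assumptions chosen only for the example.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): 8x8 digit images flattened to 64 features,
# used here only because the fruit dataset is not bundled with scikit-learn.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative hyperparameters (assumptions, not the project's settings).
N_COMPONENTS = 20
classifiers = {
    "SVM": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, clf in classifiers.items():
    # Scale, project onto the leading principal components, then classify.
    # Fitting inside a pipeline keeps PCA from seeing the test split.
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=N_COMPONENTS, random_state=0),
        clf,
    )
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

Wrapping the scaler, PCA, and classifier in one pipeline means the projection is learned from the training split only, which avoids leaking test data into the dimensionality reduction step.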

## Problem Statement

The project aims to address the challenges associated with high-dimensional data in image classification, such as:

- **High Computational Cost**: Processing large image datasets can be computationally intensive.
- **Feature Redundancy**: High-dimensional data often includes redundant features that do not contribute to model accuracy.
- **Model Overfitting**: Increased complexity can lead to overfitting, where the model performs well on training data but poorly on unseen data.
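
One way to see feature redundancy in practice is to check how few principal components are needed to explain most of the variance. The sketch below is an illustration under assumptions: it again uses scikit-learn's digits images in place of the fruit dataset, and the 95% variance target is an arbitrary threshold picked for the example.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Stand-in data (assumption): flattened 8x8 digit images, 64 features each.
X, _ = load_digits(return_X_y=True)

# Fit PCA with all components so the full variance spectrum is available.
pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Hypothetical threshold: keep enough components for 95% of the variance.
TARGET = 0.95
# argmax returns the first index where the condition holds; +1 converts
# the zero-based index into a component count.
n_keep = int(np.argmax(cumulative >= TARGET)) + 1

print(f"{n_keep} of {X.shape[1]} components explain "
      f"{cumulative[n_keep - 1]:.1%} of the variance")
```

If `n_keep` is much smaller than the original feature count, many of the raw features carry overlapping information, which is the redundancy PCA is meant to remove.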

By applying PCA, this project retains the most informative components while reducing dimensionality, trading a small loss of information for lower computational cost. The classifiers are evaluated on accuracy, precision, recall, and F1 score to determine the most effective model for fruit image classification.
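
The four metrics named above can be computed with scikit-learn as sketched here. The labels are made-up toy values, not project results, and macro averaging is an assumption, since the README does not say how per-class scores are combined.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical labels for illustration only (not real project output).
y_true = ["apple", "apple", "banana", "banana", "orange", "orange"]
y_pred = ["apple", "banana", "banana", "banana", "orange", "apple"]

accuracy = accuracy_score(y_true, y_pred)

# Macro averaging (assumption): compute each metric per class, then take
# the unweighted mean so every fruit class counts equally.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Macro averaging treats all classes equally regardless of size; for an imbalanced fruit dataset, `average="weighted"` is a common alternative.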

## Project Structure

- `data/`: Contains the dataset used for training and testing.
- `src/`: Includes scripts for data preprocessing, feature extraction, and training models.
- `notebooks/`: Jupyter notebooks for interactive analysis and visualization.
- `README.md`: Project documentation.
- `results/`: Saved model performance metrics and plots.

