This project utilizes PCA, Hierarchical Clustering, K-Means, and KNN to cluster customers for targeted marketing. Kaggle dataset employed.

krishnavbajoria02/Clustering

Clustering

Customer Segmentation Clustering Project

Overview

This project focuses on customer segmentation, a crucial task in marketing and business analysis. We combine Principal Component Analysis (PCA) for dimensionality reduction, Hierarchical Clustering and K-Means for grouping, and K-Nearest Neighbors (KNN) for classification, with the aim of grouping customers who share similar characteristics and behaviors. The resulting clusters can inform targeted marketing and customer service.

Dataset

We used a Kaggle dataset (https://www.kaggle.com/dataset-link) containing customer-related data, such as demographics, purchase history, and behavioral attributes.

Techniques Used

  1. Principal Component Analysis (PCA): We applied PCA to reduce the dimensionality of the dataset while preserving as much variance as possible. This technique simplifies the data for subsequent clustering algorithms.

  2. Hierarchical Clustering: Hierarchical clustering is used to create a dendrogram that visualizes the hierarchical relationships between data points. This approach helps identify natural clusters within the data.

  3. K-Means Clustering: K-Means is a partitioning method that groups data points into K clusters based on similarity. We experimented with different values of K to find the optimal number of clusters.

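  4. K-Nearest Neighbors (KNN): KNN is employed for customer classification based on similarity to other customers. Because KNN is a supervised method, it needs customers that already carry a label, for example a segment assigned by one of the clustering steps. It can then place a new customer into a group and support recommendations.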
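
The four techniques above can be chained into a single pipeline. The following is a minimal sketch using scikit-learn and SciPy, not the notebook's actual code. It runs on synthetic data from make_blobs rather than the Kaggle dataset, and the 90% variance threshold, the Ward linkage, the silhouette criterion for choosing K, and training KNN on the K-Means labels are all illustrative assumptions.

```python
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer table (assumption: not the Kaggle data).
X, _ = make_blobs(n_samples=300, n_features=8, centers=4, random_state=0)

# 1. Standardize, then keep enough components to explain 90% of the variance.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.90, svd_solver="full")
X_pca = pca.fit_transform(X_scaled)

# 2. Ward linkage builds the merge tree that a dendrogram would draw;
#    fcluster cuts that tree into at most 4 flat clusters.
Z = linkage(X_pca, method="ward")
hier_labels = fcluster(Z, t=4, criterion="maxclust")

# 3. Try several values of K and keep the one with the highest silhouette score.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_pca)
    scores[k] = silhouette_score(X_pca, labels)
best_k = max(scores, key=scores.get)
kmeans = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X_pca)

# 4. Train KNN on the K-Means labels so unseen customers can be assigned a segment.
X_train, X_test, y_train, y_test = train_test_split(
    X_pca, kmeans.labels_, test_size=0.25, random_state=0
)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
knn_accuracy = knn.score(X_test, y_test)

print(f"components kept: {X_pca.shape[1]}, best K: {best_k}, KNN accuracy: {knn_accuracy:.2f}")
```

To draw the dendrogram itself, the linkage matrix Z can be passed to scipy.cluster.hierarchy.dendrogram, which requires matplotlib for plotting.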

Project Structure

The repository is organized as follows:

  • CC_GENERAL/: Contains the dataset used for customer segmentation.
  • Clustering_Project.ipynb: Jupyter notebook detailing the step-by-step process of data exploration, preprocessing, feature engineering, and clustering using PCA, Hierarchical Clustering, K-Means, and KNN.
