Build a Clustering Model to Perform a Customer Geolocation Data Clustering with K-Means Algorithm

Notes

Clone the project to your machine with:

git clone https://github.com/zakariamejdoul/customer_geolocation_data_clustering.git

Behaviour

What is Clustering?

Clustering is the task of dividing a population or set of data points into groups, so that points in the same group are more similar to each other than to points in other groups. In essence, it groups objects based on their similarity and dissimilarity.

What types of problems can clustering solve?

Clustering algorithms are an effective Machine Learning (ML) technique for unsupervised learning, that is, for unlabeled data. K-Means is one of the most popular clustering algorithms, and it is efficient enough to be applied to many ML problems.
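To make the idea concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. It is only an illustration, not the code used in the notebook: the function name `kmeans`, its parameters, and the sample points are assumptions made for this example. The algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def kmeans(points: Sequence[Point], k: int, iterations: int = 100,
           seed: int = 0) -> Tuple[List[Point], List[int]]:
    """Illustrative Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k data points picked at random.
    centroids = rng.sample(list(points), k)
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Two well-separated groups of made-up points.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(sample, k=2)
```

With this sample, the first three points end up in one cluster and the last three in the other. Which cluster gets index 0 depends on the random starting centroids.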

K-Means clustering has been applied in many different problem areas, such as:

Information Technology: used for spam filtering, classifying network traffic, and identifying fraudulent or criminal activity.
Marketing: used to discover and characterize customer segments for marketing purposes.
Biology: used to classify different species of plants and animals.
Insurance: used to understand customers and their policies, and to identify fraud.

Clustering for geolocation data

We run a clustering algorithm on our customer geolocation data to obtain several clusters whose members are close to one another. We use both KMeans and Constrained KMeans; the latter has a parameter that restricts the number of members in each cluster. We assume each cluster contains the parcels that one driver should deliver, so each driver only needs to travel within a compact, nearby area.
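The effect of a size limit can be sketched with a simple greedy assignment. This is a simplified stand-in and not the Constrained KMeans implementation used in the notebook: the function `assign_with_capacity`, the `size_max` parameter name, and the sample data are all assumptions for illustration. It walks through (point, centroid) pairs from closest to farthest and skips any cluster that is already full.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def assign_with_capacity(points: Sequence[Point],
                         centroids: Sequence[Point],
                         size_max: int) -> List[int]:
    """Greedy sketch: closest pairs first, never exceeding size_max per cluster."""
    if size_max * len(centroids) < len(points):
        raise ValueError("size_max is too small to place every point")
    # All (distance, point index, centroid index) triples, closest first.
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centroids)
    )
    labels = [-1] * len(points)      # -1 means "not assigned yet"
    counts = [0] * len(centroids)    # current size of each cluster
    for _, i, j in pairs:
        if labels[i] == -1 and counts[j] < size_max:
            labels[i] = j
            counts[j] += 1
    return labels


# Three parcels near the first centroid, one near the second (made-up data).
parcels = [(0, 0), (0, 1), (1, 0), (10, 10)]
depots = [(0, 0), (10, 10)]
capped = assign_with_capacity(parcels, depots, size_max=2)
```

Without the limit, the first cluster would take three parcels. With `size_max=2`, the third parcel is pushed to the second cluster, so each driver gets two. A full constrained K-Means would alternate an assignment step like this with the usual centroid update, and would typically solve the assignment optimally rather than greedily.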

This picture shows the processing flow for geolocation data. Since we have our customers' addresses, we first need to convert them into latitude and longitude. Doing this with the GeoPandas API takes a few steps, which are explained in the Geocoding section of the notebook.
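As a rough idea of what the geocoding step can look like, the sketch below uses `geopandas.tools.geocode`. The notebook may do this differently: the choice of the Nominatim provider, the `user_agent` value, and the helper names are assumptions for this example. The call needs network access and the `geopy` package, so it is wrapped in a function and not run here. Handling of addresses that fail to geocode is left out.

```python
from typing import Iterable, List, Tuple


def points_to_lat_lon(points: Iterable) -> List[Tuple[float, float]]:
    """Turn point geometries (x = longitude, y = latitude) into (lat, lon) pairs."""
    return [(p.y, p.x) for p in points]


def geocode_addresses(addresses: List[str]) -> List[Tuple[float, float]]:
    """Geocode address strings to (latitude, longitude) pairs."""
    # Imported here so the helper above works without GeoPandas installed.
    from geopandas.tools import geocode

    # Nominatim asks callers to identify themselves; this value is a placeholder.
    located = geocode(addresses, provider="nominatim",
                      user_agent="customer-clustering-demo")
    # The result is a GeoDataFrame whose geometry column holds one point per address.
    return points_to_lat_lon(located.geometry)
```

Calling `geocode_addresses(["<customer address>"])` would then return one `(latitude, longitude)` pair per address, ready to be passed to the clustering step.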

Steps

The project notebook is divided into the following parts:

Geocoding: From Address to Longitude & Latitude
Import Geolocation Data
K-Means Model & Training
Clustering with Constrained Problem
Visualization of the Result

Results

The results are shown below:

Displaying KMeans Clustering Results:
Displaying Constrained KMeans Clustering Results:
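A scatter plot of clustered coordinates can be produced along the following lines. This is a sketch assuming matplotlib is available; the notebook may use a different plotting or mapping library, and the function names, sample coordinates, and output path are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

LatLon = Tuple[float, float]


def group_by_cluster(coords: Sequence[LatLon],
                     labels: Sequence[int]) -> Dict[int, List[LatLon]]:
    """Collect the (lat, lon) pairs that belong to each cluster label."""
    groups: Dict[int, List[LatLon]] = defaultdict(list)
    for coord, label in zip(coords, labels):
        groups[label].append(coord)
    return dict(groups)


def plot_clusters(coords: Sequence[LatLon], labels: Sequence[int],
                  path: str = "clusters.png") -> None:
    """Save a scatter plot with one color per cluster."""
    # Imported here so group_by_cluster works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, members in sorted(group_by_cluster(coords, labels).items()):
        lats, lons = zip(*members)
        # Longitude on the x-axis, latitude on the y-axis, as on a map.
        ax.scatter(lons, lats, label=f"cluster {label}")
    ax.set_xlabel("longitude")
    ax.set_ylabel("latitude")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Made-up coordinates and labels, only to show the expected input shape.
demo_coords = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
demo_groups = group_by_cluster(demo_coords, [0, 1, 0])
```

Running `plot_clusters` once with the KMeans labels and once with the Constrained KMeans labels gives two images that can be compared side by side.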

Resources

Author

Zakaria Mejdoul

Enjoy Clustering and Visualizing your Customers Geolocation Data ❗ 🚀
