kMeans

k-means algorithm from scratch

I have implemented a function that computes the k-means clustering of a dataframe of 'm' observations with 'n' attributes.

The function has the signature:

    cluster_kmeans(df, k)

where

   df is a Pandas dataframe of m observations with n attributes, i.e. m rows and n columns.
   The index holds the label for each observation and is not counted as an attribute.
   
   k is the number of clusters to find
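
To make the expected input shape concrete, here is a small illustrative dataframe. The column names, index labels, and values are invented for this example and do not come from the notebook.

```python
import pandas as pd

# Hypothetical data: 6 observations (rows) with 2 attributes (columns).
# The index carries each observation's label and is not an attribute.
df = pd.DataFrame(
    {
        "x": [1.0, 1.5, 1.2, 8.0, 8.5, 9.0],
        "y": [1.0, 2.0, 1.5, 8.0, 9.0, 8.5],
    },
    index=["a", "b", "c", "d", "e", "f"],
)

m, n = df.shape  # m observations, n attributes
k = 2            # number of clusters to find
```

Here `m` is 6 and `n` is 2; the labels `a` to `f` live in the index and are left out of the distance calculations.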

Output

The function returns a new dataframe with a single column, the cluster label for each observation, together with the final Sum of Squared Errors (SSE) of the clustering.

Euclidean distance is the proximity measure, and the Sum of Squared Errors is the objective function being minimized.
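
The description above can be sketched roughly as follows. This is not the notebook's code, only a minimal version of the standard Lloyd iteration under a few assumptions: centroids are initialised from k randomly chosen observations, iteration stops when the assignments no longer change, and the `max_iter` and `seed` parameters and the `cluster` column name are additions made for this illustration.

```python
import numpy as np
import pandas as pd


def cluster_kmeans(df, k, max_iter=100, seed=0):
    """Sketch of k-means; max_iter and seed are illustrative extras."""
    X = df.to_numpy(dtype=float)
    rng = np.random.default_rng(seed)

    # Assumption: start from k distinct observations picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = None

    for _ in range(max_iter):
        # Squared Euclidean distance from every observation to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)

        # Stop once no observation changes cluster.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels

        # Move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)

    # SSE: squared distance of each observation to its own centroid, summed.
    sse = float(((X - centroids[labels]) ** 2).sum())

    # One column of cluster labels, keeping the original observation index.
    return pd.DataFrame({"cluster": labels}, index=df.index), sse
```

Calling `labels, sse = cluster_kmeans(df, 2)` then yields the label dataframe and the final SSE as a pair. Because the starting centroids are random, a different `seed` can number the clusters differently and, on harder data, can settle in a different local minimum.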

lsrinidhi17/kMeans
