
Flow-Glow/K-means

K Means Algorithm

What is K Means

K-means is an iterative algorithm that partitions a dataset, according to its features, into K predefined, non-overlapping clusters (subgroups). It tries to make the data points within each cluster as similar as possible while keeping the clusters as far apart as possible. Data points are assigned to clusters so that the sum of squared distances between each point and its cluster's centroid is minimized, where a centroid is the arithmetic mean of the data points in that cluster. The less variation there is within a cluster, the more homogeneous its data points are.

Sources:

  1. Nvidia
  2. Wikipedia
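The sum-of-squared-distances criterion described in this section can be written compactly. The symbols are assumptions introduced here for illustration and do not appear in the original text: $x_i$ is a data point, $C_k$ is the set of points assigned to cluster $k$, $\mu_k$ is that cluster's centroid, and $J$ is the quantity the algorithm tries to make small.

```latex
J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{\lvert C_k \rvert} \sum_{x_i \in C_k} x_i
```

A small $J$ means the points in each cluster lie close to their centroid, which is the "less variation" mentioned above.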

How K Means works

  1. Specify the number of clusters K.
  2. Initialize the centroids by shuffling the dataset and then randomly selecting K data points, without replacement, as the centroids.
  3. Compute the Euclidean distance from each data point to every centroid.
  4. Assign each data point to the cluster with the closest centroid.
  5. Recompute each centroid as the average of all the data points that belong to its cluster.
  6. Repeat steps 3 to 5 until the centroids no longer change, i.e. the assignment of data points to clusters stops changing.
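The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in this repository: the function name `kmeans`, its parameters `max_iter` and `seed`, and the choice to leave an empty cluster's centroid unchanged are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Hypothetical helper for illustration; returns (centroids, labels).
    """
    rng = random.Random(seed)
    # Initialization: pick k distinct data points as the starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment: each point goes to the centroid with the smallest
        # Euclidean distance (ties go to the lower cluster index).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Convergence: stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Update: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centroids, labels = kmeans(data, k=2)
    print(centroids, labels)
```

Because the starting centroids are chosen at random, different seeds can lead to different cluster numbering and, on harder data, to different final clusters. Running the algorithm several times and keeping the result with the lowest total squared distance is a common precaution.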

Flow Chart

[Figure: K-means flow chart]


K Means in action

2D:

[Figure: K-means clustering on 2D data]

3D:

[Figure: K-means clustering on 3D data]

About

K-means clustering is a method used for clustering analysis, especially in data mining and statistics.
