DolbyUUU/clustering_algorithm_implementation_python

Clustering algorithm implementation and visualization from scratch with Python

Four popular clustering algorithms (for d >= 2 dimensions, k >= 2 clusters):

  • (1) k-means clustering
  • (2) Gaussian mixture model - expectation maximization algorithm (EM-GMM)
  • (3) mean-shift clustering
  • (4) agglomerative clustering
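
As an illustration of the first algorithm in the list, here is a minimal sketch of k-means (Lloyd's algorithm) using only the standard library. It is not the code in KMeans.py; the function name `kmeans`, its parameters, and the random-sample initialisation are assumptions made for this example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm on a list of equal-length tuples.

    Returns (centroids, labels). Hypothetical helper, not the repository's API.
    """
    rng = random.Random(seed)
    # Initialise the centroids with k distinct data points (an assumed strategy).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels
```

Calling `kmeans(points, k=2)` on two well-separated groups of 2D points should put each group in its own cluster. The cluster indices themselves are arbitrary and depend on the initialisation.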

Python implementations:

  • KMeans.py: k-means clustering
  • KMeans_Ver0.py: second version of k-means implementation as a function
  • GaussianMM.py: EM-GMM
  • GaussianMM_Ver0.py: second version of EM-GMM implementation with functions of AIC, BIC and predict
  • MeanShift.py: mean-shift clustering
  • Agglomerative: agglomerative clustering
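
The AIC and BIC functions mentioned for GaussianMM_Ver0.py are not shown here, so the following is only a sketch of the standard definitions, AIC = 2p - 2 ln L and BIC = p ln n - 2 ln L. It assumes a mixture with full covariance matrices and takes the total log-likelihood as input. The function names are hypothetical.

```python
import math


def gmm_num_params(k, d):
    """Free parameters of a k-component, d-dimensional GMM (full covariances assumed)."""
    means = k * d
    covariances = k * d * (d + 1) // 2  # one symmetric d x d matrix per component
    weights = k - 1                     # mixing weights sum to 1
    return means + covariances + weights


def aic(log_likelihood, k, d):
    """Akaike information criterion: 2p - 2 ln L (lower is better)."""
    return 2 * gmm_num_params(k, d) - 2 * log_likelihood


def bic(log_likelihood, n, k, d):
    """Bayesian information criterion: p ln n - 2 ln L for n samples (lower is better)."""
    return gmm_num_params(k, d) * math.log(n) - 2 * log_likelihood
```

Both criteria penalise the number of parameters, so they can be used to compare fits with different k. BIC penalises extra components more heavily once n exceeds about 7 samples (ln n > 2).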

Evaluations and tests:

  • test_2d_visualization.py: tests on 2D datasets with visualization, compared with the scikit-learn implementations
  • data_2d_test folder: datasets for tests
  • test_2d_visualization_results folder: output images of tests
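
The tests above compare results visually. If a numeric check is also wanted, one option is a permutation-invariant score, since two implementations can number the same clusters differently. The sketch below computes the plain Rand index with the standard library. It is an assumed addition, not part of test_2d_visualization.py, and `rand_index` is a hypothetical name.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree.

    A pair agrees when both labelings put the two points in the same
    cluster, or both put them in different clusters. 1.0 means the
    partitions are identical up to renaming of the cluster indices.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one point: nothing to disagree about
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

This version checks every pair of points, so its cost grows quadratically with the number of points. That should be acceptable for small 2D test sets.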

Visualization Results

The following figures compare the clustering results of my own implementations with those of scikit-learn's implementations. Each dataset is clustered with all four algorithms.


Blobs Dataset

Algorithm | My Implementation | Scikit-learn
Agglomerative | Blobs - Agglomerative (My) | Blobs - Agglomerative (Sklearn)
EM-GMM | Blobs - EM-GMM (My) | Blobs - EM-GMM (Sklearn)
K-Means | Blobs - K-Means (My) | Blobs - K-Means (Sklearn)
Mean-Shift | Blobs - Mean-Shift (My) | Blobs - Mean-Shift (Sklearn)

Moons and Stars Dataset

Algorithm | My Implementation | Scikit-learn
Agglomerative | Moons and Stars - Agglomerative (My) | Moons and Stars - Agglomerative (Sklearn)
EM-GMM | Moons and Stars - EM-GMM (My) | Moons and Stars - EM-GMM (Sklearn)
K-Means | Moons and Stars - K-Means (My) | Moons and Stars - K-Means (Sklearn)
Mean-Shift | Moons and Stars - Mean-Shift (My) | Moons and Stars - Mean-Shift (Sklearn)

Sticks Dataset

Algorithm | My Implementation | Scikit-learn
Agglomerative | Sticks - Agglomerative (My) | Sticks - Agglomerative (Sklearn)
EM-GMM | Sticks - EM-GMM (My) | Sticks - EM-GMM (Sklearn)
K-Means | Sticks - K-Means (My) | Sticks - K-Means (Sklearn)
Mean-Shift | Sticks - Mean-Shift (My) | Sticks - Mean-Shift (Sklearn)
