Bisecting Kmeans Linkage Matrix Output (Cluster Indices)

I have been working on a project that produces a linkage matrix from the
output of the Spark Bisecting Kmeans algorithm, so that the selection steps
can be plotted as a dendrogram. I am having trouble producing valid indices
when I use more than 3-4 clusters, and I am hoping someone else has the time
and interest to take a look.

To achieve this I modified the Bisecting Kmeans algorithm to produce a
Z linkage matrix, based on yu-iskw's work. I also modified it to write more
information about the selection steps to the log at run time.
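
For anyone unfamiliar with the index convention I am trying to satisfy, here
is a minimal sketch in plain Python. It is not my Spark code and not Spark's
API: the Node class, its height field and the tree_to_linkage function are
hypothetical names made up for illustration, and it assumes every internal
node has exactly two children. It follows the SciPy linkage layout, where k
leaves get ids 0..k-1, row i of the matrix creates cluster id k + i, and a
row may only refer to ids that already exist.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical node of the bisecting tree (not a Spark class)."""
    height: float = 0.0              # assumed merge height; unused for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_to_linkage(root: Node) -> List[List[float]]:
    """Return rows [id_a, id_b, height, leaf_count] in SciPy linkage layout."""
    leaves: List[Node] = []
    internals: List[Node] = []
    leaf_count: Dict[int, int] = {}

    def visit(node: Node) -> int:
        # Post-order walk: both children are recorded before their parent.
        if node.left is None and node.right is None:
            leaves.append(node)
            count = 1
        else:
            count = visit(node.left) + visit(node.right)
            internals.append(node)
        leaf_count[id(node)] = count
        return count

    visit(root)
    k = len(leaves)

    # Leaves take ids 0..k-1 in the order they were visited.
    index = {id(leaf): i for i, leaf in enumerate(leaves)}

    rows: List[List[float]] = []
    for i, node in enumerate(internals):
        a = index[id(node.left)]
        b = index[id(node.right)]
        index[id(node)] = k + i      # the cluster created by row i
        rows.append([float(min(a, b)), float(max(a, b)),
                     float(node.height), float(leaf_count[id(node)])])
    return rows


# Three leaves: the root splits into one leaf and one further split.
root = Node(2.0, Node(), Node(1.0, Node(), Node()))
linkage = tree_to_linkage(root)
# linkage == [[1.0, 2.0, 1.0, 2.0], [0.0, 3.0, 2.0, 3.0]]
```

Because the rows are emitted in post-order, a parent never refers to an id
that has not been created yet, whatever the heights are. Rows in this shape
could then be converted to a float NumPy array and passed to
scipy.cluster.hierarchy.dendrogram.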

Test outputs using the Iris dataset with both k = 3 and k = 10 clusters can
be seen on my Stack Overflow post.

The project so far (with a simple sbt build and the compiled jars) can also
be seen on my GitHub repo, and is also detailed in the aforementioned
Stack Overflow post.
