Class KMeansPlusPlusClusterer<T extends Clusterable>

  • Type Parameters:
    T - type of the points to cluster

    public class KMeansPlusPlusClusterer<T extends Clusterable>
    extends Clusterer<T>
    Clustering algorithm based on the k-means++ algorithm by David Arthur and Sergei Vassilvitskii.
    See Also:
    K-means++ (Wikipedia)
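
The seeding step that gives k-means++ its name can be sketched in plain Java. This is a minimal illustration under stated assumptions, not this class's implementation: the class name, the method name, the use of double[] points and the use of java.util.Random are all hypothetical. The first center is drawn uniformly; each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far.

```java
import java.util.Random;

// Hypothetical sketch of k-means++ seeding; not the library's code.
final class KMeansPlusPlusSeedingSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double distSq(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the indices of k distinct points chosen as initial centers.
    public static int[] chooseInitialCenters(double[][] points, int k, Random rng) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int[] chosen = new int[k];
        boolean[] taken = new boolean[n];
        double[] minDistSq = new double[n];

        // First center: uniform draw.
        int first = rng.nextInt(n);
        chosen[0] = first;
        taken[first] = true;
        for (int i = 0; i < n; i++) {
            minDistSq[i] = distSq(points[i], points[first]);
        }

        for (int c = 1; c < k; c++) {
            // Total weight of the points that are still available.
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                if (!taken[i]) {
                    total += minDistSq[i];
                }
            }
            // Walk the cumulative weights until they exceed the random threshold.
            double r = rng.nextDouble() * total;
            double cumulative = 0.0;
            int next = -1;
            int fallback = -1;
            for (int i = 0; i < n; i++) {
                if (taken[i]) {
                    continue;
                }
                fallback = i;
                cumulative += minDistSq[i];
                if (cumulative > r) {
                    next = i;
                    break;
                }
            }
            // All remaining weights are zero (or rounding interfered): take any free point.
            if (next < 0) {
                next = fallback;
            }
            chosen[c] = next;
            taken[next] = true;
            // Keep, for every point, the squared distance to its nearest chosen center.
            for (int i = 0; i < n; i++) {
                minDistSq[i] = Math.min(minDistSq[i], distSq(points[i], points[next]));
            }
        }
        return chosen;
    }
}
```

Points that coincide with an already chosen center have weight zero, so they are only selected when nothing else is left.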
    • Constructor Detail

      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        The Euclidean distance is used as the default distance measure.

        Parameters:
        k - the number of clusters to split the data into
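
Selecting "the cluster with the largest distance variance" can be sketched as follows. This is an illustration only: the class and method names are hypothetical, the choice of the bias-corrected sample variance is an assumption, and the step that follows (which point of the selected cluster becomes the new center) is not shown because this page does not specify it.

```java
// Hypothetical sketch; not the library's code.
final class LargestVarianceSketch {

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Sample variance of the point-to-center distances (0 for fewer than 2 points).
    static double distanceVariance(double[][] points, double[] center) {
        int n = points.length;
        if (n < 2) {
            return 0.0;
        }
        double[] d = new double[n];
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            d[i] = distance(points[i], center);
            mean += d[i];
        }
        mean /= n;
        double sumSq = 0.0;
        for (double x : d) {
            sumSq += (x - mean) * (x - mean);
        }
        return sumSq / (n - 1);
    }

    // Index of the non-empty cluster with the largest distance variance, or -1 if all are empty.
    public static int largestVarianceCluster(double[][][] clusters, double[][] centers) {
        int best = -1;
        double bestVariance = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < clusters.length; c++) {
            if (clusters[c].length == 0) {
                continue; // empty clusters cannot be split
            }
            double v = distanceVariance(clusters[c], centers[c]);
            if (v > bestVariance) {
                bestVariance = v;
                best = c;
            }
        }
        return best;
    }
}
```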
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        The Euclidean distance is used as the default distance measure.

        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
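
The meaning of a negative maxIterations can be illustrated with a small one-dimensional assign-and-update loop. This is a sketch under assumptions, not the library's code: the class and method names are hypothetical, and a cluster that ends up empty simply keeps its previous center here, which is a simplification rather than one of the empty-cluster strategies.

```java
import java.util.Arrays;

// Hypothetical sketch of an iteration cap where a negative value means "no cap".
final class IterationCapSketch {

    public static double[] lloyd1D(double[] points, double[] initialCenters, int maxIterations) {
        // A negative cap is treated as "iterate until the assignments stop changing".
        final int max = (maxIterations < 0) ? Integer.MAX_VALUE : maxIterations;
        double[] centers = initialCenters.clone();
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < max; iter++) {
            boolean changed = false;
            double[] sum = new double[centers.length];
            int[] count = new int[centers.length];

            // Assignment step: attach each point to its nearest center.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                for (int c = 1; c < centers.length; c++) {
                    if (Math.abs(points[i] - centers[c]) < Math.abs(points[i] - centers[best])) {
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
                sum[best] += points[i];
                count[best]++;
            }
            if (!changed) {
                break; // converged before reaching the cap
            }
            // Update step: move each non-empty cluster's center to the mean of its points.
            for (int c = 0; c < centers.length; c++) {
                if (count[c] > 0) {
                    centers[c] = sum[c] / count[c];
                }
            }
        }
        return centers;
    }
}
```

With points 1, 2, 9, 10 and initial centers 1 and 2, a cap of 1 stops at centers 1 and 7, while a negative cap runs on to 1.5 and 9.5.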
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations,
                                       DistanceMeasure measure)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
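
Why the distance measure matters can be seen in a small sketch where the nearest center changes with the measure. The Distance interface below is a hypothetical stand-in defined only for this example, not the library's DistanceMeasure type, and the class and method names are likewise assumptions.

```java
// Hypothetical sketch; the Distance interface is a local stand-in for illustration.
final class DistanceSketch {

    interface Distance {
        double compute(double[] a, double[] b);
    }

    // Straight-line distance.
    static final Distance EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    // Sum of absolute coordinate differences.
    static final Distance MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    // Index of the center nearest to the point under the given measure.
    public static int nearest(double[] point, double[][] centers, Distance measure) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (measure.compute(point, centers[c]) < measure.compute(point, centers[best])) {
                best = c;
            }
        }
        return best;
    }
}
```

For the point (0, 0) and centers (3, 3) and (5, 0), the Euclidean measure prefers the first center and the Manhattan measure prefers the second, so the same data can cluster differently.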
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations,
                                       DistanceMeasure measure,
                                       RandomGenerator random)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
        random - random generator to use for choosing initial centers
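
Because the initial centers are chosen at random, supplying a generator with a fixed seed is one way to make runs repeatable. The sketch below shows the idea with java.util.Random as a stand-in; it is an assumption that the same holds for whichever RandomGenerator implementation is passed to this constructor, and the class and method names are hypothetical.

```java
import java.util.Random;

// Hypothetical sketch: the same seed produces the same sequence of candidate indices.
final class SeededInitSketch {

    // Draws `count` indices in [0, bound) from a generator created with the given seed.
    public static int[] drawIndices(long seed, int count, int bound) {
        Random rng = new Random(seed);
        int[] indices = new int[count];
        for (int i = 0; i < count; i++) {
            indices[i] = rng.nextInt(bound);
        }
        return indices;
    }
}
```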
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations,
                                       DistanceMeasure measure,
                                       RandomGenerator random,
                                       KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
        Build a clusterer.
        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
        random - random generator to use for choosing initial centers
        emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations