Class FuzzyKMeansClusterer<T extends Clusterable>

  • Type Parameters:
    T - type of the points to cluster

    public class FuzzyKMeansClusterer<T extends Clusterable>
    extends Clusterer<T>
    Fuzzy K-Means clustering algorithm.

    The Fuzzy K-Means algorithm is a variation of the classical K-Means algorithm, with the major difference that a single data point is not uniquely assigned to a single cluster. Instead, each point i has a set of weights \(u_{i,j}\) which indicate its degree of membership to the cluster j.

    The algorithm then tries to minimize the objective function: \[ J = \sum_{i=1}^{N}\sum_{k=1}^{C} u_{i,k}^m d_{i,k}^2 \] with \(d_{i,k}\) being the distance between data point i and the cluster center k, N the number of data points, C the number of clusters, and m the fuzziness factor.
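
    As an illustration of the objective function, the following minimal sketch evaluates J on plain arrays, with u[i][k] the weight of data point i in cluster k and d[i][k] the distance between data point i and cluster center k. The class and method names are hypothetical and are not part of this library.

```java
/** Hypothetical helper, not part of the library: evaluates the fuzzy objective on plain arrays. */
final class FuzzyObjectiveSketch {

    private FuzzyObjectiveSketch() { }

    /**
     * Computes J as the sum over all points i and clusters k of u[i][k]^m * d[i][k]^2.
     *
     * @param u membership weights, u[i][k] for data point i and cluster k
     * @param d distances, d[i][k] between data point i and cluster center k
     * @param m fuzziness factor, expected to be > 1.0
     * @return the value of the objective function
     */
    static double objective(double[][] u, double[][] d, double m) {
        double sum = 0.0;
        for (int i = 0; i < u.length; i++) {
            for (int k = 0; k < u[i].length; k++) {
                sum += Math.pow(u[i][k], m) * d[i][k] * d[i][k];
            }
        }
        return sum;
    }
}
```

    A point lying exactly on a center with full membership contributes nothing to J, while a point split evenly between two equidistant centers contributes through both terms.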

    The algorithm requires two parameters:

    • k: the number of clusters
    • fuzziness: determines the level of cluster fuzziness, larger values lead to fuzzier clusters
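
    The effect of the fuzziness parameter can be sketched with the standard fuzzy c-means membership update, \(u_k = 1 / \sum_j (d_k / d_j)^{2/(m-1)}\). That this class uses exactly this update is an assumption, since the text above does not state it; the class and method names below are hypothetical.

```java
/** Hypothetical helper, not part of the library: standard fuzzy c-means membership update. */
final class FuzzyMembershipSketch {

    private FuzzyMembershipSketch() { }

    /**
     * Computes the membership weights of one data point.
     *
     * @param distances distances from the point to each cluster center
     * @param m fuzziness factor, must be > 1.0
     * @return one weight per cluster, summing to 1
     */
    static double[] memberships(double[] distances, double m) {
        if (m <= 1.0) {
            throw new IllegalArgumentException("fuzziness must be > 1.0");
        }
        double exponent = 2.0 / (m - 1.0);
        double[] u = new double[distances.length];

        // Assumption: a point sitting exactly on a center belongs fully to that cluster.
        for (int k = 0; k < distances.length; k++) {
            if (distances[k] == 0.0) {
                u[k] = 1.0;
                return u;
            }
        }

        for (int k = 0; k < distances.length; k++) {
            double sum = 0.0;
            for (int j = 0; j < distances.length; j++) {
                sum += Math.pow(distances[k] / distances[j], exponent);
            }
            u[k] = 1.0 / sum;
        }
        return u;
    }
}
```

    For a point at distances 1 and 2 from two centers, a fuzziness of 2 gives weights of 0.8 and 0.2, while a fuzziness of 3 gives roughly 0.67 and 0.33: the larger value spreads the membership more evenly.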

    Additional, optional parameters:

    • maxIterations: the maximum number of iterations
    • epsilon: the convergence criterion, default is 1e-3
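
    How the two optional parameters might interact can be sketched as a stopping test. The sketch assumes that convergence is measured as the largest elementwise change of the membership weights between two iterations, which the text above does not spell out; the class and method names are hypothetical.

```java
/** Hypothetical helper, not part of the library: stopping test for an iterative fuzzy clusterer. */
final class ConvergenceSketch {

    private ConvergenceSketch() { }

    /** Largest absolute elementwise difference between two membership arrays of equal shape. */
    static double maxAbsDifference(double[][] previous, double[][] current) {
        double max = 0.0;
        for (int i = 0; i < previous.length; i++) {
            for (int j = 0; j < previous[i].length; j++) {
                max = Math.max(max, Math.abs(current[i][j] - previous[i][j]));
            }
        }
        return max;
    }

    /**
     * Decides whether to stop iterating.
     *
     * @param iteration number of iterations completed so far
     * @param maxIterations iteration limit; a negative value means no limit
     */
    static boolean shouldStop(double[][] previous, double[][] current,
                              double epsilon, int iteration, int maxIterations) {
        boolean converged = maxAbsDifference(previous, current) <= epsilon;
        boolean exhausted = maxIterations >= 0 && iteration >= maxIterations;
        return converged || exhausted;
    }
}
```

    With a negative maxIterations, only the epsilon test can end the loop, so a very small epsilon may lead to long run times.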

    The fuzzy variant of the K-Means algorithm is more robust with regard to the selection of the initial cluster centers.

    • Constructor Detail

      • FuzzyKMeansClusterer

        public FuzzyKMeansClusterer(int k,
                                    double fuzziness)
                             throws MathIllegalArgumentException
        Creates a new instance of a FuzzyKMeansClusterer.

        The Euclidean distance will be used as the default distance measure.

        Parameters:
        k - the number of clusters to split the data into
        fuzziness - the fuzziness factor, must be > 1.0
        Throws:
        MathIllegalArgumentException - if fuzziness <= 1.0
      • FuzzyKMeansClusterer

        public FuzzyKMeansClusterer(int k,
                                    double fuzziness,
                                    int maxIterations,
                                    DistanceMeasure measure)
                             throws MathIllegalArgumentException
        Creates a new instance of a FuzzyKMeansClusterer.
        Parameters:
        k - the number of clusters to split the data into
        fuzziness - the fuzziness factor, must be > 1.0
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
        Throws:
        MathIllegalArgumentException - if fuzziness <= 1.0
      • FuzzyKMeansClusterer

        public FuzzyKMeansClusterer(int k,
                                    double fuzziness,
                                    int maxIterations,
                                    DistanceMeasure measure,
                                    double epsilon,
                                    RandomGenerator random)
                             throws MathIllegalArgumentException
        Creates a new instance of a FuzzyKMeansClusterer.
        Parameters:
        k - the number of clusters to split the data into
        fuzziness - the fuzziness factor, must be > 1.0
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
        epsilon - the convergence criterion (default is 1e-3)
        random - random generator to use for choosing initial centers
        Throws:
        MathIllegalArgumentException - if fuzziness <= 1.0
    • Method Detail

      • getK

        public int getK()
        Returns the number of clusters this instance will use.
        Returns:
        the number of clusters
      • getFuzziness

        public double getFuzziness()
        Returns the fuzziness factor used by this instance.
        Returns:
        the fuzziness factor
      • getMaxIterations

        public int getMaxIterations()
        Returns the maximum number of iterations this instance will use.
        Returns:
        the maximum number of iterations, or -1 if no maximum is set
      • getEpsilon

        public double getEpsilon()
        Returns the convergence criterion used by this instance.
        Returns:
        the convergence criterion
      • getRandomGenerator

        public RandomGenerator getRandomGenerator()
        Returns the random generator this instance will use.
        Returns:
        the random generator
      • getMembershipMatrix

        public RealMatrix getMembershipMatrix()
        Returns the n × k membership matrix, where n is the number of data points and k is the number of clusters.

        The element \(U_{i,j}\) represents the membership value of data point i in cluster j.

        Returns:
        the membership matrix
        Throws:
        MathIllegalStateException - if cluster(Collection) has not been called before
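
        One common use of the membership values is to derive a hard assignment by picking, for each data point, the cluster with the largest weight. The sketch below works on a plain double[][] copy of the matrix (for instance one obtained with RealMatrix.getData()); the class and method names are hypothetical and not part of this library.

```java
/** Hypothetical helper, not part of the library: turns fuzzy memberships into hard labels. */
final class HardAssignmentSketch {

    private HardAssignmentSketch() { }

    /**
     * Picks the cluster with the largest membership value for each data point.
     *
     * @param membership n x k array, membership[i][j] for data point i and cluster j
     * @return for each data point, the index of its strongest cluster (lowest index on ties)
     */
    static int[] hardAssignments(double[][] membership) {
        int[] labels = new int[membership.length];
        for (int i = 0; i < membership.length; i++) {
            int best = 0;
            for (int j = 1; j < membership[i].length; j++) {
                if (membership[i][j] > membership[i][best]) {
                    best = j;
                }
            }
            labels[i] = best;
        }
        return labels;
    }
}
```

        Ties are resolved in favor of the lowest cluster index here; that tie-breaking rule is a choice of this sketch.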
      • getDataPoints

        public List<T> getDataPoints()
        Returns an unmodifiable list of the data points used in the last call to cluster(Collection).
        Returns:
        the list of data points, or null if cluster(Collection) has not been called before.
      • getObjectiveFunctionValue

        public double getObjectiveFunctionValue()
        Returns the value of the objective function.
        Returns:
        the value of the objective function as a double
        Throws:
        MathIllegalStateException - if cluster(Collection) has not been called before