Class EnumeratedDistribution<T>

  • Type Parameters:
    T - type of the elements in the sample space.
    All Implemented Interfaces:
    Serializable

    public class EnumeratedDistribution<T>
    extends Object
    implements Serializable
    A generic implementation of a discrete probability distribution over a finite sample space, based on an enumerated list of <value, probability> pairs.

    Input probabilities must all be non-negative. Zero values are allowed, and the sum does not have to equal one: constructors normalize the input probabilities so that they sum to one.

    The list of <value, probability> pairs does not, strictly speaking, have to be a function (the same value may appear in more than one pair), and it may contain null values. The pmf created by the constructor combines the probabilities of equal values and treats null values as equal to each other.

    For example, if the list of pairs <"dog", 0.2>, <null, 0.1>, <"pig", 0.2>, <"dog", 0.1>, <null, 0.4> is provided to the constructor, the resulting pmf will assign mass of 0.5 to null, 0.3 to "dog" and 0.2 to "pig".
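
    The merging rule can be illustrated with a small standalone sketch. It does not use this class: MassSketch and massOf are hypothetical names, and two parallel lists stand in for the list of pairs. The sketch sums the weight of every entry equal to x (Objects.equals treats two nulls as equal) and divides by the total weight.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; not the library's implementation.
class MassSketch {

    // Normalized mass of x: total weight of the entries equal to x,
    // divided by the total weight of all entries.
    static <T> double massOf(List<T> values, List<Double> weights, T x) {
        double total = 0.0;
        double matching = 0.0;
        for (int i = 0; i < values.size(); i++) {
            double w = weights.get(i);
            total += w;
            // Objects.equals returns true when both arguments are null.
            if (Objects.equals(values.get(i), x)) {
                matching += w;
            }
        }
        return matching / total;
    }

    public static void main(String[] args) {
        // The pairs from the example above: dog, null, pig, dog, null.
        List<String> values = Arrays.asList("dog", null, "pig", "dog", null);
        List<Double> weights = Arrays.asList(0.2, 0.1, 0.2, 0.1, 0.4);
        System.out.println("null: " + massOf(values, weights, (String) null));
        System.out.println("dog:  " + massOf(values, weights, "dog"));
        System.out.println("pig:  " + massOf(values, weights, "pig"));
    }
}
```

    Up to floating-point rounding, the three masses come out as 0.5 for null, 0.3 for "dog" and 0.2 for "pig".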

    See Also:
    Serialized Form
    • Constructor Detail

      • EnumeratedDistribution

        public EnumeratedDistribution(List<Pair<T,Double>> pmf)
                               throws MathIllegalArgumentException
        Create an enumerated distribution using the given probability mass function enumeration.
        Parameters:
        pmf - probability mass function enumerated as a list of <T, probability> pairs.
        Throws:
        MathIllegalArgumentException - if the probabilities include negative, NaN or infinite values, or are all zero
    • Method Detail

      • probability

        public double probability(T x)
        For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.

        Note that if x1 and x2 satisfy x1.equals(x2), or both are null, then probability(x1) = probability(x2).

        Parameters:
        x - the point at which the PMF is evaluated
        Returns:
        the value of the probability mass function at x
      • getPmf

        public List<Pair<T,Double>> getPmf()
        Return the probability mass function as a list of (value, probability) pairs.

        Note that if duplicate and/or null values were provided to the constructor when creating this EnumeratedDistribution, the returned list will contain these values. If duplicate values exist, the returned list will not represent a pmf (i.e., it is up to the caller to consolidate duplicate mass points).

        Returns:
        the probability mass function.
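
        Consolidating duplicate mass points, as the note above leaves to the caller, might look like the following standalone sketch. Map.Entry stands in for the Pair type, and ConsolidateSketch and consolidate are hypothetical names that are not part of this API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; Map.Entry stands in for the library's Pair type.
class ConsolidateSketch {

    // Sums the probabilities of equal values, keeping first-seen order.
    // LinkedHashMap accepts a null key, so null values merge like any other value.
    static <T> Map<T, Double> consolidate(List<Map.Entry<T, Double>> pairs) {
        Map<T, Double> merged = new LinkedHashMap<>();
        for (Map.Entry<T, Double> pair : pairs) {
            merged.merge(pair.getKey(), pair.getValue(), Double::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        // A list with a duplicate value and two null values.
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<String, Double>("dog", 0.2));
        pairs.add(new SimpleEntry<String, Double>(null, 0.1));
        pairs.add(new SimpleEntry<String, Double>("pig", 0.2));
        pairs.add(new SimpleEntry<String, Double>("dog", 0.1));
        pairs.add(new SimpleEntry<String, Double>(null, 0.4));

        // Three entries remain: dog, null and pig, each with its summed mass.
        System.out.println(consolidate(pairs));
    }
}
```

        A map keyed by value is one reasonable target for the merge; it relies on the element type having equals and hashCode implementations that agree with each other.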
      • checkAndNormalize

        public static double[] checkAndNormalize(double[] weights)
        Checks that weights is neither null nor empty and contains only non-negative, finite, non-NaN values, then normalizes it to sum to 1 if necessary.
        Parameters:
        weights - input array to be used as the basis for the values of a PMF
        Returns:
        a possibly rescaled copy of the array that sums to 1 and contains only valid probability values
        Throws:
        MathIllegalArgumentException - if weights is null or empty, includes negative, NaN or infinite values, or contains only zeros
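
        The documented contract can be paraphrased in a standalone sketch. This is not the library's source: NormalizeSketch is a hypothetical name, and IllegalArgumentException stands in for MathIllegalArgumentException. The sketch always divides by the sum, which may differ from how the real method decides whether rescaling is necessary.

```java
import java.util.Arrays;

// Hypothetical illustration of the documented contract; not the library's source.
class NormalizeSketch {

    static double[] checkAndNormalize(double[] weights) {
        if (weights == null || weights.length == 0) {
            throw new IllegalArgumentException("weights must be non-null and non-empty");
        }
        double sum = 0.0;
        for (double w : weights) {
            // !(w >= 0.0) is true for negative values and for NaN.
            if (!(w >= 0.0) || Double.isInfinite(w)) {
                throw new IllegalArgumentException("invalid weight: " + w);
            }
            sum += w;
        }
        if (sum == 0.0) {
            throw new IllegalArgumentException("weights must not all be zero");
        }
        // Return a rescaled copy; the input array is left untouched.
        double[] normalized = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            normalized[i] = weights[i] / sum;
        }
        return normalized;
    }

    public static void main(String[] args) {
        double[] p = checkAndNormalize(new double[] {2.0, 1.0, 1.0});
        System.out.println(Arrays.toString(p)); // prints [0.5, 0.25, 0.25]
    }
}
```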