org.hipparchus.stat.descriptive.rank

## Class Percentile

• All Implemented Interfaces:
Serializable, UnivariateStatistic, MathArrays.Function

```public class Percentile
extends AbstractUnivariateStatistic
implements Serializable```
Provides percentile computation.

There are several commonly used methods for estimating percentiles (a.k.a. quantiles) based on sample data. For large samples, the different methods agree closely, but when sample sizes are small, different methods will give significantly different results. The algorithm implemented here works as follows:

1. Let `n` be the length of the (sorted) array and `0 < p <= 100` be the desired percentile.
2. If `n = 1`, return the only array element (regardless of the value of `p`); otherwise
3. Compute the estimated percentile position `pos = p * (n + 1) / 100` and the difference `d` between `pos` and `floor(pos)` (i.e. the fractional part of `pos`).
4. If `pos < 1`, return the smallest element in the array.
5. Else if `pos >= n`, return the largest element in the array.
6. Else let `lower` be the element in position `floor(pos)` in the array and let `upper` be the next element in the array. Return `lower + d * (upper - lower)`.
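
The steps above can be sketched in plain Java as follows. This is a minimal illustration, not the Hipparchus implementation: the class name `LegacyPercentileSketch` is hypothetical, positions are assumed to be 1-based (which is what steps 4 and 5 suggest), the array is fully sorted for simplicity instead of using selection, and NaN handling is left out.

```java
import java.util.Arrays;

/** Illustrative stand-in for the documented steps; not the library code. */
final class LegacyPercentileSketch {

    /** Estimates the p-th percentile (0 < p <= 100) of the given values. */
    static double percentile(double[] values, double p) {
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("p must satisfy 0 < p <= 100: " + p);
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;               // empty input, as documented for evaluate(...)
        }
        if (n == 1) {
            return values[0];                // step 2: single element, any p
        }
        double[] sorted = values.clone();    // never reorder the caller's array
        Arrays.sort(sorted);                 // full sort for simplicity only

        double pos = p * (n + 1) / 100;      // step 3: estimated 1-based position
        double floor = Math.floor(pos);
        double d = pos - floor;              // fractional part of pos

        if (pos < 1) {
            return sorted[0];                // step 4: smallest element
        }
        if (pos >= n) {
            return sorted[n - 1];            // step 5: largest element
        }
        int i = (int) floor;                 // 1-based index of "lower"
        double lower = sorted[i - 1];
        double upper = sorted[i];
        return lower + d * (upper - lower);  // step 6: linear interpolation
    }

    public static void main(String[] args) {
        double[] data = {15, 20, 35, 40, 50};
        System.out.println(percentile(data, 50)); // pos = 3.0, the third element
        System.out.println(percentile(data, 25)); // pos = 1.5, halfway between 15 and 20
    }
}
```

With five values, `p = 25` gives `pos = 1.5`, so the result is interpolated halfway between the first and second sorted elements.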

To compute percentiles, the data must be at least partially ordered. Input arrays are copied and recursively partitioned using an ordering definition. The ordering used by `Arrays.sort(double[])` is the one determined by `Double.compareTo(Double)`. This ordering makes `Double.NaN` larger than any other value (including `Double.POSITIVE_INFINITY`). Therefore, for example, the median (50th percentile) of `{0, 1, 2, 3, 4, Double.NaN}` evaluates to `2.5`.
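
The ordering described above can be checked with the JDK alone. The sketch below assumes the `NaN` stays in the data and is ordered last; the class and method names are hypothetical, and the median is computed with the position formula from the algorithm description rather than by calling Hipparchus.

```java
import java.util.Arrays;

/** Illustrates where Arrays.sort places NaN and how that shifts the median. */
final class NaNOrderingDemo {

    /** Returns a sorted copy; NaN ends up after +Infinity. */
    static double[] sortedCopy(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        return sorted;
    }

    /** Median via pos = 50 * (n + 1) / 100; assumes at least two elements. */
    static double legacyMedian(double[] values) {
        double[] sorted = sortedCopy(values);
        int n = sorted.length;
        double pos = 50.0 * (n + 1) / 100;   // 1-based position
        int i = (int) Math.floor(pos);
        double d = pos - i;                  // fractional part
        return sorted[i - 1] + d * (sorted[i] - sorted[i - 1]);
    }

    public static void main(String[] args) {
        double[] data = {0, 1, 2, 3, 4, Double.NaN};
        System.out.println(Arrays.toString(sortedCopy(data)));
        // n = 6 gives pos = 3.5, so the median lies halfway between 2 and 3
        System.out.println(legacyMedian(data));
    }
}
```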

Since percentile estimation usually involves interpolation between array elements, arrays containing `NaN` or infinite values will often result in `NaN` or infinite values returned.
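
A short sketch of why this happens, using only the interpolation formula `lower + d * (upper - lower)` and standard IEEE 754 arithmetic. The class name is hypothetical.

```java
/** Shows how non-finite neighbours propagate through linear interpolation. */
final class NonFiniteInterpolationDemo {

    static double interpolate(double lower, double upper, double d) {
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        // NaN as a neighbour always yields NaN
        System.out.println(interpolate(4, Double.NaN, 0.5));
        // an infinite neighbour with d > 0 yields an infinite result
        System.out.println(interpolate(4, Double.POSITIVE_INFINITY, 0.5));
        // with d = 0 the product 0 * Infinity is NaN, so the result is NaN
        System.out.println(interpolate(4, Double.POSITIVE_INFINITY, 0.0));
        // opposite infinities: -Infinity + Infinity is NaN
        System.out.println(interpolate(Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY, 0.5));
    }
}
```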

Further, to support the different estimation types (R1, R2 and so on) described in the Wikipedia article on quantiles, a type-specific NaN handling strategy is used so that results closely match those typically produced by popular tools such as R (R1 to R9) and Excel (R7).

Percentile uses only selection instead of complete sorting and caches selection algorithm state between calls to the various `evaluate` methods. This greatly improves efficiency, both for a single percentile and for multiple percentile computations. To maximize performance when multiple percentiles are computed from the same data, users should set the data array once using either the `evaluate(double[], double)` or the `setData(double[])` method and thereafter call `evaluate(double)` with just the percentile.
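
To illustrate what "selection instead of complete sorting" means, here is a generic quickselect sketch. It is not the `KthSelector` used by this class: the class name, the middle-element pivot choice and the absence of any cached state are simplifying assumptions, and the input is assumed to contain no `NaN` values.

```java
/** Generic quickselect; illustrative only, with no caching between calls. */
final class QuickSelectSketch {

    /** Returns the k-th smallest element (0-based), partially reordering work in place. */
    static double select(double[] work, int k) {
        if (k < 0 || k >= work.length) {
            throw new IllegalArgumentException("k out of range: " + k);
        }
        int lo = 0;
        int hi = work.length - 1;
        while (lo < hi) {
            int p = partition(work, lo, hi);
            if (k == p) {
                return work[k];
            } else if (k < p) {
                hi = p - 1;      // target lies in the left part
            } else {
                lo = p + 1;      // target lies in the right part
            }
        }
        return work[lo];         // lo == hi == k at this point
    }

    /** Lomuto partition around the middle element; returns the pivot's final index. */
    private static int partition(double[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        swap(a, mid, hi);
        double pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        double[] work = {9, 1, 8, 2, 7, 3};
        System.out.println(select(work, 2)); // third smallest value
    }
}
```

Only the partitions that contain the requested index are processed, which is why selecting one or two order statistics is typically cheaper than sorting the whole array.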

Note that this implementation is not synchronized. If multiple threads access an instance of this class concurrently, and at least one of the threads modifies its state (for example through `setData(double[])` or `setQuantile(double)`), access must be synchronized externally.

See Also:
Serialized Form
• ### Nested Class Summary

Nested Classes
Modifier and Type Class and Description
`static class ` `Percentile.EstimationType`
An enum of the percentile estimation strategies described in the Wikipedia article on quantiles; the enum constant names match the type names used there.
• ### Constructor Summary

Constructors
Modifier Constructor and Description
` ` `Percentile()`
Constructs a Percentile with default settings.
` ` `Percentile(double quantile)`
Constructs a Percentile with the specified quantile value and the following defaults: estimation type `Percentile.EstimationType.LEGACY`, NaN strategy `NaNStrategy.REMOVED`, and a `KthSelector`.
`protected ` ```Percentile(double quantile, Percentile.EstimationType estimationType, NaNStrategy nanStrategy, KthSelector kthSelector)```
Constructs a Percentile with the specified quantile value, `Percentile.EstimationType`, `NaNStrategy` and `KthSelector`.
` ` `Percentile(Percentile original)`
Copy constructor; creates a new `Percentile` identical to the `original`.
• ### Method Summary

All Methods
Modifier and Type Method and Description
`Percentile` `copy()`
Returns a copy of the statistic with the same internal state.
`double` `evaluate(double p)`
Returns the result of evaluating the statistic over the stored data.
`double` ```evaluate(double[] values, double p)```
Returns an estimate of the `p`th percentile of the values in the `values` array.
`double` ```evaluate(double[] values, int start, int length)```
Returns an estimate of the `quantile`th percentile of the designated values in the `values` array.
`double` ```evaluate(double[] values, int begin, int length, double p)```
Returns an estimate of the `p`th percentile of the values in the `values` array, starting with the element in (0-based) position `begin` in the array and including `length` values.
`Percentile.EstimationType` `getEstimationType()`
Get the estimation `type` used for computation.
`KthSelector` `getKthSelector()`
Get the `kthSelector` used for computation.
`NaNStrategy` `getNaNStrategy()`
Get the `NaN Handling` strategy used for computation.
`PivotingStrategy` `getPivotingStrategy()`
Get the `PivotingStrategy` used in KthSelector for computation.
`double` `getQuantile()`
Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
`protected double[]` ```getWorkArray(double[] values, int begin, int length)```
Get the work array to operate on.
`void` `setData(double[] values)`
Set the data array.
`void` ```setData(double[] values, int begin, int length)```
Set the data array.
`void` `setQuantile(double p)`
Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
`Percentile` `withEstimationType(Percentile.EstimationType newEstimationType)`
Build a new instance similar to the current one except for the `estimation type`.
`Percentile` `withKthSelector(KthSelector newKthSelector)`
Build a new instance similar to the current one except for the `kthSelector` instance specifically set.
`Percentile` `withNaNStrategy(NaNStrategy newNaNStrategy)`
Build a new instance similar to the current one except for the `NaN handling` strategy.
• ### Methods inherited from class org.hipparchus.stat.descriptive.AbstractUnivariateStatistic

`evaluate, getData, getDataRef`
• ### Methods inherited from class java.lang.Object

`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
• ### Methods inherited from interface org.hipparchus.stat.descriptive.UnivariateStatistic

`evaluate`
• ### Method Detail

• #### setData

`public void setData(double[] values)`
Set the data array.

The stored value is a copy of the parameter array, not the array itself.

Overrides:
`setData` in class `AbstractUnivariateStatistic`
Parameters:
`values` - data array to store (may be null to remove stored data)
See Also:
`AbstractUnivariateStatistic.evaluate()`
• #### setData

```public void setData(double[] values,
int begin,
int length)
throws MathIllegalArgumentException```
Set the data array. The input array is copied, not referenced.
Overrides:
`setData` in class `AbstractUnivariateStatistic`
Parameters:
`values` - data array to store
`begin` - the index of the first element to include
`length` - the number of elements to include
Throws:
`MathIllegalArgumentException` - if values is null or the indices are not valid
See Also:
`AbstractUnivariateStatistic.evaluate()`
• #### evaluate

```public double evaluate(double p)
throws MathIllegalArgumentException```
Returns the result of evaluating the statistic over the stored data.

The stored array is the one set by the most recent call to `setData(double[])`.

Parameters:
`p` - the percentile value to compute
Returns:
the value of the statistic applied to the stored data
Throws:
`MathIllegalArgumentException` - if p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
• #### evaluate

```public double evaluate(double[] values,
int start,
int length)
throws MathIllegalArgumentException```
Returns an estimate of the `quantile`th percentile of the designated values in the `values` array. The quantile estimated is determined by the `quantile` property.

• Returns `Double.NaN` if `length = 0`
• Returns (for any value of `quantile`) `values[start]` if `length = 1`
• Throws `MathIllegalArgumentException` if `values` is null, or `start` or `length` is invalid

See `Percentile` for a description of the percentile estimation algorithm used.

Specified by:
`evaluate` in interface `UnivariateStatistic`
Specified by:
`evaluate` in interface `MathArrays.Function`
Specified by:
`evaluate` in class `AbstractUnivariateStatistic`
Parameters:
`values` - the input array
`start` - index of the first array element to include
`length` - the number of elements to include
Returns:
the percentile value
Throws:
`MathIllegalArgumentException` - if the parameters are not valid
• #### evaluate

```public double evaluate(double[] values,
double p)
throws MathIllegalArgumentException```
Returns an estimate of the `p`th percentile of the values in the `values` array.

• Returns `Double.NaN` if `values` has length `0`
• Returns (for any value of `p`) `values[0]` if `values` has length `1`
• Throws `MathIllegalArgumentException` if `values` is null or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)

The default implementation delegates to `evaluate(double[], int, int, double)` in the natural way.

Parameters:
`values` - input array of values
`p` - the percentile value to compute
Returns:
the percentile value or Double.NaN if the array is empty
Throws:
`MathIllegalArgumentException` - if `values` is null or p is invalid
• #### evaluate

```public double evaluate(double[] values,
int begin,
int length,
double p)
throws MathIllegalArgumentException```
Returns an estimate of the `p`th percentile of the values in the `values` array, starting with the element in (0-based) position `begin` in the array and including `length` values.

Calls to this method do not modify the internal `quantile` state of this statistic.

• Returns `Double.NaN` if `length = 0`
• Returns (for any value of `p`) `values[begin]` if `length = 1`
• Throws `MathIllegalArgumentException` if `values` is null, `begin` or `length` is invalid, or `p` is not a valid quantile value (p must be greater than 0 and less than or equal to 100)

See `Percentile` for a description of the percentile estimation algorithm used.

Parameters:
`values` - array of input values
`p` - the percentile to compute
`begin` - the first (0-based) element to include in the computation
`length` - the number of array elements to include
Returns:
the percentile value
Throws:
`MathIllegalArgumentException` - if the parameters are not valid or the input array is null
• #### getQuantile

`public double getQuantile()`
Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
Returns:
the quantile set at construction or by `setQuantile(double)`
• #### setQuantile

```public void setQuantile(double p)
throws MathIllegalArgumentException```
Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
Parameters:
`p` - a value such that 0 < p <= 100
Throws:
`MathIllegalArgumentException` - if p is not greater than 0 and less than or equal to 100
• #### copy

`public Percentile copy()`
Returns a copy of the statistic with the same internal state.
Specified by:
`copy` in interface `UnivariateStatistic`
Specified by:
`copy` in class `AbstractUnivariateStatistic`
Returns:
a copy of the statistic
• #### getWorkArray

```protected double[] getWorkArray(double[] values,
int begin,
int length)```
Get the work array to operate on. Uses the prior `storedData` if it exists; otherwise checks for NaNs and copies the subset of the array defined by the `begin` and `length` parameters. The configured `nanStrategy` determines whether any NaNs present are retained, removed or replaced before the resulting array is returned.
Parameters:
`values` - the array of numbers
`begin` - index to start reading the array
`length` - the length of array to be read from the begin index
Returns:
work array sliced from values in the range [begin,begin+length)
Throws:
`MathIllegalArgumentException` - if values or indices are invalid
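
As a rough illustration of the slicing and NaN handling described for `getWorkArray`, the sketch below copies the range `[begin, begin + length)` and drops NaNs. It models only a "remove" behaviour; reuse of stored data and the other strategies are omitted, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative slice-and-filter helper; not the library's getWorkArray. */
final class WorkArraySketch {

    /** Copies values[begin .. begin + length) into a new array, skipping NaN entries. */
    static double[] copyWithoutNaN(double[] values, int begin, int length) {
        if (values == null) {
            throw new IllegalArgumentException("values must not be null");
        }
        // written to avoid int overflow in begin + length
        if (begin < 0 || length < 0 || begin > values.length - length) {
            throw new IllegalArgumentException("invalid range: begin=" + begin + ", length=" + length);
        }
        double[] out = new double[length];
        int count = 0;
        for (int i = begin; i < begin + length; i++) {
            if (!Double.isNaN(values[i])) {
                out[count++] = values[i];
            }
        }
        return Arrays.copyOf(out, count);   // trim to the number of kept values
    }

    public static void main(String[] args) {
        double[] data = {1, Double.NaN, 3, 4, Double.NaN};
        System.out.println(Arrays.toString(copyWithoutNaN(data, 1, 3)));
    }
}
```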
• #### getEstimationType

`public Percentile.EstimationType getEstimationType()`
Get the estimation `type` used for computation.
Returns:
the `estimationType` set
• #### withEstimationType

`public Percentile withEstimationType(Percentile.EstimationType newEstimationType)`
Build a new instance similar to the current one except for the `estimation type`.

This method is intended to be used as part of a fluent-type builder pattern. Building finely tuned instances should be done as follows:

```   Percentile customized = new Percentile(quantile).
withEstimationType(estimationType).
withNaNStrategy(nanStrategy).
withKthSelector(kthSelector);
```

If any of the `withXxx` methods is omitted, the default value for the corresponding customization parameter will be used.

Parameters:
`newEstimationType` - estimation type for the new instance
Returns:
a new instance, with changed estimation type
Throws:
`NullArgumentException` - when newEstimationType is null
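
The `withXxx` style described above can be sketched with a small immutable class: each call returns a new instance and leaves the original untouched. Everything here is hypothetical and for illustration only. The class name, the use of plain strings in place of the real enum types, and the `NullPointerException` thrown by `Objects.requireNonNull` (the documented method throws `NullArgumentException`) are all assumptions of the sketch.

```java
import java.util.Objects;

/** Hypothetical immutable settings holder demonstrating the withXxx pattern. */
final class PercentileSettingsSketch {
    private final double quantile;
    private final String estimationType;
    private final String nanStrategy;

    PercentileSettingsSketch(double quantile) {
        // defaults that apply when a withXxx call is omitted
        this(quantile, "LEGACY", "REMOVED");
    }

    private PercentileSettingsSketch(double quantile, String estimationType, String nanStrategy) {
        this.quantile = quantile;
        this.estimationType = estimationType;
        this.nanStrategy = nanStrategy;
    }

    /** Returns a new instance that differs only in its estimation type. */
    PercentileSettingsSketch withEstimationType(String newEstimationType) {
        Objects.requireNonNull(newEstimationType, "newEstimationType");
        return new PercentileSettingsSketch(quantile, newEstimationType, nanStrategy);
    }

    /** Returns a new instance that differs only in its NaN strategy. */
    PercentileSettingsSketch withNaNStrategy(String newNaNStrategy) {
        Objects.requireNonNull(newNaNStrategy, "newNaNStrategy");
        return new PercentileSettingsSketch(quantile, estimationType, newNaNStrategy);
    }

    double getQuantile() { return quantile; }
    String getEstimationType() { return estimationType; }
    String getNaNStrategy() { return nanStrategy; }

    public static void main(String[] args) {
        PercentileSettingsSketch base = new PercentileSettingsSketch(50.0);
        PercentileSettingsSketch custom = base.withEstimationType("R_7");
        System.out.println(base.getEstimationType() + " -> " + custom.getEstimationType());
    }
}
```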
• #### getNaNStrategy

`public NaNStrategy getNaNStrategy()`
Get the `NaN Handling` strategy used for computation.
Returns:
`NaN Handling` strategy set during construction
• #### withNaNStrategy

`public Percentile withNaNStrategy(NaNStrategy newNaNStrategy)`
Build a new instance similar to the current one except for the `NaN handling` strategy.

This method is intended to be used as part of a fluent-type builder pattern. Building finely tuned instances should be done as follows:

```   Percentile customized = new Percentile(quantile).
withEstimationType(estimationType).
withNaNStrategy(nanStrategy).
withKthSelector(kthSelector);
```

If any of the `withXxx` methods is omitted, the default value for the corresponding customization parameter will be used.

Parameters:
`newNaNStrategy` - NaN strategy for the new instance
Returns:
a new instance, with changed NaN handling strategy
Throws:
`NullArgumentException` - when newNaNStrategy is null
• #### getKthSelector

`public KthSelector getKthSelector()`
Get the `kthSelector` used for computation.
Returns:
the `kthSelector` set
• #### getPivotingStrategy

`public PivotingStrategy getPivotingStrategy()`
Get the `PivotingStrategy` used in KthSelector for computation.
Returns:
the pivoting strategy set
• #### withKthSelector

`public Percentile withKthSelector(KthSelector newKthSelector)`
Build a new instance similar to the current one except for the `kthSelector` instance specifically set.

This method is intended to be used as part of a fluent-type builder pattern. Building finely tuned instances should be done as follows:

```   Percentile customized = new Percentile(quantile).
withEstimationType(estimationType).
withNaNStrategy(nanStrategy).
withKthSelector(newKthSelector);
```

If any of the `withXxx` methods is omitted, the default value for the corresponding customization parameter will be used.

Parameters:
`newKthSelector` - KthSelector for the new instance
Returns:
a new instance, with changed KthSelector
Throws:
`NullArgumentException` - when newKthSelector is null

Copyright © 2016–2020 Hipparchus.org. All rights reserved.