PERCENTILE

Purpose:

Estimates one or more percentiles of a series.

Syntax:

PERCENTILE(series, p, method)

series - A series. The input population.

p - A real or series. The percentile as a decimal fraction where 0 <= p <= 1.

method - Optional. An integer, the percentile estimation method. Defaults to 5, Empirical Distribution Function with Averaging. For an input series of length N, the following methods are available:

1: Weighted Average at Np

2: Closest Observation

3: Empirical Distribution Function

4: Weighted Average at (N+1)p

5: Empirical Distribution Function with Averaging (default)

6: Empirical Distribution Function with Interpolation

Returns:

A scalar if p is a real, or an XY series if p is a series: the estimated percentiles.

 

(y, p) = percentile(series, p, method) returns the percentile values y and the decimal fractions p as separate series.

Example:

p1 = percentile({1, 2, 3, 4, 5}, 0.9)

p2 = percentile({1, 2, 3, 4, 5}, 0.9, 6)

 

p1 == 5

p2 == 4.6

 

The 90th percentile of the series is calculated using two methods.

 

p1 is calculated using the default Empirical Distribution Function with Averaging method. This is the default method used by SAS and other statistical packages.

 

p2 is calculated using the Empirical Distribution Function with Interpolation method, the method used by Excel.
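The arithmetic behind these two results can be traced by hand. The following is a sketch that assumes Method 5 takes the next ordered value when the fractional part g is nonzero and that Method 6 interpolates linearly between neighboring ordered values. The formula images in the Remarks remain the authoritative definitions.

```latex
\begin{aligned}
\text{Method 5:}\quad & Np = 5(0.9) = 4.5, \quad j = 4,\ g = 0.5 > 0
  \;\Rightarrow\; y = x_{j+1} = x_5 = 5 \\
\text{Method 6:}\quad & (N-1)p + 1 = 4(0.9) + 1 = 4.6, \quad j = 4,\ g = 0.6 \\
& y = (1-g)\,x_j + g\,x_{j+1} = 0.4(4) + 0.6(5) = 4.6
\end{aligned}
```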

Example:

W1: gnorm(1000,1)

W2: percentile(w1, 0.01..0.01..0.99)

 

W2 is a 99-point XY series that estimates the percentile values of W1 from 1% to 99% in steps of 1%.
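For readers who want to reproduce the shape of this result outside the worksheet, here is a rough Python sketch. It assumes W1 holds 1000 normally distributed random samples (mean 0 and standard deviation 1 are assumptions), and it uses the standard library's statistics.quantiles with the 'inclusive' rule, which interpolates at position (N-1)p+1 and so corresponds to method 6 rather than the default method 5. The names w1, w2, p and y are illustrative only.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Stand-in for W1: 1000 normally distributed samples (assumed mean 0, std 1).
w1 = [random.gauss(0.0, 1.0) for _ in range(1000)]

# 99 cut points at 1%, 2%, ..., 99%. The 'inclusive' rule interpolates at
# position (N-1)p + 1, which matches method 6, not the default method 5.
y = statistics.quantiles(w1, n=100, method="inclusive")
p = [k / 100 for k in range(1, 100)]

# Pair each fraction (X) with its estimate (Y), like the XY series in W2.
w2 = list(zip(p, y))
print(len(w2))   # 99
print(w2[49])    # the pair for p = 0.5, i.e. the median estimate
```

Because the estimates are taken at increasing fractions, the Y values are non-decreasing, and the entry for p = 0.5 equals the median of the samples.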

Remarks:

The percentile function estimates the value below which a given proportion of the data falls. Given p as a decimal fraction between 0 and 1, the 100p-th percentile is a value such that at most 100p% of the observations are less than this value and at most 100(1 - p)% are greater.

 

Thus:

 

The 1st percentile cuts off the lowest 1% of the data.

 

The 98th percentile cuts off the lowest 98% of the data.

 

The 25th percentile is the first quartile.

 

The 50th percentile is the median.
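The defining property above (at most 100p% of the observations below the value and at most 100(1 - p)% above it) can be checked directly. The following Python sketch is illustrative only; the function name satisfies_percentile_property is hypothetical and is not part of PERCENTILE.

```python
import statistics

def satisfies_percentile_property(values, y, p):
    """Return True if y is a valid 100p-th percentile of values."""
    n = len(values)
    below = sum(1 for v in values if v < y)   # observations less than y
    above = sum(1 for v in values if v > y)   # observations greater than y
    return below <= p * n and above <= (1 - p) * n

data = [1, 2, 3, 4, 5]

# The median (3) has 40% of the data below it and 40% above it.
print(satisfies_percentile_property(data, statistics.median(data), 0.5))  # True

# The value 5 has 80% of the data below it, so it is not a 50th percentile.
print(satisfies_percentile_property(data, 5, 0.5))  # False
```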

 

The method parameter specifies the procedure used to compute the percentiles. Let N be the number of non-missing values in the series, and let x1, x2, ..., xN represent the ordered values such that x1 is the smallest value and xN is the largest value. For the percentile p, given as a decimal fraction between 0 and 1, the result y is computed for each method as follows:

 

Method 1, Weighted Average at x[Np]

 

image\pcnt02.gif

 

where j is the integer part of Np and g the fractional part.
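Since the formula is given only as an image, the following Python sketch shows one plausible reading of Method 1. It assumes the SAS-style rule y = (1 - g)*x[j] + g*x[j+1] with Np = j + g. The function name and the clamping of out-of-range indices to the ends of the data are assumptions for illustration, not the actual implementation.

```python
import math

def percentile_method1(values, p):
    """Weighted average at x[Np]: y = (1 - g)*x[j] + g*x[j+1], Np = j + g."""
    x = sorted(values)        # x[0] holds x1, the smallest value
    n = len(x)
    j = math.floor(n * p)     # integer part of Np
    g = n * p - j             # fractional part of Np

    def at(i):
        # Assumption: 1-based index, clamped to the range 1..N.
        return x[min(max(i, 1), n) - 1]

    return (1 - g) * at(j) + g * at(j + 1)

print(percentile_method1([1, 2, 3, 4, 5], 0.9))  # 4.5
```

Here Np = 4.5, so j = 4 and g = 0.5, and the result lies halfway between x4 = 4 and x5 = 5.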

 

 

Method 2, Closest Value to (N-1)p+1

 

image\pcnt03.gif

 

where j is the integer part of (N-1)p+1 and g the fractional part.
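A hedged Python sketch of a closest-observation rule based on (N-1)p+1 follows. The text does not say how a tie at g = 0.5 is broken, so the sketch assumes the even-numbered observation is chosen, as in the SAS closest-observation definition. That tie rule, the function name, and the index clamping are all assumptions.

```python
import math

def percentile_method2(values, p):
    """Pick the ordered value whose index is closest to (N-1)p + 1."""
    x = sorted(values)        # x[0] holds x1, the smallest value
    n = len(x)
    h = (n - 1) * p + 1       # target position, 1-based
    j = math.floor(h)         # integer part
    g = h - j                 # fractional part

    if g < 0.5:
        i = j                 # closer to x[j]
    elif g > 0.5:
        i = j + 1             # closer to x[j+1]
    else:
        # Assumed tie rule: choose the even-numbered observation.
        i = j if j % 2 == 0 else j + 1

    i = min(max(i, 1), n)     # assumption: clamp to the range 1..N
    return x[i - 1]

print(percentile_method2([1, 2, 3, 4, 5], 0.9))  # 5
```

For the five-point series, (N-1)p+1 is about 4.6, which is closer to index 5 than to index 4, so x5 = 5 is returned.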

 

 

Method 3, Empirical Distribution Function

 

image\pcnt04.gif

 

where j is the integer part of Np and g the fractional part.
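As an illustration of Method 3, the Python sketch below assumes the SAS-style empirical distribution rule: y = x[j] when g = 0, otherwise y = x[j+1]. The function name and the handling of the p = 0 edge case are assumptions. Note that the g == 0 test is sensitive to floating-point rounding of Np.

```python
import math

def percentile_method3(values, p):
    """Empirical distribution function: x[j] if g == 0, else x[j+1]."""
    x = sorted(values)        # x[0] holds x1, the smallest value
    n = len(x)
    j = math.floor(n * p)     # integer part of Np
    g = n * p - j             # fractional part of Np

    i = j if g == 0 else j + 1
    i = min(max(i, 1), n)     # assumption: clamp to the range 1..N
    return x[i - 1]

print(percentile_method3([1, 2, 3, 4, 5], 0.9))   # 5
print(percentile_method3([10, 20, 30, 40], 0.5))  # 20
```

In the second call Np = 2 exactly, so g = 0 and the result is x2 without any averaging.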

 

 

Method 4, Weighted Average at x[(N+1)p]

 

image\pcnt05.gif

 

where j is the integer part of (N+1)p and g the fractional part.
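Method 4 can be sketched the same way as a weighted average, this time positioned at (N+1)p. The Python below assumes y = (1 - g)*x[j] + g*x[j+1] with (N+1)p = j + g, and assumes that indices beyond either end are clamped (x[N+1] is treated as x[N]). The function name is hypothetical.

```python
import math

def percentile_method4(values, p):
    """Weighted average at x[(N+1)p]: y = (1 - g)*x[j] + g*x[j+1]."""
    x = sorted(values)        # x[0] holds x1, the smallest value
    n = len(x)
    h = (n + 1) * p           # target position, 1-based
    j = math.floor(h)         # integer part of (N+1)p
    g = h - j                 # fractional part of (N+1)p

    def at(i):
        # Assumption: 1-based index, clamped to the range 1..N.
        return x[min(max(i, 1), n) - 1]

    return (1 - g) * at(j) + g * at(j + 1)

print(percentile_method4([10, 20, 30, 40], 0.5))  # 25.0
```

With N = 4 and p = 0.5, (N+1)p = 2.5, so the result is midway between x2 = 20 and x3 = 30.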

 

 

Method 5, Empirical Distribution Function with Averaging (default)

 

image\pcnt06.gif

 

where j is the integer part of Np and g the fractional part.
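A minimal Python sketch of the default method follows, assuming the SAS-style rule: y = (x[j] + x[j+1])/2 when g = 0, otherwise y = x[j+1]. It reproduces the value 5 from the first example. The function name and the end clamping are assumptions, and the g == 0 test is sensitive to floating-point rounding of Np.

```python
import math

def percentile_method5(values, p):
    """EDF with averaging: mean of x[j], x[j+1] if g == 0, else x[j+1]."""
    x = sorted(values)        # x[0] holds x1, the smallest value
    n = len(x)
    j = math.floor(n * p)     # integer part of Np
    g = n * p - j             # fractional part of Np

    def at(i):
        # Assumption: 1-based index, clamped to the range 1..N.
        return x[min(max(i, 1), n) - 1]

    if g == 0:
        return (at(j) + at(j + 1)) / 2
    return at(j + 1)

print(percentile_method5([1, 2, 3, 4, 5], 0.9))   # 5
print(percentile_method5([10, 20, 30, 40], 0.5))  # 25.0
```

The second call shows the averaging case: Np = 2 exactly, so the result is the mean of x2 and x3.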

 

 

Method 6, Empirical Distribution Function with Interpolation

 

image\pcnt07.gif

 

where j is the integer part of (N-1)p+1 and g the fractional part.
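The interpolation method can be sketched in Python as a linear interpolation at position (N-1)p+1, assuming y = (1 - g)*x[j] + g*x[j+1]. This reading is consistent with the value 4.6 given in the first example, but the function name and the end clamping are assumptions.

```python
import math

def percentile_method6(values, p):
    """EDF with interpolation: y = (1 - g)*x[j] + g*x[j+1], (N-1)p+1 = j + g."""
    x = sorted(values)        # x[0] holds x1, the smallest value
    n = len(x)
    h = (n - 1) * p + 1       # target position, 1-based
    j = math.floor(h)         # integer part of (N-1)p+1
    g = h - j                 # fractional part of (N-1)p+1

    def at(i):
        # Assumption: 1-based index, clamped to the range 1..N.
        return x[min(max(i, 1), n) - 1]

    return (1 - g) * at(j) + g * at(j + 1)

# Approximately 4.6, up to floating-point rounding.
print(percentile_method6([1, 2, 3, 4, 5], 0.9))
```

Unlike the Np-based methods, this rule maps p = 0 to x1 and p = 1 to xN exactly, so the estimate always stays within the range of the data without needing the clamp.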

 

 

The default method is 5, Empirical Distribution Function with Averaging.

 

The method parameters follow the format available with SAS.

 

Excel and S-Plus use method 6.

 

Minitab and SPSS use method 4.

 

The percentile p is specified as a decimal fraction such that 0 <= p <= 1.0 where 1.0 represents 100 percent.

 

If p is a series, PERCENTILE returns an XY series where the X values are the input decimal fractions. Use

 

(y, p) = percentile(s, p, method) 

 

to return the percentile estimates y as a separate interval series.

See Also:

HISTOGRAM

INVPROBN

MEAN

MEDIAN

PROBN

PDFNORM

XCONF