Replacing Outliers

To replace outliers with a specific value, a binary series gives a quick and efficient result. The expression (W1 > 10.0) yields a series which is 1.0 wherever W1 is greater than 10.0, and 0.0 elsewhere. Multiply this by the desired replacement value:


 (W1 > 10.0) * -1.5


to obtain a series which is -1.5 at each replacement location and 0.0 elsewhere.


This is close, but it is only half of the solution. The remaining part of the problem is to retain those points in W1 that are not outliers (i.e. W1 <= 10.0).


 (not(W1 > 10.0) * W1)


returns a series that is the same as W1 everywhere W1 does not exceed 10.0, and 0.0 elsewhere. Adding the two expressions together produces the desired result:


 ((W1 > 10.0) * (-1.5)) + (not(W1 > 10.0) * W1)
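
For readers who want to check the arithmetic outside DADiSP, the same point-by-point sum can be sketched in Python. This is an illustration only, not SPL; the list w1 and its values are made up for the example.

```python
# Hypothetical sample data standing in for W1.
w1 = [1.0, 12.0, 3.5, 10.0, 25.0]

# Binary series: 1.0 where the point exceeds 10.0, 0.0 elsewhere.
mask = [1.0 if y > 10.0 else 0.0 for y in w1]

# Replacement term plus retained term, evaluated point by point.
result = [m * -1.5 + (1.0 - m) * y for m, y in zip(mask, w1)]
print(result)
```

Note that 10.0 itself is retained, because the condition is strictly greater than 10.0.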


To generalize with a macro:


#define replace(s, cond, val) (((cond)*(val)) + ((not(cond))*(s)))


where s is the input series, cond is the condition for replacement, and val is the desired replacement value.
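
A rough Python analogue of this macro might look like the sketch below. The function name mirrors the macro, but it is a hypothetical helper written for this example, and it assumes cond is a list of True/False (or 1/0) values with the same length as s.

```python
def replace(s, cond, val):
    """Hypothetical Python analogue of the replace macro."""
    # cond holds True (1) where a point should be replaced, False (0) elsewhere.
    return [c * val + (1 - c) * y for c, y in zip(cond, s)]


w1 = [1.0, 12.0, 3.5, 10.0, 25.0]  # made-up sample data
cond = [y > 10.0 for y in w1]      # condition for replacement
result = replace(w1, cond, -1.5)
print(result)
```

Because the condition is passed in separately, the same helper works for any test, such as values below a floor or outside a band.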


What happens if NAVALUE is used for the replacement value? Try:


 ((W1 > 10.0) * (navalue)) + (not(W1 > 10.0) * W1)


Points greater than 10.0 are "dropped out" of the display of the series, but the x-axis is retained because the NAVALUE serves as a placeholder.
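
The placeholder idea can be sketched in Python using float("nan") as a stand-in for NAVALUE. The sample data is made up. The sketch selects values rather than multiplying by a mask, because in ordinary IEEE floating-point arithmetic 0.0 * NaN is NaN; nothing is assumed here about how DADiSP handles NAVALUE internally.

```python
import math

w1 = [1.0, 12.0, 3.5, 10.0, 25.0]  # hypothetical sample data
na = float("nan")                  # stand-in for NAVALUE

# Select rather than multiply, so retained points are not turned into NaN.
result = [na if y > 10.0 else y for y in w1]

# The series keeps its original length, so every x position is preserved.
kept = [y for y in result if not math.isnan(y)]
print(len(result), kept)
```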


Finally, to replace outliers with a linear interpolation of the surrounding points, consider the following:


 delete(W1, W1 > 10.0)


returns a series of the y-values which are not outliers, i.e. W1 <= 10.0.


 delete(xvals(w1), w1 > 10.0)


returns a series of the x-values where the y-values of W1 are not outliers.


 xy(delete(xvals(w1), w1 > 10.0), delete(w1, w1 > 10.0))


creates an XY plot where the outliers are removed from the x- and y-values. DADiSP will graphically "connect the dots" in the XY plot, so it appears to have interpolated between the points on either side of the outliers, but no extra points have been added. To insert the interpolated points, use:


 xyinterp(delete(xvals(w1), w1 > 10.0),delete(w1, w1 > 10.0))
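
The delete-then-interpolate idea can be sketched in Python as follows. This is not the implementation of XYINTERP. The function name interp_outliers, the dx spacing parameter, and the choice to hold the nearest kept value when an outlier sits at either end of the series are all assumptions made for the example.

```python
def interp_outliers(y, limit, dx=1.0):
    """Replace points above limit by linear interpolation of kept neighbours."""
    x = [i * dx for i in range(len(y))]
    # Equivalent of deleting the outliers from both the x- and y-values.
    keep = [(xi, yi) for xi, yi in zip(x, y) if not yi > limit]
    if not keep:
        raise ValueError("every point is an outlier")
    out = []
    j = 0  # index of the last kept point at or before the current x
    for xi in x:
        while j + 1 < len(keep) and keep[j + 1][0] <= xi:
            j += 1
        x0, y0 = keep[j]
        if xi <= x0 or j + 1 == len(keep):
            # At a kept point, or outside the kept range: hold the kept value.
            out.append(y0)
        else:
            # Between two kept points: interpolate linearly.
            x1, y1 = keep[j + 1]
            out.append(y0 + (y1 - y0) * (xi - x0) / (x1 - x0))
    return out


print(interp_outliers([1.0, 12.0, 3.0, 4.0, 25.0, 30.0, 7.0], 10.0))
```

The output has the same number of points as the input, with each outlier replaced by a value on the straight line joining its nearest retained neighbours.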


The OUTLIER function combines these ideas into a single, simple outlier removal function.