PEARSON

Purpose:

Calculates Pearson's Linear Correlation Coefficient.

Syntax:

PEARSON(x, y)

(r, p) = PEARSON(x, y)

x - An input series.

y - An input series.

Alternate Syntax:

 

PEARSON(A)

(r, p) = PEARSON(A)

A - A matrix.

Returns:

A real, the correlation coefficient.

 

(r, p) = PEARSON(x, y) returns the correlation coefficient and the p-value for testing the null hypothesis that any correlation is due to chance.

 

For a matrix, PEARSON(A) returns an NxN square matrix where N is the number of columns in A.

 

(r, p) = PEARSON(A) returns the correlation matrix and the p-values.

Example:

W1: gsin(100, .01, 4)

W2: gsin(100, .01, 4, pi/3)

 

pearson(W1, W2)

 

returns: 0.5

Example:

pearson(W1, W1)

 

returns: 1.0

Example:

pearson(W1, W1/2)

 

returns: 1.0

Example:

pearson(W1, -W1)

 

returns: -1.0

Example:

pearson(gsin(100, 0.01, 2), gcos(100, 0.01, 2))

 

returns: -1.867950E-016, effectively zero.
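The sinusoid examples above can be reproduced outside the DSP environment. This NumPy sketch builds the series explicitly, assuming gsin(n, dt, f, phase) generates n points of sin(2*pi*f*t + phase) sampled at interval dt, consistent with the results shown:

```python
import numpy as np

# Approximate gsin(100, .01, 4): 100 points, 0.01 s spacing, 4 Hz
# (exactly 4 full periods, so sample correlations match the ideal values).
t = np.arange(100) * 0.01
w1 = np.sin(2 * np.pi * 4 * t)                # gsin(100, .01, 4)
w2 = np.sin(2 * np.pi * 4 * t + np.pi / 3)    # gsin(100, .01, 4, pi/3)

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length series."""
    return np.corrcoef(x, y)[0, 1]

print(pearson_r(w1, w2))    # ~0.5, i.e. cos(pi/3)
print(pearson_r(w1, w1))    # 1.0
print(pearson_r(w1, -w1))   # -1.0
print(pearson_r(w1, np.cos(2 * np.pi * 4 * t)))   # ~0: sin and cos are orthogonal
```

Two sinusoids of the same frequency, correlated over a whole number of periods, have correlation cos(phase difference), which is why the pi/3 example yields 0.5.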

Example:

W1: cumsum(gnorm(10, 1))

W2: cumsum(gnorm(10, 1))

W3: cumsum(gnorm(10, 1))

 

W4: {pearson(w1, w1), pearson(w1, w2), pearson(w1, w3)};table

W5: {pearson(w2, w1), pearson(w2, w2), pearson(w2, w3)};table

W6: {pearson(w3, w1), pearson(w3, w2), pearson(w3, w3)};table

 

W7: ravel(w4..w6);table

W8: pearson(ravel(w1..w3))

W9: w7 - w8

 

W1, W2 and W3 contain shaped random noise.

W4 computes the linear correlation between W1 and W1, W2, W3.

W5 computes the linear correlation between W2 and W1, W2, W3.

W6 computes the linear correlation between W3 and W1, W2, W3.

W7 combines the results into a 3x3 matrix.

W8 computes the correlation coefficients from a matrix where the columns are W1, W2 and W3. The result is a 3x3 matrix.

W9 is all zeros, indicating the two methods produce identical results.
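The pairwise construction of W4..W7 and the single matrix call of W8 can be sketched in NumPy. Note that np.corrcoef treats each row as one variable, so the three series are stacked as rows rather than ravelled into columns:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three random-walk series, analogous to W1..W3 = cumsum(gnorm(10, 1)).
series = [np.cumsum(rng.standard_normal(10)) for _ in range(3)]

# Pairwise construction (W4..W7): one coefficient at a time.
pairwise = np.array([[np.corrcoef(a, b)[0, 1] for b in series]
                     for a in series])

# Matrix construction (W8): one call on the stacked data.
matrix = np.corrcoef(np.vstack(series))

# W9 analogue: the two methods agree element by element.
print(np.max(np.abs(pairwise - matrix)))   # ~0
```

As in the worksheet example, the diagonal is all ones (each series is perfectly correlated with itself) and the matrix is symmetric.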

Example:

W1: randn(100, 1);ravel(w0, 3*w0 + 10, deriv(w0))

W2: pearson(w1)

W3: (r, p) = pearson(w1);p

 

Since column 2 is a linear transformation of column 1, we expect the correlation values W2[2, 1] and W2[1, 2] to be near unity. We expect the p-values W3[2, 1] and W3[1, 2] to be small, indicating that we can reject the null hypothesis that the correlation between column 1 and column 2 is due to chance.

 

Even though column 3 depends on column 1, the dependence is not linear, so the remaining off-diagonal entries of W2 will be small and the corresponding off-diagonal entries of W3 will be well above zero, indicating a weak correlation that is not statistically significant.
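A similar experiment can be sketched with SciPy, whose scipy.stats.pearsonr also returns an (r, p) pair; here deriv is approximated with np.gradient so all three columns keep the same length. This is an illustration, not the PEARSON implementation:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
c1 = rng.standard_normal(100)   # analogous to randn(100, 1)
c2 = 3 * c1 + 10                # exact linear transformation of column 1
c3 = np.gradient(c1)            # depends on column 1, but not pointwise linearly

r12, p12 = pearsonr(c1, c2)
r13, p13 = pearsonr(c1, c3)

print(r12, p12)   # r12 ~ 1, p12 ~ 0: the correlation is significant
print(r13, p13)   # |r13| small; p13 generally much larger
```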

Remarks:

Pearson’s correlation coefficient for a population is defined as the covariance of two variables divided by the product of their standard deviations:

 

ρ = cov(X, Y) / (σX · σY)

 

By substituting the sample estimates of the covariance and standard deviations, the sample correlation coefficient computed by PEARSON becomes:

 

r = Σ (xᵢ - x̄)(yᵢ - ȳ) / √[ Σ (xᵢ - x̄)² · Σ (yᵢ - ȳ)² ]

 

where the arithmetic mean is defined as:

 

x̄ = (1/n) Σ xᵢ   (and similarly for ȳ)

 

PEARSON returns the degree of linear correlation between the two input series. The correlation values range from -1 to +1, where -1 indicates perfect negative linear correlation, +1 indicates perfect positive linear correlation and 0 indicates no linear correlation.
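The sample formula above translates directly into code. This sketch (assuming NumPy) computes r from the deviations about the means and checks it against np.corrcoef:

```python
import numpy as np

def pearson_r(x, y):
    """Sample Pearson r computed straight from the definition: the sum of
    products of deviations divided by the product of the root sums of
    squared deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx @ dy) / np.sqrt((dx @ dx) * (dy @ dy))

rng = np.random.default_rng(2)
x = rng.standard_normal(50)
y = 2 * x + rng.standard_normal(50)

print(pearson_r(x, y))            # matches np.corrcoef(x, y)[0, 1]
print(pearson_r(x, 3 * x + 10))   # exactly linear: 1.0
print(pearson_r(x, -x))           # perfect negative correlation: -1.0
```

Because any exact linear transformation leaves the deviation vectors parallel (or anti-parallel), the ratio collapses to +1 or -1, matching the W1/2 and -W1 examples above.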

 

The p-values test the null hypothesis that any correlation between the columns is due to chance. For example, if a particular p-value is smaller than a chosen significance level (e.g. 0.05 for a 95% confidence level), then the corresponding column correlation is considered significant and unlikely due to chance.

 

The p-values range from 0 to 1 where 0 indicates very strong evidence against the null hypothesis and 1 indicates no evidence against the null hypothesis.
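This help topic does not state how the p-value is computed, but the standard approach is a two-sided t-test on r with n - 2 degrees of freedom. The sketch below assumes that test and, to stay within the Python standard library, approximates the t distribution with a normal, which is accurate for large n:

```python
import math
from statistics import NormalDist

def pearson_pvalue(r, n):
    """Approximate two-sided p-value for H0: the observed correlation r
    over n points is due to chance.  Under H0 the statistic
    t = r * sqrt((n - 2) / (1 - r^2)) follows a t distribution with
    n - 2 degrees of freedom; a standard normal is used here as a
    large-n approximation (not necessarily PEARSON's exact method)."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * NormalDist().cdf(-abs(t))

print(pearson_pvalue(0.5, 100))    # strong correlation on 100 points: tiny p
print(pearson_pvalue(0.05, 100))   # weak correlation: p ~ 0.62
print(pearson_pvalue(0.0, 100))    # no correlation: p = 1.0
```

Note how the same r becomes more significant as n grows: a modest correlation over many points is harder to attribute to chance than the same correlation over a few points.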

 

PEARSON assumes x and y have the same number of points.

 

For a matrix, PEARSON returns an NxN square matrix where N is the number of columns in matrix A.

 

See CORRCOEF for a fast computation of the correlation matrix.

 

See LINREG2 to fit a line to X and Y values using the method of least squares.

See Also:

AUTOCOR

COLPAIRWISE

CORRCOEF

COVM

CROSSCOR

LINFIT

LINREG2

PFIT

TREND