CORRCOEF

Purpose:

Calculates the correlation matrix.

Syntax:

CORRCOEF(a1, a2)

(r, p) = CORRCOEF(a1, a2)

a1 - A series or array.

a2 - An optional second series or array.

Returns:

A square matrix, the correlation matrix.

 

(r, p) = CORRCOEF(a1) returns the correlation matrix r and the matrix of p-values p for testing the null hypothesis that the true correlation is zero, i.e. that any observed correlation is due to chance.

Example:

a = {{1, 5, 8},

     {3, 4, 9},

     {2, 6, 7}};

 

b = corrcoef(a);

 

b == {{ 1.0, -0.5,  0.5},

      {-0.5,  1.0, -1.0},

      { 0.5, -1.0,  1.0}}

 

Computes the correlation matrix where b[j, k] represents the correlation between columns j and k. Diagonal values are always 1 since each column correlates perfectly with itself.
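
As a cross-check of the matrix above, the same column-by-column computation can be sketched in plain Python. The helper name corr_matrix is hypothetical and chosen only for illustration; the sketch shows the standard Pearson formula applied to mean-removed columns, not the internal implementation of CORRCOEF.

```python
import math

def corr_matrix(rows):
    """Return the Pearson correlation matrix between the columns of a 2-D list."""
    n_cols = len(rows[0])
    # Split the row-oriented data into columns.
    cols = [[row[j] for row in rows] for j in range(n_cols)]
    # Remove the mean from each column.
    centered = []
    for col in cols:
        mean = sum(col) / len(col)
        centered.append([v - mean for v in col])

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    # Entry [j][k] is the correlation between columns j and k.
    return [[dot(cj, ck) / math.sqrt(dot(cj, cj) * dot(ck, ck))
             for ck in centered]
            for cj in centered]

a = [[1, 5, 8],
     [3, 4, 9],
     [2, 6, 7]]

b = corr_matrix(a)
for row in b:
    print(row)
```

The printed rows should agree with the matrix b shown in the example, including the unit diagonal.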

Example:

a = {{1, 5, 8},

     {3, 4, 9},

     {2, 6, 7}};

 

b = corrcoef(a);

c = pearson(a);

 

c == b

 

For a single array input, the correlation matrix is the same as Pearson's Linear Correlation Coefficient.

Example:

W1: randn(100, 1);ravel(w0, 3*w0 + 10, deriv(w0))

W2: corrcoef(w1)

W3: (r, p) = corrcoef(w1);p

 

Because column 2 is a linear transformation of column 1, the correlation coefficients W2[2,1] and W2[1,2] are expected to be close to unity. Likewise, the corresponding p-values W3[2,1] and W3[1,2] should be small, providing strong evidence against the null hypothesis that the observed correlation between columns 1 and 2 is due to chance.

 

Even though column 3 depends on column 1, the dependence is not linear, so the other off-diagonal members of W2 are expected to be small and the corresponding members of W3 noticeably larger than zero, indicating a weak correlation that is not statistically significant.
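
The difference between linear and non-linear dependence can be illustrated with a small deterministic sketch in Python. It does not reproduce the random worksheet example: the data are fixed, the helper pearson_r is a hypothetical name, and a squared column stands in for the non-linear dependence (it is an assumption made for illustration, not the DERIV column used above).

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(p * q for p, q in zip(dx, dy))
    sxx = sum(p * p for p in dx)
    syy = sum(q * q for q in dy)
    return sxy / math.sqrt(sxx * syy)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear = [3 * v + 10 for v in x]   # linear transformation of x
squared = [v * v for v in x]       # depends on x, but not linearly

r_linear = pearson_r(x, linear)    # expected to be 1 up to rounding
r_squared = pearson_r(x, squared)  # expected to be 0 up to rounding

print(r_linear, r_squared)
```

The squared column is completely determined by x, yet its linear correlation with x vanishes because the data are symmetric about zero.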

Remarks:

CORRCOEF computes the linear correlation coefficients between each pair of columns of an array. See PEARSON for a further discussion of the correlation matrix.

 

The mean is removed from each column before the correlation is computed.

 

The correlation values range from -1 to +1, where -1 indicates perfect negative linear correlation, +1 indicates perfect positive linear correlation and 0 indicates no linear correlation.

 

P-values range from 0 to 1 and quantify the strength of evidence against the null hypothesis. In correlation analysis, the null hypothesis asserts the true correlation is zero, meaning any observed correlation is due to chance. A p-value near 0 indicates strong evidence against the null hypothesis, suggesting the observed correlation is unlikely to be due to chance. A p-value near 1 implies weak or no evidence against the null, consistent with random noise. The p-value represents the probability of observing a correlation as extreme as (or more extreme than) the one calculated, assuming the null hypothesis is true. For example, if the p-value is below a chosen significance threshold (commonly 0.05 for a 95% confidence level), the correlation is considered statistically significant, i.e. it is unlikely to have occurred by chance alone.
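
The meaning of a p-value can be made concrete with a permutation test, sketched below in Python. This is a substitute technique chosen because it needs only the standard library; it is not claimed to be the method CORRCOEF uses to compute p. The names pearson_r and permutation_p_value, the permutation count, and the seed are all assumptions made for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(p * q for p, q in zip(dx, dy))
    return sxy / math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Estimate the two-sided p-value for the null hypothesis of zero correlation.

    Shuffling y destroys any association with x, so the shuffled correlations
    approximate what chance alone would produce.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count shuffles at least as extreme as the observed correlation.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

x = [float(i) for i in range(10)]
y_linear = [3 * v + 10 for v in x]                     # strongly related to x
y_weak = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]  # alternating signs

p_linear = permutation_p_value(x, y_linear)
p_weak = permutation_p_value(x, y_weak)
print(p_linear, p_weak)
```

The linear column should yield a p-value well below 0.05, while the alternating column should yield a large one, matching the interpretation described above.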

 

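The standard deviations of each column of matrix a cannot be recovered from the correlation matrix, whose diagonal is always 1. They can be calculated from the covariance matrix instead:

 

diag(sqrt(covm(a)))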

 

CORRCOEF produces the same result as PEARSON for a single array input but the computation is much faster due to array optimization.

 

For corrcoef(A, B) the result is identical to corrcoef(unravel(A), unravel(B)).
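
The two-input form can be sketched in Python as follows. The sketch assumes that unraveling concatenates the columns of each array into a single column and that the result is a 2x2 matrix; both points are assumptions about the behavior described above, and the helper names are hypothetical.

```python
import math

def unravel(matrix):
    """Concatenate the columns of a 2-D list into one flat list."""
    return [row[j] for j in range(len(matrix[0])) for row in matrix]

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(p * q for p, q in zip(dx, dy))
    return sxy / math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))

A = [[1, 5],
     [3, 4],
     [2, 6]]
B = [[2, 9],
     [7, 8],
     [4, 13]]

flat_a = unravel(A)   # [1, 3, 2, 5, 4, 6]
flat_b = unravel(B)   # [2, 7, 4, 9, 8, 13]

# With two single-column inputs the correlation matrix is 2x2 (assumed shape).
r_ab = pearson_r(flat_a, flat_b)
result = [[1.0, r_ab],
          [r_ab, 1.0]]
print(result)
```

Because both arrays are flattened in the same order, the pairing of elements is preserved and only one distinct off-diagonal value appears.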

 

See COVM to compute the covariance matrix.
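
The relationship between the covariance matrix and the correlation matrix can be sketched in Python. The helper cov_matrix is a hypothetical name, and the n-1 (sample) normalization is an assumption, not a statement about how COVM normalizes; the resulting correlations do not depend on that choice because the factor cancels.

```python
import math

def cov_matrix(rows):
    """Sample covariance matrix between the columns of a 2-D list (n-1 divisor)."""
    n = len(rows)
    n_cols = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(n_cols)]
    centered = []
    for col in cols:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    return [[sum(x * y for x, y in zip(cj, ck)) / (n - 1)
             for ck in centered]
            for cj in centered]

a = [[1, 10, 8],
     [3, 8, 9],
     [2, 12, 7]]

cov = cov_matrix(a)

# Column standard deviations are the square roots of the covariance diagonal.
std = [math.sqrt(cov[j][j]) for j in range(len(cov))]

# Dividing each covariance by the product of the two standard deviations
# gives the correlation, which is why the diagonal becomes 1.
corr = [[cov[j][k] / (std[j] * std[k]) for k in range(len(cov))]
        for j in range(len(cov))]

print(std)
print(corr)
```

Scaling a column changes its covariance and standard deviation but leaves its correlations unchanged, which is the practical reason to prefer the correlation matrix when columns have different units.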

See Also:

*^ (Matrix Multiply)

COLPAIRWISE

COLSTDEV

COVM

PEARSON