The Correlation Coefficient

The degree of correlation between two variables is measured by the correlation coefficient, also called the coefficient of correlation. For population data, the correlation coefficient is denoted by $$\rho $$. The joint variation of $$X$$ and $$Y$$ is measured by the covariance of $$X$$ and $$Y$$. The covariance of $$X$$ and $$Y$$, denoted by $$Cov\left( {X,Y} \right)$$, is defined as:

\[Cov\left( {X,Y} \right) = E\left\{ {\left[ {X - E\left( X \right)} \right]\left[ {Y - E\left( Y \right)} \right]} \right\}\]

The covariance $$Cov\left( {X,Y} \right)$$ may be positive, negative or zero. Its units are the product of the units in which $$X$$ and $$Y$$ are measured. When $$Cov\left( {X,Y} \right)$$ is divided by the product of the standard deviations $${\sigma _X}$$ and $${\sigma _Y}$$, we get the correlation coefficient $$\rho $$. Thus $$\rho = \frac{{Cov\left( {X,Y} \right)}}{{{\sigma _X}{\sigma _Y}}}$$. Because the units cancel in this ratio, $$\rho $$ is free of the units of measurement.
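The effect of dividing by $${\sigma _X}{\sigma _Y}$$ can be checked numerically. The following is a minimal sketch, not a definitive implementation: it assumes a small finite data set is treated as the whole population with equally likely values, so that each expectation becomes a plain average. The function names `population_cov` and `population_rho` and the height and weight figures are made up for illustration.

```python
from math import sqrt


def population_cov(xs, ys):
    """Cov(X, Y) = E{[X - E(X)][Y - E(Y)]}, with E taken as a plain average.

    Assumption: the lists are the entire population, each pair equally likely.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def population_rho(xs, ys):
    """rho = Cov(X, Y) / (sigma_X * sigma_Y)."""
    sigma_x = sqrt(population_cov(xs, xs))  # Cov(X, X) is the variance of X
    sigma_y = sqrt(population_cov(ys, ys))
    return population_cov(xs, ys) / (sigma_x * sigma_y)


# Hypothetical data: the same heights expressed in metres and in centimetres.
heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weights_kg = [52.0, 58.0, 69.0, 72.0, 85.0]
heights_cm = [100 * h for h in heights_m]

cov_m = population_cov(heights_m, weights_kg)    # units: metre * kilogram
cov_cm = population_cov(heights_cm, weights_kg)  # units: centimetre * kilogram
rho_m = population_rho(heights_m, weights_kg)
rho_cm = population_rho(heights_cm, weights_kg)

print(cov_m, cov_cm)  # the covariance changes with the unit of height
print(rho_m, rho_cm)  # the correlation coefficient does not
```

Under these assumptions, changing the unit of height from metres to centimetres multiplies the covariance by 100 but leaves $$\rho $$ unchanged.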

The coefficient $$\rho $$ is a pure number and lies between $$ - 1$$ and $$ + 1$$. If $$\rho = \pm 1$$, the correlation is called perfect: $$\rho = + 1$$ is perfect positive correlation and $$\rho = - 1$$ is perfect negative correlation. If $$X$$ and $$Y$$ are independent, then $$\rho = 0$$. The converse does not hold in general, because $$\rho = 0$$ only rules out a linear relation between $$X$$ and $$Y$$. For sample data, the correlation coefficient, denoted by “$$r$$”, is a measure of the strength of the linear relation between the $$X$$ and $$Y$$ variables. Like $$\rho $$, “$$r$$” is a pure number and lies between $$ - 1$$ and $$ + 1$$. It is computed with Karl Pearson’s coefficient of correlation:

\[r = \frac{{\sum \left( {X - \overline X } \right)\left( {Y - \overline Y } \right)}}{{\sqrt {\sum {{\left( {X - \overline X } \right)}^2}\sum {{\left( {Y - \overline Y } \right)}^2}} }}\]
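
The formula for $$r$$ translates almost term by term into code. The sketch below is one possible plain-Python rendering, assuming the sample is given as two equal-length lists. The function name `pearson_r` and the sample values are illustrative only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient r from paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / sqrt(s_xx * s_yy)


x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2 * v + 1 for v in x])     # Y rises linearly with X
r_down = pearson_r(x, [10 - 3 * v for v in x])  # Y falls linearly with X
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # Y = X squared

print(r_up, r_down, r_curve)
```

The first two samples lie exactly on straight lines, so they illustrate perfect positive and perfect negative correlation. In the third sample $$Y$$ is completely determined by $$X$$, yet the deviation products cancel in the numerator, which illustrates that $$r$$ measures only the linear relation.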