
Relationship

Measures of Association

There are several Measures of Association.

Correlational Measures

There are three key Correlational Measures we will cover here.

Covariance

The Covariance is a non-standardized measure of relationship, meaning its size depends on the units of the variables involved. It therefore cannot be used to compare the relationship of two variables to the relationship of two other variables.

What is Covariance then?

Calculating Covariance

The definitional formula for calculating the covariance of two sample variables (X, Y) is:

\(COV_{XY} = \frac{\sum{(X - \overline{X})(Y - \overline{Y})}}{n - 1}\)
This formula is very similar to the variance formula. For instance, if we replace every Y in the formula above with an X, we get the variance of X:
\(S_{X}^{2} = \frac{\sum{(X - \overline{X})(X - \overline{X})}}{n - 1} = \frac{\sum{(X - \overline{X})}^{2}}{n - 1}\)
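The definitional formula can be sketched in a few lines of Python. This is only an illustration: the function name `covariance` and the toy values are assumptions made for the example, not part of the original material. It also checks the point made above, that the covariance of X with itself is the variance of X.

```python
from statistics import variance


def covariance(xs, ys):
    """Sample covariance via the definitional formula (n - 1 denominator)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - 1)


# Toy values chosen only for illustration; they are not from the text.
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]

cov_xy = covariance(x, y)
var_x = covariance(x, x)  # swapping every Y for an X gives the variance of X

print(round(cov_xy, 4))                    # 3.6667
print(round(var_x, 4), variance(x))        # both are the sample variance of x
```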

Computational formula for Covariance

\(COV_{XY} = \frac{\sum{XY} - \frac{\sum{X}\sum{Y}}{n}}{n - 1}\)

Oil Example sample data: Covariance Calculation

Pair \(i\)   Barrels (X)         Costs (Y)           \(XY\)
1            159                 520                 82680
2            166                 570                 94620
3            176                 510                 89760
4            185                 560                 103600
5            191                 560                 106960
6            194                 530                 102820
7            199                 560                 111440
8            207                 580                 120060
9            216                 550                 118800
10           228                 560                 127680
Sums         \(\sum{X} = 1921\)   \(\sum{Y} = 5500\)   \(\sum{XY} = 1058420\)


\(n = 10\)

Example Calculation continued

Taking the sums and \(n\) from the previous slide, we can use the computational formula to complete the calculation of covariance.

\(COV_{XY} = \frac{\sum{XY} - \frac{\sum{X}\sum{Y}}{n}}{n - 1} = \frac{1058420 - \frac{(1921)(5500)}{10}}{10 - 1}\)
\(COV_{XY} = \frac{1058420 - 1056550}{9} = \frac{1870}{9} = 207.78\)
So, the covariance of X and Y is 207.78, which does not seem terribly meaningful on its own: its size depends on the units in which Barrels and Costs are measured.
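The same calculation can be reproduced with a short Python sketch that applies the computational formula to the Oil example data. The variable names are illustrative choices; the values are those listed in the table above.

```python
# Oil example sample data from the table above.
barrels = [159, 166, 176, 185, 191, 194, 199, 207, 216, 228]  # X
costs = [520, 570, 510, 560, 560, 530, 560, 580, 550, 560]    # Y

n = len(barrels)
sum_x = sum(barrels)
sum_y = sum(costs)
sum_xy = sum(x * y for x, y in zip(barrels, costs))

# Computational formula: (sum(XY) - sum(X) * sum(Y) / n) / (n - 1)
cov_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

print(sum_x, sum_y, sum_xy)  # 1921 5500 1058420
print(round(cov_xy, 2))      # 207.78
```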

Correlation

The Correlation (\(r\)) is a standardized measure of relationship, meaning it can be compared across multiple pairs of variables, regardless of the scale on which each variable is measured.
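One way to see what "standardized" means is to change the units of one variable and compare the two measures. The sketch below multiplies Costs by 100 (a hypothetical change of units, assumed only for illustration) and computes \(r\) as the covariance divided by the product of the two standard deviations, the formula given under Calculating Correlation. The function names are assumptions made for the example.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


barrels = [159, 166, 176, 185, 191, 194, 199, 207, 216, 228]
costs = [520, 570, 510, 560, 560, 530, 560, 580, 550, 560]
costs_scaled = [100 * y for y in costs]  # hypothetical change of units

cov_orig = covariance(barrels, costs)
cov_scaled = covariance(barrels, costs_scaled)
r_orig = correlation(barrels, costs)
r_scaled = correlation(barrels, costs_scaled)

print(round(cov_orig, 2), round(cov_scaled, 2))  # covariance grows 100-fold
print(round(r_orig, 3), round(r_scaled, 3))      # correlation is unchanged
```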

Interpretation of Correlation

Once calculated:

The size of \(r\) indicates the strength of the relationship and the sign (positive or negative) indicates the direction of the relationship.

Calculating Correlation

Calculating correlation is quite easy, once you have the covariance.

\(r_{XY} = \frac{COV_{XY}}{S_{X}S_{Y}}\)
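As a rough check of this formula on the Oil example data, the following Python sketch computes the covariance with the computational formula and the sample standard deviations with the standard library's `statistics.stdev` (which uses the \(n - 1\) denominator). The variable names are illustrative.

```python
from statistics import stdev

barrels = [159, 166, 176, 185, 191, 194, 199, 207, 216, 228]  # X
costs = [520, 570, 510, 560, 560, 530, 560, 580, 550, 560]    # Y

n = len(barrels)
sum_xy = sum(x * y for x, y in zip(barrels, costs))

# Covariance from the computational formula.
cov_xy = (sum_xy - sum(barrels) * sum(costs) / n) / (n - 1)

# Sample standard deviations (n - 1 denominator).
s_x = stdev(barrels)
s_y = stdev(costs)

r_xy = cov_xy / (s_x * s_y)
print(round(r_xy, 3))  # 0.424
```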

What does that mean?

The correlation between Barrels and Costs is 0.424...so what?

Taking Correlation a step further.

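One very good way of helping yourself to interpret correlation is to square it: \(r^2\) is the proportion of variance the two variables share. Here \(r^2 = .424^2 = .1798\), so Barrels and Costs share roughly 18% of their variance.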

Adjusted Correlation

When sample sizes are small, as they are here (\(n = 10\)), the sample correlation will tend to overestimate the population correlation.

Adjusting our example correlation

Adjusting our Oil example correlation:

\(r_{adj} = \sqrt{1 - \frac{(1 - r^2)(n - 1)}{n - 2}} = \sqrt{1 - \frac{(1 - .424^2)(10 - 1)}{10 - 2}} = ...\)


\(\sqrt{1 - \frac{(1 - .1798)(9)}{8}} = \sqrt{1 - \frac{(.8202)(9)}{8}} = ...\)


\(\sqrt{1 - \frac{7.3818}{8}} = \sqrt{1 - .9227} = ...\)


\(\sqrt{.0773} = .2780\)
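The adjustment can be wrapped in a small Python function. This is a sketch under stated assumptions: the name `adjusted_r` is hypothetical, and returning 0.0 when the quantity under the square root is negative is a choice made here for illustration, not something the formula above specifies.

```python
from math import sqrt


def adjusted_r(r, n):
    """Adjusted correlation: sqrt(1 - (1 - r^2)(n - 1) / (n - 2))."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    inner = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # Assumption (not from the text): if the adjustment pushes the value
    # below zero, report 0.0 instead of taking the root of a negative number.
    return sqrt(inner) if inner > 0 else 0.0


r_adj = adjusted_r(0.424, 10)
print(round(r_adj, 3))  # 0.278
```

Because the code does not round intermediate results, its fourth decimal place can differ slightly from the hand calculation; both agree at .278.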

Additional Considerations with Measures of Relationship

Measures of relationship tell us something about whether or not two (or more) variables share variance.
They do NOT tell us what causes the relationship!
Nor do they tell us if one variable causes another!


jds0282 2010-10-04