Post-hoc

Post-hoc Testing

In exploratory research, you would test all possible combinations of groups to determine where the significant differences are located (typically only if the omnibus is significant).
- Most researchers do either planned comparisons (preferred) or post-hoc testing (exploratory), because the omnibus $F$ is not terribly informative on its own.
There are many types of post-hoc procedures; each has strengths and weaknesses, but some are better than others.
Some suggestions:
- If assumptions met: Tukey's HSD or REGW-Q.
- If unequal group sizes: Games-Howell.
- If unequal variances: Games-Howell.

More on $q$

When using $q$, we must first rank the means from smallest to largest.
- For our example: $\left\{11.50, 17.50, 22.75\right\}$
There are 3 means in this set, so $r = 3$, which is used to look up values in the $q$ table (the table linked below uses the familiar $k$ instead of $r$, but most use $r$).
http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html
Using the formula from above, we can solve for $q$:
$q = \frac{\overline{X}_{l} - \overline{X}_{s}}{\sqrt{\frac{MS_{w}}{n_{g}}}} = \frac{22.75 - 11.50}{\sqrt{\frac{2.083}{4}}} = 15.59$
So $q = 15.59$, and we can use $r = 3$, a significance level of 0.05, and $df_{w} = 9$ to find $q_{crit} = 3.95$ in the table.
- Given $q > q_{crit}$, we reject the null hypothesis and conclude there was a significant difference between the largest and smallest means.
  - This test can replace the omnibus $F$, but with less statistical power.
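The calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `studentized_range_q` is hypothetical, and the critical value 3.95 is the tabled value for $r = 3$ used in these notes, not something the code computes.

```python
import math

def studentized_range_q(mean_large, mean_small, ms_within, n_per_group):
    """Studentized range statistic q for the largest vs. smallest mean."""
    standard_error = math.sqrt(ms_within / n_per_group)
    return (mean_large - mean_small) / standard_error

# Rank the means from smallest to largest, then compare the two extremes.
means = sorted([17.50, 22.75, 11.50])
q = studentized_range_q(means[-1], means[0], ms_within=2.083, n_per_group=4)

Q_CRIT = 3.95  # assumed input: read from the q table, not computed here
print(f"q = {q:.2f}, reject H0: {q > Q_CRIT}")
```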

Tukey's Honestly Significant Differences test

Tukey's HSD is considered one of the original post-hoc tests and offers a fixed family-wise error rate at alpha (significance level).
Once we have ranked the means $\left\{11.50, 17.50, 22.75\right\}$ and found our critical value ( $q_{crit} = 3.95$ ) we can calculate the minimum difference between means needed to discover a significant difference.
$\overline{X}_{i} - \overline{X}_{j} = q_{crit} \sqrt{\frac{MS_{w}}{n_{g}}} = 3.95\sqrt{\frac{2.083}{4}} = 2.85$
So, if a mean difference ( $\overline{X}_{i} - \overline{X}_{j}$ ) is larger than 2.85 we would conclude there is a significant difference between those two means.
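As a rough sketch, the minimum significant difference can be computed directly from the formula above. The function name `tukey_min_difference` is hypothetical, and $q_{crit}$ is passed in as a value read from the table rather than derived in code.

```python
import math

def tukey_min_difference(q_crit, ms_within, n_per_group):
    """Smallest mean difference that Tukey's HSD declares significant."""
    return q_crit * math.sqrt(ms_within / n_per_group)

# Values from the running example: q_crit = 3.95, MS_w = 2.083, n_g = 4.
hsd = tukey_min_difference(q_crit=3.95, ms_within=2.083, n_per_group=4)
print(f"minimum significant difference = {hsd:.2f}")
```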

Tukey's HSD continued

With our minimum significant difference calculated at 2.85, we can compare each difference among our three means to it and determine which means differ significantly from the others.

	Green	Blue	Red
Minimum significant difference = 2.85	11.50	17.50	22.75
Green = 11.50	0	6.00	11.25
Blue = 17.50	-	0	5.25
Red = 22.75	-	-	0

So, we find a significant difference between each pair of means because each difference is greater than 2.85.
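The table of differences can be reproduced with a short loop over all pairs of groups. This is a minimal sketch: the helper `pairwise_decisions` is a hypothetical name, and 2.85 is the minimum significant difference taken from the text.

```python
from itertools import combinations

group_means = {"Green": 11.50, "Blue": 17.50, "Red": 22.75}
HSD = 2.85  # minimum significant difference from the text

def pairwise_decisions(means, min_diff):
    """Map each pair of groups to (absolute mean difference, significant?)."""
    results = {}
    for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
        diff = abs(mean_a - mean_b)
        results[(name_a, name_b)] = (diff, diff > min_diff)
    return results

decisions = pairwise_decisions(group_means, HSD)
for (a, b), (diff, significant) in decisions.items():
    print(f"{a} vs. {b}: difference = {diff:.2f}, significant = {significant}")
```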

Games-Howell Post-hoc Test

The Tukey test above assumes equal sample sizes for each group and equal variances among the groups.
- Real data often do not conform to what we would expect, and these assumptions are often violated.
The Games-Howell test provides a method for when either (or both) of the assumptions are not upheld.
- Essentially, this test incorporates the sample sizes and variances of each group being compared.
The $df$ and the critical difference between means ( $\overline{X}_{i} - \overline{X}_{j}$ ) are modified to include the sample sizes and variances.

Games-Howell modification of $df$

Until now we have consistently used $df_{w}$ for finding our critical values (e.g., $q_{crit}$ ).
With the Games-Howell test, we actually calculate $df'$ using each group's sample size and variance, such that for each pair of groups:
$df' = \frac{\left(\frac{S_{i}^2}{n_{i}}+\frac{S_{j}^2}{n_{j}}\right)^2}{\frac{\left(\frac{S_{i}^2}{n_{i}}\right)^2}{n_{i} - 1}+\frac{\left(\frac{S_{j}^2}{n_{j}}\right)^2}{n_{j} - 1}}$
where subscript $i$ and subscript $j$ identify descriptive statistics from each group being compared.
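The $df'$ formula is the Welch-Satterthwaite approximation, and it translates directly into code. The sketch below is illustrative only; `games_howell_df` is a hypothetical name, and the inputs in the sanity check are made-up values chosen so the result is easy to verify by hand.

```python
def games_howell_df(var_i, n_i, var_j, n_j):
    """Adjusted degrees of freedom df' for one pair of groups."""
    term_i = var_i / n_i  # S_i^2 / n_i
    term_j = var_j / n_j  # S_j^2 / n_j
    numerator = (term_i + term_j) ** 2
    denominator = term_i ** 2 / (n_i - 1) + term_j ** 2 / (n_j - 1)
    return numerator / denominator

# Sanity check with made-up inputs: equal variances and equal sample sizes
# give df' = n_i + n_j - 2.
print(games_howell_df(2.0, 4, 2.0, 4))
```

When the two variances differ, $df'$ falls below $n_{i} + n_{j} - 2$, which leads to a larger critical value.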

Example $df'$

As an example let's consider the Red and Green groups.

Red (i): $\overline{X} = 22.75, S^2 = 2.92, n = 4$
Green (j): $\overline{X} = 11.50, S^2 = 1.66, n = 4$

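$df' = \frac{\left(\frac{S_{i}^2}{n_{i}}+\frac{S_{j}^2}{n_{j}}\right)^2}{\frac{\left(\frac{S_{i}^2}{n_{i}}\right)^2}{n_{i} - 1}+\frac{\left(\frac{S_{j}^2}{n_{j}}\right)^2}{n_{j} - 1}} = \frac{\left(\frac{2.92}{4}+\frac{1.66}{4}\right)^2}{\frac{\left(\frac{2.92}{4}\right)^2}{4 - 1}+\frac{\left(\frac{1.66}{4}\right)^2}{4 - 1}} \approx 5.58$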

Pay careful attention to the fact that $df'$ is a symbol, not an operation.
- See the supplemental handout for an example of the complete, step-by-step calculation.
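For orientation, the intermediate steps for the Red vs. Green pair work out roughly as follows (a sketch computed from the values given above, rounded at each step, not a copy of the handout):
$\frac{S_{i}^2}{n_{i}} = \frac{2.92}{4} = 0.73, \quad \frac{S_{j}^2}{n_{j}} = \frac{1.66}{4} = 0.415$
$\left(0.73 + 0.415\right)^2 = 1.145^2 \approx 1.3110$
$\frac{0.73^2}{4 - 1}+\frac{0.415^2}{4 - 1} = \frac{0.5329}{3}+\frac{0.1722}{3} \approx 0.2350$
$df' \approx \frac{1.3110}{0.2350} \approx 5.58$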

Games-Howell difference between means

Minimum Significant Difference

So, given $r = 3$ , a significance level of 0.05, and now $df' \approx 5.58$ , we look to the $q$ table and find: $q_{crit} \approx 4.60$ (the tabled value for $df = 5$ ).
http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html
Where earlier we had:
$\overline{X}_{i} - \overline{X}_{j} = q_{crit} \sqrt{\frac{MS_{w}}{n_{g}}}$
Now, for each pair of means, we have:
$\overline{X}_{i} - \overline{X}_{j} = q_{crit} \sqrt{\frac{\frac{S_{i}^2}{n_{i}}+\frac{S_{j}^2}{n_{j}}}{2}}$
For the current Red vs. Green example:
$\overline{X}_{i} - \overline{X}_{j} = 4.60 \sqrt{\frac{\frac{2.92}{4}+\frac{1.66}{4}}{2}} = 3.48$

Red $\overline{X} = 22.75$ and Green $\overline{X} = 11.50$

Given our minimum significant difference (for this pair) of 3.48, we need an actual mean difference larger than this to conclude a significant difference was present.
Clearly, $\overline{X}_{i} - \overline{X}_{j} = 22.75 - 11.50 = 11.25$ is greater than 3.48, which indicates the Red group recalled significantly more words than the Green group.
Just remember, all of that calculating must be done for each pair of groups, because each group may have a different sample size and more than likely will have a different variance ( $S^2$ ).
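The per-pair minimum significant difference can be sketched as a small function that multiplies $q_{crit}$ by the pair's standard error. The name `games_howell_min_diff` is hypothetical, and $q_{crit}$ is an assumed input read from the $q$ table for that pair's $df'$, not something the code looks up.

```python
import math

def games_howell_min_diff(q_crit, var_i, n_i, var_j, n_j):
    """Minimum significant difference for one pair of groups."""
    return q_crit * math.sqrt((var_i / n_i + var_j / n_j) / 2)

# Red vs. Green, with q_crit = 4.60 taken from the q table (assumed input).
min_diff = games_howell_min_diff(4.60, var_i=2.92, n_i=4, var_j=1.66, n_j=4)
observed = 22.75 - 11.50
print(f"minimum = {min_diff:.2f}, observed = {observed:.2f}, "
      f"significant = {observed > min_diff}")
```

Each other pair would call the same function with its own variances, sample sizes, and tabled $q_{crit}$.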
The ANOVA and associated tests of individual means (planned comparisons or post-hoc tests) are vulnerable to violations of the homogeneity of variance assumption.
- Therefore, the Games-Howell test is highly recommended.

jds0282 2010-10-21