Module 7: Additions to Significance Testing
Effect Size
From a score to a distribution of scores
Effect Size
- Keep in mind, there are two types of effect sizes:
- Measures of Difference
- Allow comparison across samples and variables with differing variance.
- Equivalent to Z-scores
- Note sometimes there is no need to standardize (units of the scale have inherent meaning).
- Measures of Variance Accounted for.
- Amount of explained variance vs. total variance.
- Such as $r^2$ and $\eta^2$.
- For now, we will deal with Measures of Difference.
Effect Size
- Effect size is a standardized measure of difference (lack of overlap) between populations.
- Effect size is the magnitude of experimental effect.
- Effect size:
- Increases with greater differences between means,
- Decreases with greater standard deviations in the population but,
- Is not affected by sample size.
Calculating Effect Size
- There are many measures of effect size; for now we will be using Cohen's d:
$d = \frac{\mu_1 - \mu_2}{\sigma}$
- Notice that within this formula, we are removing the influence of the population standard deviation by dividing by $\sigma$.
- This produces the standardized effect size.
- Raw score effect size (i.e., without dividing by $\sigma$) is virtually useless.
- The standardization allows us to compare effect sizes obtained from different research studies.
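As a minimal sketch of the calculation described above, the following Python snippet computes Cohen's d for two population means and a common population standard deviation. The function name `cohens_d` and all numbers are hypothetical and chosen only for illustration; they are not the values used in the module's examples.

```python
def cohens_d(mean_1: float, mean_2: float, sigma: float) -> float:
    """Standardized difference between two means: (mean_1 - mean_2) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (mean_1 - mean_2) / sigma


# Hypothetical values (assumptions, not taken from the module).
raw_difference = 110.0 - 100.0
d = cohens_d(110.0, 100.0, 15.0)

# The same raw difference on a scale with twice the spread gives a smaller d.
d_wide = cohens_d(110.0, 100.0, 30.0)

print(f"raw difference = {raw_difference:.1f}")
print(f"d with sigma = 15: {d:.2f}")
print(f"d with sigma = 30: {d_wide:.2f}")
```

Note that sample size does not appear anywhere in the function, which matches the earlier point that effect size is not affected by sample size.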
Remember Scooby...?
- Population 1: Dogs on cartoons.
- Sample: Scooby, Pluto, and Goofy (
).
- Population 2: Dogs not on cartoons (
)
- Please note that this effect size is greater than 1. Cohen's d is not bounded by 1, although it will not always exceed it.
Remember Scooby...part 2?
- Population 1: Dogs on cartoons.
- Sample: Scooby, Underdog, and Scrappy (
).
- Population 2: Dogs not on cartoons (
)
- The effect size is not greater than 1, but this may still be considered a large effect size.
Interpreting Cohen's d
- One way: Effect size conventions suggested by Cohen.
- Small = 0.20
- Medium = 0.50
- Large = 0.80 and greater
- A better way: Rational judgment based on a thorough understanding of the phenomena and the previous literature.
- For example, an effect size of 0.90 may be small if previous findings report effect sizes as high as 1.90.
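For illustration only, Cohen's conventions can be written as a small lookup. The sketch below assumes the three values are treated as lower cut-offs (so 0.65 counts as "medium"), which is one common reading and not something the conventions themselves require. The function name `cohen_label` is hypothetical.

```python
def cohen_label(d: float) -> str:
    """Label an effect size using Cohen's conventions as lower cut-offs (an assumption)."""
    size = abs(d)  # the sign only indicates direction
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


for value in (0.10, 0.35, -0.50, 0.90):
    print(f"d = {value:+.2f}: {cohen_label(value)}")
```

A label produced this way is only a starting point; as noted above, judgment based on the previous literature may lead to a different conclusion for the same value.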
Statistical Power
- Definition: The probability that the study will produce a statistically significant result if the null hypothesis is false.
- The ability to detect a significant effect if one is present.
- Important to note: "if the null hypothesis is false".
- If you get a significant result when the null is true, then you have committed a Type I error.
- General equation for power:
- Power = 1 - beta ($1 - \beta$), where beta is the probability of a Type II error (failing to reject a false null hypothesis).
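One way to make the definition concrete is a small simulation: draw many samples from a population where the null hypothesis is false, run the same one-tailed Z test on each, and count how often the null is rejected. The sketch below does this under assumed values (null mean 50, true mean 54, population standard deviation 10, n = 25, significance level .05); none of these numbers come from the module, and the function name `estimate_power` is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

# Hypothetical scenario (assumed values, not the module's example).
MU_NULL = 50.0   # mean under the null hypothesis
MU_TRUE = 54.0   # actual population mean, so the null is false
SIGMA = 10.0     # known population standard deviation
N = 25           # sample size
ALPHA = 0.05     # one-tailed significance level


def estimate_power(trials: int = 4000, seed: int = 0) -> float:
    """Proportion of simulated studies that reject the (false) null hypothesis."""
    rng = random.Random(seed)
    std_error = SIGMA / math.sqrt(N)
    # Raw-score cutoff where the rejection region begins on the null distribution.
    cutoff = MU_NULL + NormalDist().inv_cdf(1 - ALPHA) * std_error
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(MU_TRUE, SIGMA) for _ in range(N)]
        if fmean(sample) > cutoff:
            rejections += 1
    return rejections / trials


estimated_power = estimate_power()
print(f"estimated power = {estimated_power:.3f}")
print(f"estimated beta  = {1 - estimated_power:.3f}")
```

The proportion of rejections estimates power, and the proportion of non-rejections estimates beta, so the two always sum to 1.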
Two kinds of Power analysis
- A priori Power
- Used when planning a study
- Used to determine the sample size necessary to achieve a specified power level.
- Post hoc Power
- Used when evaluating a study.
- What chance did a study have of finding significant results?
- Not really useful. If you do the a priori power analysis and conduct your study accordingly, then you did what you could.
- Saying afterward, "I would have found significance but did not have enough power or enough participants," is not going to impress anyone.
A priori Power
We can use all of the following steps to calculate how many participants we need for our study:
- Decide an acceptable level of power.
- Set the significance level (usually .05).
- Figure out the desirable or expected effect size.
- Calculate n needed to achieve significance with those levels of power and effect size.
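The steps above can be sketched for the simplest case, a one-sample Z test with known population standard deviation, where the required sample size is approximately $n = ((z_{1-\alpha} + z_{\text{power}}) / d)^2$. The function name `required_n`, the default power of .80, and the effect size of 0.50 are assumptions made for illustration; for a two-tailed test the formula ignores the negligible chance of rejecting in the wrong tail.

```python
import math
from statistics import NormalDist


def required_n(effect_size: float, power: float = 0.80,
               alpha: float = 0.05, tails: int = 1) -> int:
    """Approximate n for a one-sample Z test (normal approximation)."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / tails)  # critical Z for the chosen alpha
    z_power = z.inv_cdf(power)              # Z that leaves `power` below it
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical planning values (assumptions, not from the module).
n_one_tailed = required_n(0.50, power=0.80, alpha=0.05, tails=1)
n_two_tailed = required_n(0.50, power=0.80, alpha=0.05, tails=2)
print(f"one-tailed: n = {n_one_tailed}")
print(f"two-tailed: n = {n_two_tailed}")
```

Dedicated power analysis software handles other designs (for example, t tests) where this simple formula does not apply.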
A priori Effect Size?
- How can I figure out an effect size before I conduct my study?
- Several ways to do this:
- Base it on substantive knowledge.
- What you know about the situation and scale of measurement.
- Base it on previous literature / research.
- Use Cohen's conventions (not recommended).
An acceptable level of power?
Why not set power at .99?
- Practicalities.
- Cost of increasing power (usually done by increasing sample size) can be high.
- Increasing power decreases the Type II error rate (good), but if power is raised by relaxing the significance level, it also increases the Type I error rate (bad).
- Power has a range of 0 to 1 (it is a probability); with a higher number indicating greater power.
Influences on Power
Table 1: Influences on Power
Feature of Study | High Power | Low Power
Effect Size | larger | smaller
Sample Size | larger | smaller
Sig. Level | high (.10) | low (.001)
Tailed Test | 1-tailed | 2-tailed
Type of analysis | varies | varies
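The directions listed in Table 1 can be checked numerically. The sketch below computes power for a one-sample Z test and changes one feature at a time from a baseline; the function name `z_test_power` and every number are assumptions chosen for illustration, and the true effect is assumed to lie in the predicted direction.

```python
import math
from statistics import NormalDist


def z_test_power(effect_size: float, n: int,
                 alpha: float = 0.05, tails: int = 1) -> float:
    """Power of a one-sample Z test for a true standardized effect `effect_size`."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    z = NormalDist()
    shift = effect_size * math.sqrt(n)      # how far the alternative sits from the null, in SE units
    z_crit = z.inv_cdf(1 - alpha / tails)
    power = 1 - z.cdf(z_crit - shift)       # rejections in the predicted tail
    if tails == 2:
        power += z.cdf(-z_crit - shift)     # rare rejections in the opposite tail
    return power


# Hypothetical baseline: d = 0.50, n = 20, alpha = .05, 1-tailed.
baseline = z_test_power(0.50, 20)
larger_effect = z_test_power(0.80, 20)
larger_sample = z_test_power(0.50, 40)
lenient_alpha = z_test_power(0.50, 20, alpha=0.10)
strict_alpha = z_test_power(0.50, 20, alpha=0.001)
two_tailed = z_test_power(0.50, 20, tails=2)

for name, value in [("baseline", baseline), ("larger effect", larger_effect),
                    ("larger sample", larger_sample), ("alpha = .10", lenient_alpha),
                    ("alpha = .001", strict_alpha), ("2-tailed", two_tailed)]:
    print(f"{name:>14}: power = {value:.3f}")
```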
Carrying out the calculation of Power
The easiest way.
- When you have to implement power calculations, you can use specialist programs.
- Many websites offer free applications to conduct power analysis.
- G-power:
Calculating Power
The more difficult way.
- First, convert your critical value (the critical Z-score) into a raw score.
- This defines the point on your Null Distribution where the rejection region begins.
Null Distribution
Calculating Power continued
- Next, calculate the Z-score for a raw score of 114.22 on the Alternative Distribution.
- Finally, look in the Z-score table to identify beta and power.
Alternative Distribution
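The hand calculation can be followed step by step in code. This is a sketch under assumed values (null mean 50, alternative mean 54, population standard deviation 10, n = 25, one-tailed significance level .05); it does not reproduce the module's own example, so the raw-score cutoff here differs from the 114.22 mentioned above. `NormalDist().cdf` stands in for the Z-score table.

```python
import math
from statistics import NormalDist

# Hypothetical values (assumptions, not the module's example).
mu_null, mu_alt = 50.0, 54.0
sigma, n, alpha = 10.0, 25, 0.05

z = NormalDist()
std_error = sigma / math.sqrt(n)          # standard deviation of the distribution of means

# Step 1: convert the critical Z-score into a raw score on the null distribution.
z_crit = z.inv_cdf(1 - alpha)
raw_crit = mu_null + z_crit * std_error

# Step 2: find the Z-score of that raw score on the alternative distribution.
z_on_alt = (raw_crit - mu_alt) / std_error

# Step 3: beta is the area below that Z-score; power is the rest.
beta = z.cdf(z_on_alt)
power = 1 - beta

print(f"critical raw score   = {raw_crit:.2f}")
print(f"Z on alternative     = {z_on_alt:.3f}")
print(f"beta = {beta:.3f}, power = {power:.3f}")
```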
Practical Significance
Statistical vs. Practical Significance
- Statistical significance is determined by a dichotomous decision based on the p value.
- If $p < \alpha$, then reject the null hypothesis.
- If $p \geq \alpha$, then fail to reject the null hypothesis.
- Practical significance has more to do with the effect size and meaningfulness of the results in practical terms.
- If $p < \alpha$, reject the null; but if the effect size is very small, then your results are not likely to be influential or useful.
More on Practical Significance
- Keep in mind, any nonzero effect will be statistically significant with a large enough sample!
- However, the results may not be meaningful or useful.
- Remember Scooby and Friends...
- Example 1:
; reject the null because
- Example 2 (from Module 6 handout):
; fail to reject the null because
- Hypothetically, you could get a result like this:
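To illustrate how sample size alone can drive statistical significance, the sketch below holds a very small effect size fixed and computes the one-tailed p value of a one-sample Z test at two sample sizes. The effect size of 0.02, the sample sizes, and the function name `one_tailed_p` are all hypothetical, and the observed effect is assumed to equal the true effect exactly.

```python
import math
from statistics import NormalDist


def one_tailed_p(effect_size: float, n: int) -> float:
    """One-tailed p value of a one-sample Z test when the observed d equals `effect_size`."""
    z_stat = effect_size * math.sqrt(n)
    return NormalDist().cdf(-z_stat)  # upper-tail area beyond z_stat


tiny_effect = 0.02  # hypothetical, far below Cohen's "small"
p_small_sample = one_tailed_p(tiny_effect, 30)
p_huge_sample = one_tailed_p(tiny_effect, 100_000)

print(f"d = {tiny_effect}, n = 30:      p = {p_small_sample:.3f}")
print(f"d = {tiny_effect}, n = 100000:  p = {p_huge_sample:.2e}")
```

The effect size is identical in both cases; only the sample size, and therefore the p value, changes.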
Concluding Thoughts
- Always report as much information as you can; meaning:
- The calculated sample statistic
- The sample size
- The significance level (.05)
- The obtained p value
- The effect size (Cohen's d)
- The power
- Whether power was used a priori to calculate the sample size, and whether that sample size was obtained (G-power application).
- Remember, p values are not related to effect sizes.
- Use a priori power and the expected effect size to determine the minimum sample size before collecting the data, and then gather that amount of data.
- Post hoc power is virtually meaningless.
Summary of Module 7
Module 7 covered the following topics:
- Cohen's d effect size.
- Statistical Power.
- Practical significance.
Many of these topics will be revisited consistently in future modules.
This concludes Module 7
Next time Module 8.
- Next time we'll begin covering Introduction to t tests.
- Until next time; have a nice day.
This page was last updated on: October 12, 2010
This document was created in LaTeX and converted to HTML using LaTeX2HTML.