1. Mixed Effects
Models
Mixed effects models
refer to a variety of models which have as a key feature both fixed and
random effects.
The distinction between
fixed and random effects is a murky one. As pointed out by
Gelman
(2005), there are several, often conflicting, definitions of
fixed effects as well as definitions of random effects. Gelman offers a
fairly intuitive solution in the form of renaming fixed effects and
random effects and providing his own clear definitions of each. “We
define effects (or coefficients) in a multilevel model as constant
if they are identical for all groups in a population and varying
if they are allowed to differ from group to group” (Gelman, p. 21).
Other ways of thinking about fixed and random effects, which may be
useful but are not always consistent with one another or those given by
Gelman above, are discussed in the next paragraph.
Fixed
effects are ones in which the possible values of the variable are
fixed. Random effects refer to variables in which the set of potential
outcomes can change.
Stated in terms of populations, fixed effects
can be thought of as effects for which the population elements are
fixed. Cases or individuals do not move into or out of the population.
Random effects can be thought of as effects for which the population
elements are changing or can change (i.e. random variable). Cases or
individuals can and do move into and out of the population. Another way
of thinking about the distinction between fixed and random effects is
at the observation level. Fixed effects assume scores or observations
are independent while random effects assume some
type of relationship exists between some scores or
observations. For instance, it can be said that gender is a fixed
effect variable because we know all the values of that variable (male
& female) and those values are independent of one another
(mutually exclusive); and they (typically) do not change. A variable
such as high school class has random effects because we can only sample
some of the classes which exist; not to mention, students move into and
out of those classes each year.
There are many types of
random effects, such as repeated measures of the same individuals;
where the scores at each time of measure constitute samples from the
same participants among a virtually infinite (and possibly random)
number of times of measure from those participants. Another example of
a random effect can be seen in nested designs, where for example;
achievement scores of students are nested within classes and those
classes are nested within schools. That would be an example of a
hierarchical design structure with a random effect for scores nested
within classes and a second random effect for classes nested within
schools. The nested data structure assumes a relationship among groups
such that members of a class are thought to be similar to others in
their class in such a way as to distinguish them from members of other
classes and members of a school are thought to be similar to others in
their school in such a way as to distinguish them from members of other
schools. The example used below deals with a similar design which
focuses on multiple fixed effects and a single nested random effect.
2. Linear Mixed
Effects Models
Linear mixed effects
models simply model the fixed and random effects as having a linear
form. Similar to the General Linear Model, an outcome variable is
contributed to by additive fixed and random effects (as well as an
error term). Using the familiar notation, the linear mixed effect model
takes the form:
yij = β1x1ij
+ β2x2ij
… βnxnij
+
bi1z1ij + bi2z2ij
… binznij
+ εij
where yij
is the value of the outcome variable for a particular ij case, β1
through βn are the fixed effect coefficients
(like regression coefficients), x1ij
through xnij are the fixed
effect variables (predictors) for observation j in group i (usually the
first is reserved for the intercept/constant; x1ij
= 1), bi1 through
bin are the
random effect coefficients which are assumed to be multivariate
normally distributed, z1ij through znij
are the random effect variables (predictors), and εij
is the error for case j in group i where each group’s error is assumed
to be multivariate normally distributed.
3. Example Data
The example used for this
tutorial is fictional data where the interval scaled outcome variable
Extroversion (extro) is predicted by fixed effects for the interval
scaled predictor Openness to new experiences (open), the interval
scaled predictor Agreeableness (agree), the interval scaled predictor
Social engagement (social), and the nominal scaled predictor Class
(classRC); as well as the random (nested) effect of Class (classRC)
within School (schoolRC) as well as the random effect of School
(schoolRC). The data contains 1200 cases evenly distributed among 24
nested groups (4 classes within 6 schools). The data set is available
here.
4. Running
the Analysis
First, import the data (linked above) with the
member name LMM_dataRC
Then, take a look at the data running a few common
summary PROCs; starting with a PROC MEANS for the whole data.
PROC MEANS
DATA=LMM_dataRC;
RUN;
Next, take a look at the categorical variable
frequencies.
PROC FREQ
DATA=LMM_dataRC;
TABLES
classRC schoolRC;
RUN;
Now we can proceed to fit the model. Pay
particular attention to how the model is specified. The first line of
syntax simply tells SAS what procedure we are doing (mixed) and
specifies the data. The CLASS statement lists the classification
variables (categorical variables or factors). The MODEL statement
specifies the dependent variable and then the fixed effects variables.
The RANDOM statement lists the random effects variables; classRC nested
within schoolRC as well as schoolRC alone. Note, traditional
interactions can be specified in the model as either fixed or random
effects, using the *. For example, if we wanted to include a fixed
effect interaction in our model for the interaction between open and
agree, we would add the following term to the MODEL statement:
open*agree. Also notice the /solution option is specified for both the
fixed effects and random effects. The solution option for the fixed
effects, provides the beta coefficient for each predictor (and
associated standard errors, degrees of freedom, t-value, and p-values).
The solution option for the random effects, provides the estimated mean
values for each category of each random predictor (and associated
standard errors, degrees of freedom, t-value, and p-values). The
outpred option creates a new data file with the predicted values (based
on the model); which can often be useful or meaningful.
PROC MIXED
DATA =
LMM_dataRC;
CLASS class
school;
MODEL extro
= open agree social class/SOLUTION
OUTPRED=predicted;
RANDOM
classRC(schoolRC) schoolRC/SOLUTION;
RUN;
The output should be similar to what is displayed
below. The first page of which displays the Model Information, the
levels of each classification variable (categorical variables or
factors), the number of dimensions of the model, the number of
observations, and the iteration history.
The second page of the output starts with the
Random Effects estimates. These are variance estimates (with standard
errors, Wald Z test statistics, significance values, and confidence
intervals for the variance estimates). Recall the ubiquitous ANOVA
summary table where we generally have a total variance estimate (sums
of squares) at the bottom, then just above it we have a residual or
within groups variance estimate (sums of squares) and then we have each
treatment or between groups variance estimate (sums of squares). This
table is very much like that, but the total is not displayed and the
residual variance estimate is on top. So, we can quickly calculate the
total variance estimate: 95.1431 + 2.8831 + .9684 = 98.9946 then we can
create an R? type of effect
size to gauge the importance of each random effect by dividing the
effect's variance estimate by the total variance estimate to arrive at
a proportion of variance explained or accounted for by each random
effect. This is analogous to an Eta-squared (η?) in standard ANOVA or
an R? in regression; it is sometimes referred to (in the linear mixed
effects situation) as an Intraclass Correlation Coefficient (ICC,
Bartko, 1976; Bliese, 2009). For example, we find that the nested
effect of classRC within schoolRC is 2.8831 / 98.9946 = 0.0291238 or
simply stated, that random nested effect only accounts for 2.9% of the
variance of the random effects. However, the random effect for schoolRC
alone accounts for 95.1431 / 98.9946 = 0.9610938 or 96% of the variance
of the random effects. If none of the random effects account for a
meaningful amount of variance of the random effects (i.e. if residual
is larger than the other variance estimates), then the random effects
should be eliminated from the model and a standard General Linear Model
(or Generalized Linear Model) should be fitted (i.e., a model with only
the fixed effects).
The next
part of the second page contains the fit indices; generally I use and
recommend the Bayesian Information Criterion (BIC). The next part of
the output displays the Fixed Effects estimates.
It should be clear, this table and its
interpretation are exactly like one would expect from a traditional
ordinary least squares linear regression. One thing to note is the way
SPSS chooses the reference category for categorical variables. You may
have noticed we have been using the classRC and schoolRC variables
instead of the original class and school variables in the data set. The
RC variables contain the same information as the original variables,
they simply have been ReCoded or Reverse Coded so that the output here
will match the output produced using the lme4 package in the R
programming language. It is important to know that SAS (and SPSS)
automatically choose the category with the highest numerical value (or
the lowest alphabetical letter) as the reference category for
categorical variables. All packages I have used in the R programming
language choose the reference category in the more intuitive but
opposite way. In the lme4 package (and others I've used) in R, the
software automatically picks the lowest numerical value (or the
earliest alphabetically letter) as the reference category for
categorical variables. This has drastic implications for the intercept
estimate and more troubling, the predicted values produced by a model.
For example, if this same model is specified with the original
variables (not reverse coded) then the Fixed Effects intercept term is
63.049612; so you can imagine how much different the predicted values
would be in that model compared to this model where the intercept is
57.3839. Recall from multiple regression, the intercept
is
interpreted as the mean of the outcome (extro) when all the predictors
have a value of zero. The predictor estimates (coefficients or slopes)
are interpreted the same way as the coefficients from a traditional
regression. For instance, a one unit increase in the predictor Openness
to new experiences (open) corresponds to a 0.006130 increase in the
outcome Extroversion (extro). Likewise, a one unit increase in the
predictor Agreeableness (agree) corresponds to a 0.00774 decrease
in the outcome Extroversion (extro). Furthermore,
the categorical predictor classRC = 3 has a coefficient of 2.0548;
which means, the mean Extroversion score of the third group of classRC
(3) is 2.0548 higher than the mean Extroversion score of the last group
of classRC (4). ClassRC (4) was automatically coded as the reference
category. The last part of the output (which continues on to the next
page) contains the Random Effects Estimates.
The next page of output continues displaying the
Random Effects estimates and also shows the tests of the Fixed Effects.
As with most of the tutorials / pages within this
site, this page should not be considered an exhaustive review of the
topic covered and it should not be considered a substitute for a good
textbook.
References / Resources
Akaike, H.
(1974). A new look at the statistical model identification.
I.E.E.E.
Transactions on Automatic Control, AC 19, 716 – 723.
Available at:
Bartko, J.
J. (1976). On various intraclass correlation reliability coefficients.
Psychological Bulletin, 83, 762-765.
Bates, D.,
& Maechler, M. (2010). Package ‘lme4’. Reference manual for the
package, available at:
http://cran.r-project.org/web/packages/lme4/lme4.pdf
Bates, D. (2010). Linear
mixed model implementation in lme4. Package lme4 vignette, available
at:
http://cran.r-project.org/web/packages/lme4/vignettes/Implementation.pdf
Bates, D. (2010).
Computational methods for mixed models. Package lme4 vignette,
available at:
http://cran.r-project.org/web/packages/lme4/vignettes/Theory.pdf
Bates, D. (2010).
Penalized least squares versus generalized least squares
representations of linear mixed models. Package lme4 vignette,
available at:
http://cran.r-project.org/web/packages/lme4/vignettes/PLSvGLS.pdf
Bliese, P. (2009).
Multilevel modeling in R: A brief introduction to R, the multilevel
package and the nlme package. Available at:
http://cran.r-project.org/doc/contrib/Bliese_Multilevel.pdf
Draper, D. (1995).
Inference and hierarchical modeling in the social sciences. Journal
of Educational and Behavioral Statistics, 20(2), 115 - 147.
Available at:
Fox, J.
(2002). Linear mixed models: An appendix to “An R and S-PLUS companion
to applied regression”. Available at:
http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-mixed-models.pdf
Gelman, A.
(2005). Analysis of variance -- why it is more important than ever. The
Annals of Statistics, 33(1), 1 -- 53. Available at:
http://bayes.acs.unt.edu:8083:8083/BayesContent/class/Jon/MiscDocs/Gelman_2005.pdf
Hofmann, D.
A., Griffin, M. A., & Gavin, M. B. (2000). The application of
hierarchical linear modeling to organizational research. In K. J. Klein
(Ed.), Multilevel theory, research, and methods in
organizations: Foundations, extensions, and new directions (p.
467 - 511). San Francisco, CA: Jossey-Bass. Available at:
Littell, R.
C., Milliken, G. A., Stroup, W. W., & Wolfinger, R. D. (1996).
SAS system for mixed models. Cary, NC: SAS Institute Inc.
Raudenbush,
S. W. (1995). Reexamining, reaffirming, and improving application of
hierarchical models. Journal of Educational and Behavioral
Statistics, 20(2), 210 - 220. Available at:
Raudenbush,
S. W. (1993). Hierarchical linear models and experimental design. In L.
Edwards (Ed.), Applied analysis of variance in behavioral
science (p. 459 - 496). New York: Marcel Dekker. Available at:
Rogosa, D.,
& Saner, H. (1995). Longitudinal data analysis examples with
random coefficient models. Journal of Educational and
Behavioral Statistics, 20(2), 149 - 170. Available at:
Schwarz, G.
(1978). Estimating the dimension of a model.
Annals
of Statistics, 6, 461 – 464. Available at:
Return to the
SAS
Short Course
|