DSA SAS Short Course: Module 9.2 Linear Mixed Effects Modeling

Data Science and Analytics

Please participate in the DSA Client Feedback Survey.

MODULE 9.1

1. Mixed Effects Models

Mixed effects models refer to a variety of models which have as a key feature both fixed and random effects.

The distinction between fixed and random effects is a murky one. As pointed out by Gelman (2005), there are several, often conflicting, definitions of fixed effects as well as definitions of random effects. Gelman offers a fairly intuitive solution in the form of renaming fixed effects and random effects and providing his own clear definitions of each. “We define effects (or coefficients) in a multilevel model as constant if they are identical for all groups in a population and varying if they are allowed to differ from group to group” (Gelman, p. 21). Other ways of thinking about fixed and random effects, which may be useful but are not always consistent with one another or those given by Gelman above, are discussed in the next paragraph.

Fixed effects are ones in which the possible values of the variable are fixed. Random effects refer to variables in which the set of potential outcomes can change. Stated in terms of populations, fixed effects can be thought of as effects for which the population elements are fixed. Cases or individuals do not move into or out of the population. Random effects can be thought of as effects for which the population elements are changing or can change (i.e. random variable). Cases or individuals can and do move into and out of the population. Another way of thinking about the distinction between fixed and random effects is at the observation level. Fixed effects assume scores or observations are independent while random effects assume some type of relationship exists between some scores or observations. For instance, it can be said that gender is a fixed effect variable because we know all the values of that variable (male & female) and those values are independent of one another (mutually exclusive); and they (typically) do not change. A variable such as high school class has random effects because we can only sample some of the classes which exist; not to mention, students move into and out of those classes each year.

There are many types of random effects, such as repeated measures of the same individuals; where the scores at each time of measure constitute samples from the same participants among a virtually infinite (and possibly random) number of times of measure from those participants. Another example of a random effect can be seen in nested designs, where for example; achievement scores of students are nested within classes and those classes are nested within schools. That would be an example of a hierarchical design structure with a random effect for scores nested within classes and a second random effect for classes nested within schools. The nested data structure assumes a relationship among groups such that members of a class are thought to be similar to others in their class in such a way as to distinguish them from members of other classes and members of a school are thought to be similar to others in their school in such a way as to distinguish them from members of other schools. The example used below deals with a similar design which focuses on multiple fixed effects and a single nested random effect.

2. Linear Mixed Effects Models

Linear mixed effects models simply model the fixed and random effects as having a linear form. Similar to the General Linear Model, an outcome variable is contributed to by additive fixed and random effects (as well as an error term). Using the familiar notation, the linear mixed effect model takes the form:

y_ij = β₁x_1ij + β₂x_2ij … β_nx_nij + b_i1z_1ij + b_i2z_2ij … b_inz_nij + ε_ij

where y_ij is the value of the outcome variable for a particular ij case, β₁ through β_n are the fixed effect coefficients (like regression coefficients), x_1ij through x_nij are the fixed effect variables (predictors) for observation j in group i (usually the first is reserved for the intercept/constant; x_1ij = 1), b_i1 through b_in are the random effect coefficients which are assumed to be multivariate normally distributed, z_1ij through z_nij are the random effect variables (predictors), and ε_ij is the error for case j in group i where each group’s error is assumed to be multivariate normally distributed.

3. Example Data

The example used for this tutorial is fictional data where the interval scaled outcome variable Extroversion (extro) is predicted by fixed effects for the interval scaled predictor Openness to new experiences (open), the interval scaled predictor Agreeableness (agree), the interval scaled predictor Social engagement (social), and the nominal scaled predictor Class (classRC); as well as the random (nested) effect of Class (classRC) within School (schoolRC) as well as the random effect of School (schoolRC). The data contains 1200 cases evenly distributed among 24 nested groups (4 classes within 6 schools). The data set is available here.

4. Running the Analysis

First, import the data (linked above) with the member name LMM_dataRC

Then, take a look at the data running a few common summary PROCs; starting with a PROC MEANS for the whole data.

PROC MEANS DATA=LMM_dataRC;
RUN;

Next, take a look at the categorical variable frequencies.

PROC FREQ DATA=LMM_dataRC;
TABLES classRC schoolRC;
RUN;

Now we can proceed to fit the model. Pay particular attention to how the model is specified. The first line of syntax simply tells SAS what procedure we are doing (mixed) and specifies the data. The CLASS statement lists the classification variables (categorical variables or factors). The MODEL statement specifies the dependent variable and then the fixed effects variables. The RANDOM statement lists the random effects variables; classRC nested within schoolRC as well as schoolRC alone. Note, traditional interactions can be specified in the model as either fixed or random effects, using the *. For example, if we wanted to include a fixed effect interaction in our model for the interaction between open and agree, we would add the following term to the MODEL statement: open*agree. Also notice the /solution option is specified for both the fixed effects and random effects. The solution option for the fixed effects, provides the beta coefficient for each predictor (and associated standard errors, degrees of freedom, t-value, and p-values). The solution option for the random effects, provides the estimated mean values for each category of each random predictor (and associated standard errors, degrees of freedom, t-value, and p-values). The outpred option creates a new data file with the predicted values (based on the model); which can often be useful or meaningful.

PROC MIXED DATA = LMM_dataRC;
CLASS class school;
MODEL extro = open agree social class/SOLUTION OUTPRED=predicted;
RANDOM classRC(schoolRC) schoolRC/SOLUTION;
RUN;

The output should be similar to what is displayed below. The first page of which displays the Model Information, the levels of each classification variable (categorical variables or factors), the number of dimensions of the model, the number of observations, and the iteration history.

The second page of the output starts with the Random Effects estimates. These are variance estimates (with standard errors, Wald Z test statistics, significance values, and confidence intervals for the variance estimates). Recall the ubiquitous ANOVA summary table where we generally have a total variance estimate (sums of squares) at the bottom, then just above it we have a residual or within groups variance estimate (sums of squares) and then we have each treatment or between groups variance estimate (sums of squares). This table is very much like that, but the total is not displayed and the residual variance estimate is on top. So, we can quickly calculate the total variance estimate: 95.1431 + 2.8831 + .9684 = 98.9946 then we can create an R? type of effect size to gauge the importance of each random effect by dividing the effect's variance estimate by the total variance estimate to arrive at a proportion of variance explained or accounted for by each random effect. This is analogous to an Eta-squared (η?) in standard ANOVA or an R? in regression; it is sometimes referred to (in the linear mixed effects situation) as an Intraclass Correlation Coefficient (ICC, Bartko, 1976; Bliese, 2009). For example, we find that the nested effect of classRC within schoolRC is 2.8831 / 98.9946 = 0.0291238 or simply stated, that random nested effect only accounts for 2.9% of the variance of the random effects. However, the random effect for schoolRC alone accounts for 95.1431 / 98.9946 = 0.9610938 or 96% of the variance of the random effects. If none of the random effects account for a meaningful amount of variance of the random effects (i.e. if residual is larger than the other variance estimates), then the random effects should be eliminated from the model and a standard General Linear Model (or Generalized Linear Model) should be fitted (i.e., a model with only the fixed effects).

The next part of the second page contains the fit indices; generally I use and recommend the Bayesian Information Criterion (BIC). The next part of the output displays the Fixed Effects estimates.

It should be clear, this table and its interpretation are exactly like one would expect from a traditional ordinary least squares linear regression. One thing to note is the way SPSS chooses the reference category for categorical variables. You may have noticed we have been using the classRC and schoolRC variables instead of the original class and school variables in the data set. The RC variables contain the same information as the original variables, they simply have been ReCoded or Reverse Coded so that the output here will match the output produced using the lme4 package in the R programming language. It is important to know that SAS (and SPSS) automatically choose the category with the highest numerical value (or the lowest alphabetical letter) as the reference category for categorical variables. All packages I have used in the R programming language choose the reference category in the more intuitive but opposite way. In the lme4 package (and others I've used) in R, the software automatically picks the lowest numerical value (or the earliest alphabetically letter) as the reference category for categorical variables. This has drastic implications for the intercept estimate and more troubling, the predicted values produced by a model. For example, if this same model is specified with the original variables (not reverse coded) then the Fixed Effects intercept term is 63.049612; so you can imagine how much different the predicted values would be in that model compared to this model where the intercept is 57.3839. Recall from multiple regression, the intercept is interpreted as the mean of the outcome (extro) when all the predictors have a value of zero. The predictor estimates (coefficients or slopes) are interpreted the same way as the coefficients from a traditional regression. For instance, a one unit increase in the predictor Openness to new experiences (open) corresponds to a 0.006130 increase in the outcome Extroversion (extro). Likewise, a one unit increase in the predictor Agreeableness (agree) corresponds to a 0.00774 decrease in the outcome Extroversion (extro). Furthermore, the categorical predictor classRC = 3 has a coefficient of 2.0548; which means, the mean Extroversion score of the third group of classRC (3) is 2.0548 higher than the mean Extroversion score of the last group of classRC (4). ClassRC (4) was automatically coded as the reference category. The last part of the output (which continues on to the next page) contains the Random Effects Estimates.

The next page of output continues displaying the Random Effects estimates and also shows the tests of the Fixed Effects.

As with most of the tutorials / pages within this site, this page should not be considered an exhaustive review of the topic covered and it should not be considered a substitute for a good textbook.

References / Resources

Akaike, H. (1974). A new look at the statistical model identification. I.E.E.E. Transactions on Automatic Control, AC 19, 716 – 723. Available at:

Bartko, J. J. (1976). On various intraclass correlation reliability coefficients. Psychological Bulletin, 83, 762-765.

Bates, D., & Maechler, M. (2010). Package ‘lme4’. Reference manual for the package, available at:

http://cran.r-project.org/web/packages/lme4/lme4.pdf

Bates, D. (2010). Linear mixed model implementation in lme4. Package lme4 vignette, available at:

http://cran.r-project.org/web/packages/lme4/vignettes/Implementation.pdf

Bates, D. (2010). Computational methods for mixed models. Package lme4 vignette, available at:

http://cran.r-project.org/web/packages/lme4/vignettes/Theory.pdf

Bates, D. (2010). Penalized least squares versus generalized least squares representations of linear mixed models. Package lme4 vignette, available at:

http://cran.r-project.org/web/packages/lme4/vignettes/PLSvGLS.pdf

Bliese, P. (2009). Multilevel modeling in R: A brief introduction to R, the multilevel package and the nlme package. Available at:

http://cran.r-project.org/doc/contrib/Bliese_Multilevel.pdf

Draper, D. (1995). Inference and hierarchical modeling in the social sciences. Journal of Educational and Behavioral Statistics, 20(2), 115 - 147. Available at:

Fox, J. (2002). Linear mixed models: An appendix to “An R and S-PLUS companion to applied regression”. Available at:

http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-mixed-models.pdf

Gelman, A. (2005). Analysis of variance -- why it is more important than ever. The Annals of Statistics, 33(1), 1 -- 53. Available at:

http://bayes.acs.unt.edu:8083:8083/BayesContent/class/Jon/MiscDocs/Gelman_2005.pdf

Hofmann, D. A., Griffin, M. A., & Gavin, M. B. (2000). The application of hierarchical linear modeling to organizational research. In K. J. Klein (Ed.), Multilevel theory, research, and methods in organizations: Foundations, extensions, and new directions (p. 467 - 511). San Francisco, CA: Jossey-Bass. Available at:

Littell, R. C., Milliken, G. A., Stroup, W. W., & Wolfinger, R. D. (1996). SAS system for mixed models. Cary, NC: SAS Institute Inc.

Raudenbush, S. W. (1995). Reexamining, reaffirming, and improving application of hierarchical models. Journal of Educational and Behavioral Statistics, 20(2), 210 - 220. Available at:

Raudenbush, S. W. (1993). Hierarchical linear models and experimental design. In L. Edwards (Ed.), Applied analysis of variance in behavioral science (p. 459 - 496). New York: Marcel Dekker. Available at:

Rogosa, D., & Saner, H. (1995). Longitudinal data analysis examples with random coefficient models. Journal of Educational and Behavioral Statistics, 20(2), 149 - 170. Available at:

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461 – 464. Available at:

Return to the SAS Short Course

UNT home page

Contact Information
Jon Starkweather, PhD	Jonathan.Starkweather@unt.edu	940-565-4066
Richard Herrington, PhD	Richard.Herrington@unt.edu	940-565-2140

Please participate in the DSA Client Feedback Survey.

Last updated: 2018.11.15 by Jon Starkweather.