Linear mixed models (LMMs) are used for continuous dependent variables whose residuals are normally distributed but may violate the assumptions of independence or constant variance. LMMs can be used to analyze datasets collected under the following study designs:

1. studies with clustered data, such as students within classrooms;
2. longitudinal or repeated-measures studies, in which subjects are measured repeatedly over time or under different conditions.

Like the more common linear models presented earlier, LMMs are linear in the parameters; the difference is that LMMs may include both fixed and random effects. Adding random effects allows the model to handle datasets that contain several responses for one subject. Fixed effects are unknown constant parameters associated with either continuous covariates or the levels of categorical factors in an LMM. As in ordinary linear models, estimation of these parameters is generally of intrinsic interest.

When the levels of a factor can be thought of as having been sampled from a sample space, such that each particular level is not of intrinsic interest, the effects associated with the levels of that factor can be modeled as random effects in an LMM. In contrast to fixed effects, which are represented by constant parameters in an LMM, random effects are represented by (unobserved) random variables, which are usually assumed to follow a normal distribution (West, Welch, and Galecki, 2006).

1.4.1 General specification of the model

The general formula of an LMM, where $Y_{ti}$ represents the continuous response variable $Y$ taken on the $t$-th occasion for the $i$-th subject, can be written as:

\[
\begin{aligned}
Y_{ti} ={}& b_1 X^{(1)}_{ti} + b_2 X^{(2)}_{ti} + \cdots + b_p X^{(p)}_{ti} \\
&+ u_{1i} Z^{(1)}_{ti} + \cdots + u_{qi} Z^{(q)}_{ti} + e_{ti},
\end{aligned}
\]

where the upper line of the formula defines the fixed effects and the lower line the random effects of the model.

The index $t$ ($t = 1, \ldots, n_i$) runs over the $n_i$ longitudinal observations on the dependent variable for a given subject, and $i$ ($i = 1, \ldots, m$) indicates the $i$-th subject.

The model involves two sets of covariates, namely the $X$ and $Z$ covariates. The first set contains $p$ covariates, $X^{(1)}, \ldots, X^{(p)}$, associated with the fixed effects $b_1, \ldots, b_p$ (West, Welch, and Galecki, 2006). The second set contains $q$ covariates, $Z^{(1)}, \ldots, Z^{(q)}$, associated with the random effects $u_{1i}, \ldots, u_{qi}$ that are specific to subject $i$. The $X$ and/or $Z$ covariates may be continuous or indicator variables. For each $X$ covariate, $X^{(1)}, \ldots, X^{(p)}$, the terms $X^{(1)}_{ti}, \ldots, X^{(p)}_{ti}$ represent the $t$-th observed value of the corresponding covariate for the $i$-th subject (West, Welch, and Galecki, 2006). Each $b$ parameter represents a fixed effect, as defined in the linear model formula mentioned above.

The effects of the $Z$ covariates on the response variable are represented in the random portion of the model by the $q$ random effects, $u_{1i}, \ldots, u_{qi}$, associated with the $i$-th subject.
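As an illustration of this notation, consider a hypothetical growth-curve setting that is not taken from the source: $p = 2$ fixed-effect covariates, a constant $X^{(1)}_{ti} = 1$ and a time variable $X^{(2)}_{ti} = \mathrm{TIME}_{ti}$, and $q = 2$ random-effect covariates equal to the same two variables. Under these assumptions, and writing $e_{ti}$ for the residual of the observation, the general formula would reduce to:

\[
Y_{ti} = \underbrace{b_1 + b_2\,\mathrm{TIME}_{ti}}_{\text{fixed}}
       + \underbrace{u_{1i} + u_{2i}\,\mathrm{TIME}_{ti} + e_{ti}}_{\text{random}}
\]

In this special case $b_1$ and $b_2$ would be the intercept and slope shared by all subjects, while $u_{1i}$ and $u_{2i}$ are the deviations of subject $i$ from them.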

In addition, $e_{ti}$ represents the residual associated with the $t$-th observation on the $i$-th subject. It is assumed that, for a given subject, the residuals are independent of the random effects (West, Welch, and Galecki, 2006).

1.4.2 General matrix notation

The general matrix specification of an LMM for a given subject $i$ is constructed by stacking the formulas of the previous section for the individual observations indexed by $t$ into vectors and matrices (throughout this section the notation of West, Welch, and Galecki (2006) is followed):

\[
Y_i = X_i b + Z_i u_i + e_i,
\]

where $Y_i$ represents the vector of continuous responses for the $i$-th subject, and moreover

\[
u_i \sim N_q(0, D), \qquad e_i \sim N_{n_i}(0, R_i).
\]

The elements of the $Y_i$ vector are as follows, drawing on the notation used for an individual observation:

\[
Y_i = \begin{pmatrix} Y_{1i} \\ Y_{2i} \\ \vdots \\ Y_{n_i i} \end{pmatrix}
\]

Note that the number of elements, $n_i$, in the vector $Y_i$ may vary from one subject to another.
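The stacking step can be sketched in a few lines of Python. The snippet is only an illustration: the long-format arrays `subject` and `y` hold made-up values, and the helper `stack_by_subject` is a hypothetical name, not part of any library. It groups the individual observations into one response vector per subject and shows that the vector lengths can differ between subjects.

```python
import numpy as np

def stack_by_subject(subject, y):
    """Group long-format responses into one vector Y_i per subject.

    Observations are assumed to be listed in occasion order
    (t = 1, ..., n_i) within each subject; that order is preserved.
    """
    subject = np.asarray(subject)
    y = np.asarray(y, dtype=float)
    return {int(i): y[subject == i] for i in np.unique(subject)}

# Toy data (hypothetical values): subject 2 has only two occasions.
subject = [1, 1, 1, 2, 2, 3, 3, 3]
y = [5.1, 5.9, 6.8, 4.7, 5.2, 6.0, 6.4, 7.1]

Y = stack_by_subject(subject, y)
n = {i: len(Y_i) for i, Y_i in Y.items()}
print(n)  # {1: 3, 2: 2, 3: 3}
```

Storing the vectors in a dictionary rather than a rectangular array is a deliberate choice here, since unequal lengths would not fit into a single matrix without padding.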

$X_i$ is an $n_i \times p$ design matrix, which represents the known values of the $p$ covariates, $X^{(1)}, \ldots, X^{(p)}$, for each of the $n_i$ observations collected on the $i$-th subject:

\[
X_i = \begin{pmatrix}
X^{(1)}_{1i} & X^{(2)}_{1i} & \cdots & X^{(p)}_{1i} \\
X^{(1)}_{2i} & X^{(2)}_{2i} & \cdots & X^{(p)}_{2i} \\
\vdots & \vdots & \ddots & \vdots \\
X^{(1)}_{n_i i} & X^{(2)}_{n_i i} & \cdots & X^{(p)}_{n_i i}
\end{pmatrix}
\]

In a model including an intercept term, the first column would simply be equal to 1 for all observations. Note that all elements in a column of the $X_i$ matrix corresponding to a time-invariant (or subject-specific) covariate will be the same. For ease of presentation, it is assumed that the $X_i$ matrices are of full column rank; that is, none of the columns is a linear combination of the remaining ones. In general, $X_i$ matrices may not be of full rank, and this may lead to an aliasing (or parameter identifiability) problem for the fixed effects stored in the vector $b$.

$b$ is a vector of $p$ unknown fixed-effect parameters associated with the $p$ covariates used in constructing the $X_i$ matrix:

\[
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_p \end{pmatrix}
\]

The $n_i \times q$ matrix $Z_i$ is a design matrix that represents the known values of the $q$ covariates, $Z^{(1)}, \ldots, Z^{(q)}$, for the $i$-th subject. This matrix is very much like the $X_i$ matrix in that it represents the observed values of covariates; however, it usually has fewer columns than the $X_i$ matrix:

\[
Z_i = \begin{pmatrix}
Z^{(1)}_{1i} & Z^{(2)}_{1i} & \cdots & Z^{(q)}_{1i} \\
Z^{(1)}_{2i} & Z^{(2)}_{2i} & \cdots & Z^{(q)}_{2i} \\
\vdots & \vdots & \ddots & \vdots \\
Z^{(1)}_{n_i i} & Z^{(2)}_{n_i i} & \cdots & Z^{(q)}_{n_i i}
\end{pmatrix}
\]

The columns in the $Z_i$ matrix represent observed values for the $q$ predictor variables for the $i$-th subject, which have effects on the continuous response variable that vary randomly across subjects. In many cases, predictors with effects that vary randomly across subjects are represented in both the $X_i$ matrix and the $Z_i$ matrix. In an LMM in which only the intercepts are assumed to vary randomly from subject to subject, the $Z_i$ matrix would simply be a column of 1's.

The $u_i$ vector for the $i$-th subject represents a vector of $q$ random effects associated with the $q$ covariates in the $Z_i$ matrix:

\[
u_i = \begin{pmatrix} u_{1i} \\ u_{2i} \\ \vdots \\ u_{qi} \end{pmatrix}
\]

By definition, random effects are random variables. It is assumed that the $q$ random effects in the $u_i$ vector follow a multivariate normal distribution, with mean vector 0 and a variance-covariance matrix denoted by $D$:

\[
u_i \sim N(0, D)
\]

Elements along the main diagonal of the $D$ matrix represent the variances of each random effect in $u_i$, and the off-diagonal elements represent the covariances between two corresponding random effects. Because there are $q$ random effects in the model associated with the $i$-th subject, $D$ is a $q \times q$ matrix that is symmetric and positive definite. Elements of this matrix are shown as follows:

\[
D = \operatorname{Var}(u_i) = \begin{pmatrix}
\operatorname{Var}(u_{1i}) & \operatorname{cov}(u_{1i}, u_{2i}) & \cdots & \operatorname{cov}(u_{1i}, u_{qi}) \\
\operatorname{cov}(u_{1i}, u_{2i}) & \operatorname{Var}(u_{2i}) & \cdots & \operatorname{cov}(u_{2i}, u_{qi}) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}(u_{1i}, u_{qi}) & \operatorname{cov}(u_{2i}, u_{qi}) & \cdots & \operatorname{Var}(u_{qi})
\end{pmatrix}
\]

The elements (variances and covariances) of the $D$ matrix are defined as functions of a (usually) small set of covariance parameters stored in a vector denoted by $\theta_D$. Note that the vector $\theta_D$ imposes structure (or constraints) on the elements of the $D$ matrix.

Finally, the $e_i$ vector is a vector of $n_i$ residuals, with each element in $e_i$ denoting the residual associated with an observed response at occasion $t$ for the $i$-th subject. Because some subjects might have more observations collected than others (e.g., if data for one or more time points are not available when a subject drops out), the $e_i$ vectors may have a different number of elements.

The $n_i$ residuals in the $e_i$ vector for a given subject, $i$, are random variables that follow a multivariate normal distribution with a mean vector 0 and a positive definite symmetric covariance matrix $R_i$:

\[
e_i \sim N(0, R_i)
\]

It is assumed that the residuals associated with different subjects are independent of each other.
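To make the matrix formulation concrete, the following NumPy sketch draws one subject's response vector as the sum of the fixed part, the random-effects part and the residuals. Everything numerical in it is an assumption chosen for illustration: a random-intercept-and-slope model with two fixed and two random effects, the values in `b` and `D`, and the simplest residual structure, a constant variance `sigma2` on the diagonal of the residual covariance matrix, which the text does not prescribe. The function name `simulate_subject` is likewise hypothetical.

```python
import numpy as np

def simulate_subject(times, b, D, sigma2, rng):
    """Draw Y_i = X_i b + Z_i u_i + e_i for one subject.

    Assumes an intercept-plus-time design for both X_i and Z_i and
    independent, equal-variance residuals (R_i = sigma2 * I).
    """
    times = np.asarray(times, dtype=float)
    n_i = times.size
    X_i = np.column_stack([np.ones(n_i), times])  # n_i x p design matrix
    Z_i = X_i.copy()                              # n_i x q, same covariates
    R_i = sigma2 * np.eye(n_i)                    # n_i x n_i residual covariance
    u_i = rng.multivariate_normal(np.zeros(D.shape[0]), D)  # random effects
    e_i = rng.multivariate_normal(np.zeros(n_i), R_i)       # residuals
    Y_i = X_i @ b + Z_i @ u_i + e_i
    return Y_i, X_i, Z_i, u_i, e_i

rng = np.random.default_rng(0)
b = np.array([10.0, 2.0])        # fixed intercept and slope (made up)
D = np.array([[4.0, 0.5],
              [0.5, 1.0]])       # covariance of random intercept and slope

# D must be symmetric and positive definite.
assert np.allclose(D, D.T) and np.all(np.linalg.eigvalsh(D) > 0)

# Subjects may have different numbers of observations n_i.
for times in ([0, 1, 2, 3], [0, 1]):
    Y_i, X_i, Z_i, u_i, e_i = simulate_subject(times, b, D, sigma2=0.25, rng=rng)
    print(X_i.shape, Z_i.shape, Y_i.shape)
```

With only a random intercept, `Z_i` would reduce to a single column of ones and `D` to a 1 x 1 matrix holding the intercept variance.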

Furthermore, the vectors of residuals, $e_1, \ldots, e_m$, and random effects, $u_1, \ldots, u_m$, are independent of each other. The general form of the $R_i$ matrix is shown below: