Ideal for graduate-level courses in mixed statistical modeling, the book is also an excellent reference for professionals in a range of fields, including cancer research, computer science, and engineering.
State-of-the-art methodologies are discussed, among them:
Linear mixed-effects model
Linear growth curve model
Generalized linear growth curve model
Robust mixed model
Models with linear covariance structure
Models for binary and count clustered data (logistic, probit, Poisson)
Generalized estimating equations approach
Nonlinear mixed model
A chapter on diagnostics provides a comprehensive introduction to linear and nonlinear statistical models. Special attention is given to I-influence analysis, with many examples. Algorithms and their implementation are discussed in detail. Several appendices make the text self-contained. Innovative applications include tumor regrowth and statistical analysis of shapes and images.
This book also discusses:
Modeling of complex clustered or longitudinal data
Modeling data with multiple sources of variation
Modeling biological variety and heterogeneity
Mixed model as a compromise between the frequentist and Bayesian approaches
Mixed model for the penalized log-likelihood
Healthy Akaike Information Criterion (HAIC)
How to cope with parameter multidimensionality
How to solve ill-posed problems including image reconstruction problems
Modeling of ensemble shapes and images
Statistics of image processing
Preface to the Second Edition
Time has proved that the mixed model is an indispensable tool in studying multilevel and clustered data. The mixed model has become part of the mainstream of modern statistics, on both the theoretical and practical fronts. Several books on the topic have been published since the first edition; see Section 1.16 for a comprehensive list. Most of these books target applications of mixed models and illustrate the examples with popular statistical software packages, such as SAS and R. This book has a distinct theoretical and research flavor. It is intended to explain what is "under the hood" of the mixed model methodology. In particular, it may be used for educational purposes by graduate and Ph.D. students in statistics.
Two major additions have been made in the second edition:
Each section ends with a set of problems that should be important for an active understanding of the material. There are two types of problems: unmarked problems are regular problems, and problems marked with an asterisk are more difficult and broader in scope. Usually, the latter involve an analytical derivation with further empirical confirmation through simulations. In many cases, I deliberately left the solution plan open so that students, together with their instructors, could use their own interpretation and address questions to different depths. Some problems could be used for graduate or even Ph.D. research.
Most parts of the theoretical material and methods of estimation are accompanied by respective R codes. While the first edition used S-Plus/S+, the second edition switches to the R language. The data sets and R codes can be downloaded using the button below.
It is suggested that they be saved on the hard drive in the directory C:\MixedModels\ with a subdirectory that corresponds to the chapter in the book. All the codes can be distributed and modified freely.
The theory of mixed models has several important unsolved problems. I hope that the list that follows will stimulate research in this direction.
Why Mixed Models?
Big ideas have many names and applications. Sometimes the mixed model is called the model for repeated measurements, sometimes a hierarchical model. Sometimes the mixed model is used to analyze clustered or panel data, sometimes longitudinal data.
Mixed model methodology brings statistics to the next level. In classical statistics a typical assumption is that observations are drawn from the same general population, are independent and identically distributed. Mixed model data have a more complex, multilevel, hierarchical structure. Observations between levels or clusters are independent, but observations within each cluster are dependent because they belong to the same subpopulation. Consequently, we speak of two sources of variation: between clusters and within (intra)-cluster variance.
The mixed model is well suited for the analysis of longitudinal data, where each time series constitutes an individual curve, a cluster. It is also well suited for biological and medical data, which display notorious heterogeneity of responses to stimuli and treatment. An advantage of the mixed model is the ability to genuinely combine the data by introducing multilevel random effects. The mixed model is a nonlinear statistical model, mainly due to the presence of variance parameters, and thus it requires special theoretical treatment. The goal of this book is to provide systematic coverage and development of the full spectrum of mixed models: linear, generalized linear, and nonlinear.
Novel mixed model applications include biologically based regrowth curves to model tumor growth after treatment, shapes, and images. The impetus for the mixed model is the observation that responses to treatment are subject-specific yet follow a general pattern. Shapes of different subjects (such as leaves from the same type of tree) may be quite different but similar at the same time, like the maple leaves we analyze in the book.
An important feature of the book is that it provides numerical algorithms as a realization of statistical methods that it develops. We strongly believe that an approach is not valuable without an appropriate efficient algorithm. Each chapter ends with a summary points section that may help the reader to quickly grasp the chapter's major points.
Summary Points to Chapter 1
Often, data have a clustered (panel or tabular) structure. Classical statistics assumes that observations are independent and identically distributed (iid). Applied to clustered data, this assumption may lead to false results. In contrast, the mixed effects model treats clustered data adequately and assumes two sources of variation, within cluster and between clusters. Two types of coefficients are distinguished in the mixed model: population-averaged and cluster- (or subject-) specific. The former have the same meaning as in classical statistics, but the latter are random and are estimated as posterior means.
The linear mixed effects (LME) model may be viewed as a generalization of the variance component (VARCOMP) and regression analysis models. When the number of clusters is small and the number of observations per cluster is large, we treat the cluster-specific coefficients as fixed, and ordinary regression analysis with dummy variables applies, as in the ANOVA model. Such a model is called a fixed effects model. Conversely, when the number of clusters is large but the number of observations per cluster is relatively small, a random effects model, in which the cluster-specific coefficients are random, would be more adequate.
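The two sources of variation described above can be illustrated with a minimal Python sketch. It assumes a balanced random-intercept model and uses the classical ANOVA (method-of-moments) estimator rather than the likelihood-based estimation the book develops; the function names and simulation settings are hypothetical and chosen only for illustration.

```python
import random

def simulate_clusters(n_clusters, n_per_cluster, mu, sd_between, sd_within, seed=0):
    """Simulate y_ij = mu + b_i + e_ij with a random intercept b_i per cluster."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        b = rng.gauss(0.0, sd_between)  # cluster-specific random effect
        data.append([mu + b + rng.gauss(0.0, sd_within)
                     for _ in range(n_per_cluster)])
    return data

def variance_components(data):
    """ANOVA estimates (within, between) for a balanced clustered design."""
    m = len(data)      # number of clusters
    n = len(data[0])   # observations per cluster (assumed equal)
    means = [sum(cluster) / n for cluster in data]
    grand = sum(means) / m
    # Mean square within clusters estimates the within-cluster variance.
    msw = sum((y - mean) ** 2
              for cluster, mean in zip(data, means)
              for y in cluster) / (m * (n - 1))
    # Mean square between clusters has expectation var_within + n * var_between.
    msb = n * sum((mean - grand) ** 2 for mean in means) / (m - 1)
    return msw, max(0.0, (msb - msw) / n)

data = simulate_clusters(n_clusters=500, n_per_cluster=5, mu=10.0,
                         sd_between=2.0, sd_within=1.0)
var_within, var_between = variance_components(data)
print("within-cluster variance :", round(var_within, 3))
print("between-cluster variance:", round(var_between, 3))
```

With the settings above, the true within-cluster and between-cluster variances are 1 and 4; the printed estimates should be close to these values but will vary with the seed.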
The mixed model technique is a child of the marriage of the frequentist and Bayesian approaches. Similar to the Bayesian approach, a mixed model specifies the model in a hierarchical fashion, assuming that parameters are random. However, unlike the Bayesian approach, hyperparameters are estimated from the data as in the frequentist approach. As in the Bayesian approach, one has to make a decision as to the prior distribution, but that distribution may contain unknown parameters that are estimated from the data, as in the frequentist approach.
Penalized likelihood is frequently used to cope with parameter multidimensionality. We show that the penalized likelihood may be derived from a mixed model as an approximation to the marginal likelihood after applying the Laplace approximation. Moreover, the penalty coefficient, often derived from a heuristic procedure, is estimated by maximum likelihood as an ordinary parameter.
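The link between a mixed model and a quadratic penalty can be sketched in the simplest Gaussian case, where no approximation is needed. The sketch below assumes a random-intercept model with known population mean and known variances (in the book these are estimated from the data); under these assumptions the posterior mean of a cluster effect minimizes a penalized sum of squares whose penalty coefficient is the ratio of the within-cluster to the between-cluster variance. The function names are hypothetical.

```python
def shrunken_effect(y, mu, var_within, var_between):
    """Posterior mean of the random effect b for one cluster.

    Assumes y_j = mu + b + e_j with b ~ N(0, var_between),
    e_j ~ N(0, var_within), and all three quantities known.
    """
    lam = var_within / var_between  # penalty coefficient = variance ratio
    return sum(v - mu for v in y) / (len(y) + lam)

def penalized_ss(b, y, mu, var_within, var_between):
    """Penalized least-squares criterion whose minimizer is shrunken_effect."""
    lam = var_within / var_between
    return sum((v - mu - b) ** 2 for v in y) + lam * b * b

y = [12.0, 13.0, 14.0]
b_hat = shrunken_effect(y, mu=10.0, var_within=1.0, var_between=1.0)
raw = sum(y) / len(y) - 10.0  # unpenalized cluster deviation
print("raw deviation:", raw, " shrunken estimate:", b_hat)
```

The shrunken estimate is pulled toward zero relative to the raw cluster deviation, and the pull is stronger when the cluster is small or the between-cluster variance is small.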
The Akaike information criterion (AIC) is used to compare statistical models and to choose the most informative. The AIC has the form of a penalized log-likelihood with the penalty equal to the dimension of the parameter vector. A drawback of the AIC is that it does not penalize ill-posed statistical problems, as in the case of multicollinearity among explanatory variables in linear regression. We develop a healthy AIC that copes with ill-posedness as well because the penalty term involves the average length of the parameter vector. Consequently, among models with the same log-likelihood value and number of parameters, HAIC will choose the model with the shortest parameter vector length.
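As a small illustration of the ordinary AIC only, the following Python sketch compares a constant-mean model with a straight-line model for Gaussian data. It uses the conventional scaling AIC = 2k - 2 log L, which ranks models the same way as the log-likelihood minus the number of parameters; the healthy AIC is not reproduced here because its exact form is not given in this summary. The data and function names are made up for illustration.

```python
import math

def aic(loglik, n_params):
    """Conventional AIC: twice the parameter count minus twice the log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik

def gaussian_max_loglik(residuals):
    """Maximized Gaussian log-likelihood with the variance set to its MLE, RSS/n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

def residuals_constant(y):
    """Residuals from fitting a constant mean."""
    mean = sum(y) / len(y)
    return [v - mean for v in y]

def residuals_line(x, y):
    """Residuals from an ordinary least-squares straight-line fit."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
             / sum((a - xbar) ** 2 for a in x))
    intercept = ybar - slope * xbar
    return [b - intercept - slope * a for a, b in zip(x, y)]

x = list(range(10))
y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9, 12.2, 13.8, 16.1, 17.9]  # made-up data

# Parameter counts include the error variance.
aic_constant = aic(gaussian_max_loglik(residuals_constant(y)), n_params=2)
aic_line = aic(gaussian_max_loglik(residuals_line(x, y)), n_params=3)
print("AIC constant:", round(aic_constant, 2), " AIC line:", round(aic_line, 2))
```

The model with the smaller AIC is preferred; for these strongly trending data the straight line wins despite its extra parameter.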
Since the mixed model naturally leads to penalized likelihood, it can be applied to penalized smoothing and polynomial fitting. Importantly, the difficult problem of penalty coefficient selection is solved by the mixed model technique by estimating this coefficient from the data. In penalized smoothing, we restrain the parameters through the bending energy, in polynomial fitting through the second derivative.
The mixed model copes with parameter multidimensionality. For example, if a statistical model contains a large number of parameters, one may assume that a priori the parameters have zero mean and unknown variance. Estimating this variance from the data and applying the Laplace approximation, we arrive at the penalized log-likelihood. We illustrate this approach with a dietary problem in conjunction with logistic regression where the number of food items consumed may be large.
Tikhonov regularization aims to replace an ill-posed problem with a well-posed problem by adding a quadratic penalty term. However, selection of the penalty coefficient is a problem. Although Tikhonov regularization receives a nice statistical interpretation in the Bayesian framework, the problem of the penalty coefficient remains. A nonlinear mixed model estimates the penalty coefficient from the data along with the parameter of interest.
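The effect of the quadratic penalty can be seen in a minimal Python sketch for a two-parameter least-squares problem with nearly collinear columns. The penalty coefficient is fixed by hand here, which is exactly the difficulty noted above; the sketch does not implement the mixed model estimation of that coefficient. The function name and data are hypothetical.

```python
def tikhonov_2d(A, b, lam):
    """Minimize ||A x - b||^2 + lam * ||x||^2 for a two-column matrix A.

    Solves the normal equations (A'A + lam * I) x = A'b by Cramer's rule.
    """
    s11 = sum(r[0] * r[0] for r in A) + lam
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A) + lam
    t1 = sum(r[0] * v for r, v in zip(A, b))
    t2 = sum(r[1] * v for r, v in zip(A, b))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Nearly collinear columns make the unpenalized problem ill-conditioned.
A = [[1.0, 1.0], [1.0, 1.0001], [1.0, 0.9999]]
b = [2.0, 2.1, 1.9]

x_plain = tikhonov_2d(A, b, lam=0.0)  # huge coefficients of opposite sign
x_reg = tikhonov_2d(A, b, lam=0.1)    # short, stable solution
print("no penalty  :", x_plain)
print("with penalty:", x_reg)
```

Without the penalty the two coefficients are of order a thousand and nearly cancel each other; with even a small penalty they become moderate and nearly equal.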
Computerized tomography (CT) reconstructs an image from projections and belongs to the family of linear image reconstruction. Since the number of image pixels is close to the number of observations, CT leads to an ill-posed problem. To obtain a well-posed problem, a priori assumptions on the reconstructed image should be taken into account. We show that a mixed model may accommodate various prior assumptions without complete specification of the prior distribution.
Positron emission tomography (PET) uses the Poisson regression model for image reconstruction and the EM algorithm for likelihood maximization. Little statistical hypothesis testing has been reported, perhaps due to the fact that the EM algorithm does not produce the covariance matrix of the image. Fisher scoring or Unit step algorithms are much faster and allow computation of the covariance matrix needed for various hypothesis tests, such as whether two images are the same in the area of interest. To cope with ill-posedness, Bayesian methods and methods of penalized likelihood have been widely applied. The generalized linear mixed model (GLMM), studied extensively in Chapter 7, also follows the line of the Bayesian approach, but enables estimation of the regularization parameter from PET data. A multilevel GLMM can combine repeated PET measurements and process them simultaneously, increasing statistical power substantially.
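To show why Fisher scoring yields a covariance matrix as a by-product, here is a minimal Python sketch for a generic log-linear Poisson regression with an intercept and one covariate. This is an assumption made for illustration: it is not the PET reconstruction model, whose details are not given in this summary, and the function names and toy data are hypothetical.

```python
import math

def fisher_information(x, b0, b1):
    """Fisher information of (b0, b1) for Poisson counts with mean exp(b0 + b1*x)."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    i00 = sum(mu)
    i01 = sum(xi * mi for xi, mi in zip(x, mu))
    i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
    return i00, i01, i11

def invert_2x2(i00, i01, i11):
    """Inverse of the symmetric 2x2 matrix [[i00, i01], [i01, i11]]."""
    det = i00 * i11 - i01 * i01
    return [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]

def poisson_fisher_scoring(x, y, n_iter=25, tol=1e-10):
    """Fisher scoring: beta <- beta + I(beta)^{-1} * score(beta)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the constant model
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        inv = invert_2x2(*fisher_information(x, b0, b1))
        d0 = inv[0][0] * s0 + inv[0][1] * s1
        d1 = inv[1][0] * s0 + inv[1][1] * s1
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    # The inverse information at the estimate approximates its covariance matrix.
    cov = invert_2x2(*fisher_information(x, b0, b1))
    return (b0, b1), cov

x = [0.0, 1.0, 2.0]
y = [1.0, 2.0, 4.0]  # toy counts that double with each unit of x
beta, cov = poisson_fisher_scoring(x, y)
print("estimates:", beta)
print("approximate standard errors:", math.sqrt(cov[0][0]), math.sqrt(cov[1][1]))
```

The same inverse information matrix that drives each update provides the standard errors needed for Wald-type tests, which is what the EM algorithm alone does not supply.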
The mixed model is well suited for the analysis of biological data when, on the one hand, observations are of the same biological category (maple leaf), but on the other hand, individuals differ. Consequently, there are two sources of variation: variation between individuals (intersubject variance) and variation within an individual (intrasubject variance). The common biological type corresponds to population-averaged parameters and individuality corresponds to subject-specific parameters. Shape is the simplest biological characteristic. Its analysis is complicated by the fact that shapes may be rotated and translated arbitrarily. Several mixed models for shape analysis are discussed in Chapter 11.
Image science enables us to derive large data of repeated structure; thus application of the repeated measurements model, such as a mixed model, seems natural. Until now, image comparison in medicine has been subjective and based on "eyeball" evaluation of a few images (often, just a couple). Statistical thinking in image analysis is generally poor. For example, a proper DNA Western blot image evaluation should be based on several tissue samples analyzed by a multilevel mixed model.
Mixed models can be applied for statistical image analysis, particularly to analyze an ensemble of images (see Chapter 12). As with shape analysis, two sources of variation are considered: the within-image and between-image variation. Since an image may be described as a large matrix, we may treat each element as a nonlinear function of its index and apply the nonlinear mixed effects model of Chapter 6. The mixed model can also be applied to study the motion of fuzzy objects such as clouds.