Linear Mixed-Effects Models

suppressPackageStartupMessages(library(AER))
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(broom))
suppressPackageStartupMessages(library(nlme))
suppressPackageStartupMessages(library(lme4))
suppressPackageStartupMessages(library(lmerTest))

Introduction

Basic regression models are fitted to a sample of \(n\) independent observations. Given a set of \(p\) regressors \(X_{i,j}\), \(j = 1, \ldots, p\), and a continuous response \(Y_i\), we fit the model

\[Y_i = \beta_0 + \beta_1 X_{i,1} + \beta_2 X_{i,2} + \ldots + \beta_p X_{i,p} + \varepsilon_{i} \; \; \; \; \text{for} \; i = 1, \ldots, n\]

The coefficients \(\beta_0, \dots, \beta_p\) are fixed and constant for all the observed values \(\left(x_{i,1}, \dots, x_{i,p}, y_i\right)\).

These coefficients are called fixed effects. We are interested in assessing whether each regressor has a statistically significant association with the response.
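
As a minimal sketch of this fixed-effects setup, the code below simulates data from the model above with \(p = 2\) regressors and fits it by ordinary least squares with `lm()`. The sample size, the seed, the "true" coefficients in `beta`, and the names `sim_data` and `ols_fit` are all hypothetical choices made for illustration; none of them come from the Grunfeld data.

```r
# Simulated example (all values are illustrative assumptions)
set.seed(123)                 # arbitrary seed, only for reproducibility
n <- 200                      # number of independent observations

x1 <- rnorm(n)                # first regressor
x2 <- rnorm(n)                # second regressor

beta <- c(2, 0.5, -1)         # hypothetical beta_0, beta_1, beta_2
epsilon <- rnorm(n, sd = 1)   # independent error terms

# Response generated from the same coefficients for every observation
y <- beta[1] + beta[2] * x1 + beta[3] * x2 + epsilon

sim_data <- data.frame(y = y, x1 = x1, x2 = x2)

# Ordinary least-squares fit of the basic regression model
ols_fit <- lm(y ~ x1 + x2, data = sim_data)

# Estimates, standard errors, t statistics, and p-values per coefficient
summary(ols_fit)$coefficients
```

Each row of the coefficient table corresponds to one fixed effect, and the p-value column is what one would typically inspect to judge statistical significance.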

2.1. Grunfeld’s Investment Dataset

Consider the following example. To study how a firm’s gross investment depends on its market value and capital stock, Grunfeld (1958) collected data on eleven companies over the years 1935-1954.

The data frame Grunfeld contains 220 observations from a balanced panel of 11 firms from 1935 to 1954 (20 observations per firm). The dataset includes a continuous response, investment, and two continuous explanatory variables, market_value and capital.

First, we load the data, which has the following variables:

  • investment: the gross investment in millions of dollars (additions to plant and equipment along with maintenance), a continuous response.
  • market_value: the firm’s market value in millions of dollars, a continuous explanatory variable.
  • capital: stock of plant and equipment in millions of dollars, a continuous explanatory variable.
  • firm: a nominal explanatory variable with eleven levels indicating the firm (General Motors, US Steel, General Electric, Chrysler, Atlantic Refining, IBM, Union Oil, Westinghouse, Goodyear, Diamond Match, and American Steel).
  • year: the year of the observation (it will not be used in our analysis).
data(Grunfeld)
Grunfeld <- Grunfeld %>% rename(investment = invest, market_value = value)
head(Grunfeld)
##   investment market_value capital           firm year
## 1      317.6       3078.5     2.8 General Motors 1935
## 2      391.8       4661.7    52.6 General Motors 1936
## 3      410.6       5387.1   156.9 General Motors 1937
## 4      257.7       2792.2   209.2 General Motors 1938
## 5      330.8       4313.2   203.4 General Motors 1939
## 6      461.2       4643.9   207.2 General Motors 1940
tail(Grunfeld)
##     investment market_value capital           firm year
## 215      6.433       39.961  73.827 American Steel 1949
## 216      4.770       36.494  75.847 American Steel 1950
## 217      6.532       46.082  77.367 American Steel 1951
## 218      7.329       57.616  78.631 American Steel 1952
## 219      9.020       57.441  80.215 American Steel 1953
## 220      6.281       47.165  83.788 American Steel 1954
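
The balanced-panel structure described earlier (11 firms, 20 yearly observations each) can be checked directly. The sketch below is one way to do so, assuming the `AER` and `dplyr` packages are installed; the object names `grunfeld_renamed` and `obs_per_firm` and the column name `n_years` are hypothetical and chosen only for this illustration.

```r
library(AER)    # provides the Grunfeld data set
library(dplyr)  # rename(), count(), summarise()

# Load the AER version explicitly (other packages ship a different Grunfeld)
data("Grunfeld", package = "AER")

# Same renaming as in the text, stored under a separate (hypothetical) name
grunfeld_renamed <- Grunfeld %>%
  rename(investment = invest, market_value = value)

# Number of observations per firm; a balanced panel has equal counts
obs_per_firm <- grunfeld_renamed %>%
  count(firm, name = "n_years")
obs_per_firm

# First and last year covered by the panel
grunfeld_renamed %>%
  summarise(first_year = min(year), last_year = max(year))
```

If every firm shows the same value of `n_years`, the panel is balanced, which is what the description of the data states.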
Hossameldin Mohammed