5.3 At-Home Exercises

In these exercises, you will attempt to confirm the measurement model structure proposed in:

Kestilä, E. (2006). Is there demand for radical right populism in the Finnish electorate? Scandinavian Political Studies 29(3), 169–191. https://doi.org/10.1111/j.1467-9477.2006.00148.x

The data for this practical were collected during the first round of the European Social Survey (ESS). The ESS is a repeated cross-sectional survey administered in 32 European countries. The first wave was collected in 2002, and a new wave has been collected every two years since. You can find more info and access the data at https://www.europeansocialsurvey.org.

The data we will analyze in this practical are contained in the file named ess_round1.rds. This file contains a processed subset of the ESS data. The following table describes the variables in this dataset.

Variable	Type	Description
name	character	Title of dataset
essround	numeric	ESS round
edition	character	Edition
proddate	character	Production date
cntry	numeric	Country
idno	numeric	Respondent’s identification number
trstlgl	numeric	Trust in the legal system
trstplc	numeric	Trust in the police
trstun	numeric	Trust in the United Nations
trstep	numeric	Trust in the European Parliament
trstprl	numeric	Trust in country’s parliament
stfhlth	numeric	State of health services in country nowadays
stfedu	numeric	State of education in country nowadays
stfeco	numeric	How satisfied with present state of economy in country
stfgov	numeric	How satisfied with the national government
stfdem	numeric	How satisfied with the way democracy works in country
pltinvt	numeric	Politicians interested in votes rather than people’s opinions
pltcare	numeric	Politicians in general care what people like respondent think
trstplt	numeric	Trust in politicians
imsmetn	numeric	Allow many/few immigrants of same race/ethnic group as majority
imdfetn	numeric	Allow many/few immigrants of different race/ethnic group from majority
eimrcnt	numeric	Allow many/few immigrants from richer countries in Europe
eimpcnt	numeric	Allow many/few immigrants from poorer countries in Europe
imrcntr	numeric	Allow many/few immigrants from richer countries outside Europe
impcntr	numeric	Allow many/few immigrants from poorer countries outside Europe
qfimchr	numeric	Qualification for immigration: christian background
qfimwht	numeric	Qualification for immigration: be white
imwgdwn	numeric	Average wages/salaries generally brought down by immigrants
imhecop	numeric	Immigrants harm economic prospects of the poor more than the rich
imtcjob	numeric	Immigrants take jobs away in country or create new jobs
imbleco	numeric	Taxes and services: immigrants take out more than they put in or less
imbgeco	numeric	Immigration bad or good for country’s economy
imueclt	numeric	Country’s cultural life undermined or enriched by immigrants
imwbcnt	numeric	Immigrants make country worse or better place to live
imwbcrm	numeric	Immigrants make country’s crime problems worse or better
imrsprc	numeric	Richer countries should be responsible for accepting people from poorer countries
pplstrd	numeric	Better for a country if almost everyone share customs and traditions
vrtrlg	numeric	Better for a country if a variety of different religions
shrrfg	numeric	Country has more than its fair share of people applying refugee status
rfgawrk	numeric	People applying refugee status allowed to work while cases considered
gvrfgap	numeric	Government should be generous judging applications for refugee status
rfgfrpc	numeric	Most refugee applicants not in real fear of persecution own countries
rfggvfn	numeric	Financial support to refugee applicants while cases considered
rfgbfml	numeric	Granted refugees should be entitled to bring close family members
gndr	numeric	Gender
yrbrn	numeric	Year of birth
edulvl	ordered, factor	Highest level of education
eduyrs	numeric	Years of full-time education completed
polintr	ordered, factor	How interested in politics
lrscale	numeric	Placement on left right scale
country	factor	A factor version of the ‘cntry’ variable
sex	factor	A factor version of the ‘gndr’ variable

5.3.1

Load the ess_round1.rds dataset.

Inspect the data after loading to make sure everything went well.

Click to show code

library(tidySEM)

dataDir <- "data"

## Read the 'ess_round1.rds' data into a data frame called 'ess': 
ess <- readRDS(here::here(dataDir, "ess_round1.rds"))

## Inspect the result:
dim(ess)

## [1] 19690    52

head(ess)

descriptives(ess)

Kestilä (2006) used principal components analysis to estimate the measurement model underlying thirteen ESS variables related to Trust in Politics. Based on the results of this PCA, Kestilä recommended the following three-factor structure.

Trust in Institutions

trstlgl: Trust in the legal system
trstplc: Trust in the police
trstun: Trust in the United Nations
trstep: Trust in the European Parliament
trstprl: Trust in country’s parliament

Satisfaction with the Political Situation

stfhlth: State of health services in country nowadays
stfedu: State of education in country nowadays
stfeco: How satisfied with present state of economy in country
stfgov: How satisfied with the national government
stfdem: How satisfied with the way democracy works in country

Trust in Politicians

pltinvt: Politicians interested in votes rather than people’s opinions
pltcare: Politicians in general care what people like respondent think
trstplt: Trust in politicians

5.3.2

Define the lavaan model syntax for the three-factor CFA described above.

Covary the three latent factors.
Do not specify any mean structure.
Save this model syntax as an object in your environment.

Click to show code

mod_3f <- '
institutions =~ trstlgl + trstplc + trstun + trstep + trstprl
satisfaction =~ stfhlth + stfedu  + stfeco + stfgov + stfdem
politicians  =~ pltinvt + pltcare + trstplt
'
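The syntax above contains no explicit covariance statements because cfa() lets exogenous latent factors covary by default (its auto.cov.lv.x option is TRUE). As a minimal sketch, the same model can also be written with the factor covariances spelled out via the ~~ operator. The object name mod_3f_explicit is hypothetical and is not used elsewhere in these exercises.

```r
## Three-factor CFA with the latent covariances written out explicitly.
## 'mod_3f_explicit' is an illustrative name, not part of the original exercise.
mod_3f_explicit <- '
## Measurement model
institutions =~ trstlgl + trstplc + trstun + trstep + trstprl
satisfaction =~ stfhlth + stfedu  + stfeco + stfgov + stfdem
politicians  =~ pltinvt + pltcare + trstplt

## Latent covariances (added automatically by cfa() if omitted)
institutions ~~ satisfaction
institutions ~~ politicians
satisfaction ~~ politicians
'

## Print the syntax to check it:
cat(mod_3f_explicit)
```

Passing this object to cfa() in place of mod_3f should give the same fitted model; writing the covariances out simply makes the specification visible.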

5.3.3

Estimate the CFA model you defined above, summarize the results, and evaluate the model fit.

Use the lavaan::cfa() function to estimate the model.
Use the fixed-factor method to identify the model.
Otherwise, use the default settings for the cfa() function.
Request the model fit indices by supplying the fit.measures = TRUE argument to summary().
Request the standardized parameter estimates by supplying the standardized = TRUE argument to summary().

Does the model fit the data well?

Click to show code

## Load the lavaan package:
library(lavaan)

## Estimate the CFA model:
out_3f <- cfa(mod_3f, data = ess, std.lv = TRUE)

## Summarize the fitted model:
summary(out_3f, fit.measures = TRUE, standardized = TRUE)

## lavaan 0.6-19 ended normally after 25 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        29
## 
##                                                   Used       Total
##   Number of observations                         14778       19690
## 
## Model Test User Model:
##                                                        
##   Test statistic                              10652.207
##   Degrees of freedom                                 62
##   P-value (Chi-square)                            0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                             81699.096
##   Degrees of freedom                                78
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.870
##   Tucker-Lewis Index (TLI)                       0.837
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)            -371404.658
##   Loglikelihood unrestricted model (H1)    -366078.555
##                                                       
##   Akaike (AIC)                              742867.317
##   Bayesian (BIC)                            743087.743
##   Sample-size adjusted Bayesian (SABIC)     742995.583
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.108
##   90 Percent confidence interval - lower         0.106
##   90 Percent confidence interval - upper         0.109
##   P-value H_0: RMSEA <= 0.050                    0.000
##   P-value H_0: RMSEA >= 0.080                    1.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.059
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   institutions =~                                                       
##     trstlgl           1.613    0.018   88.396    0.000    1.613    0.677
##     trstplc           1.241    0.018   70.787    0.000    1.241    0.567
##     trstun            1.498    0.018   82.522    0.000    1.498    0.642
##     trstep            1.464    0.017   85.463    0.000    1.464    0.660
##     trstprl           1.837    0.016  112.884    0.000    1.837    0.809
##   satisfaction =~                                                       
##     stfhlth           1.173    0.019   62.813    0.000    1.173    0.521
##     stfedu            1.297    0.018   70.954    0.000    1.297    0.577
##     stfeco            1.659    0.018   92.675    0.000    1.659    0.713
##     stfgov            1.736    0.017  100.085    0.000    1.736    0.756
##     stfdem            1.623    0.017   95.805    0.000    1.623    0.731
##   politicians =~                                                        
##     pltinvt           0.646    0.008   77.685    0.000    0.646    0.613
##     pltcare           0.660    0.008   80.100    0.000    0.660    0.628
##     trstplt           1.946    0.015  125.871    0.000    1.946    0.891
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   institutions ~~                                                       
##     satisfaction      0.736    0.006  126.762    0.000    0.736    0.736
##     politicians       0.872    0.004  198.326    0.000    0.872    0.872
##   satisfaction ~~                                                       
##     politicians       0.711    0.006  115.663    0.000    0.711    0.711
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .trstlgl           3.068    0.041   75.262    0.000    3.068    0.541
##    .trstplc           3.248    0.041   80.037    0.000    3.248    0.678
##    .trstun            3.197    0.041   77.141    0.000    3.197    0.588
##    .trstep            2.776    0.036   76.243    0.000    2.776    0.564
##    .trstprl           1.776    0.029   61.360    0.000    1.776    0.345
##    .stfhlth           3.695    0.046   79.989    0.000    3.695    0.729
##    .stfedu            3.368    0.043   77.916    0.000    3.368    0.667
##    .stfeco            2.656    0.038   69.070    0.000    2.656    0.491
##    .stfgov            2.264    0.035   64.201    0.000    2.264    0.429
##    .stfdem            2.289    0.034   67.172    0.000    2.289    0.465
##    .pltinvt           0.694    0.009   78.255    0.000    0.694    0.624
##    .pltcare           0.668    0.009   77.562    0.000    0.668    0.605
##    .trstplt           0.978    0.028   34.461    0.000    0.978    0.205
##     institutions      1.000                               1.000    1.000
##     satisfaction      1.000                               1.000    1.000
##     politicians       1.000                               1.000    1.000

Click for explanation

No, the model does not seem to fit the data well \((\chi^2 = 10652.21,\) \(\textit{df} = 62,\) \(p < 0.001,\) \(\textrm{RMSEA} = 0.108,\) \(\textrm{CFI} = 0.87,\) \(\textrm{SRMR} = 0.059)\).

The SRMR looks good, but one good-looking fit statistic is not enough.
The RMSEA and CFI both indicate poor fit.
The \(\chi^2\) is highly significant, but we don’t care: with a sample size of \(N = 14778\), even trivial misfit will produce a significant \(\chi^2\).
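To see where the CFI, TLI, and RMSEA come from, the following sketch recomputes them in base R from the two \(\chi^2\) statistics printed in the output above. The formulas are the standard textbook definitions. Using \(N\) rather than \(N - 1\) in the RMSEA denominator is an assumption here; with a sample this large the choice does not affect the rounded result.

```r
## Values copied from the summary() output above:
chisq_m <- 10652.207  # user-model chi-square
df_m    <- 62         # user-model degrees of freedom
chisq_b <- 81699.096  # baseline-model chi-square
df_b    <- 78         # baseline-model degrees of freedom
n       <- 14778      # number of observations used

## CFI: one minus the ratio of model to baseline noncentrality
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)

## TLI: relative improvement in chi-square per degree of freedom
tli <- ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1)

## RMSEA: model noncentrality per degree of freedom and per observation
## (assumption: N, not N - 1, in the denominator)
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n))

round(c(cfi = cfi, tli = tli, rmsea = rmsea), 3)
```

The results should match the values lavaan reports to three decimals (0.870, 0.837, and 0.108). The same numbers can be pulled from the fitted object with fitMeasures(out_3f, c("cfi", "tli", "rmsea")).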

5.3.4

Consider the factor loadings for the Trust in Politicians factor.

Do the loadings seem sensible?
Do you notice any discrepancy between the raw and standardized estimates of these loadings?
How would you explain this discrepancy (assuming you see one)?

Click for explanation

data.frame(
  raw = lavInspect(out_3f, "estimates")$lambda[11:13, 3],
  std = lavInspect(out_3f, "standardized")$lambda[11:13, 3]
  )

Hmm…something seems strange, but we need to be careful when evaluating these factor loadings.

The raw loadings seem to suggest some pretty severe heterogeneity among the loadings of the Trust in Politicians factor.
The standardized estimates don’t look too bad.

The only difference between these two versions of the loadings is the scale of the indicator variables.

The indicators for the standardized loadings all have the same unit variance.
The indicators for the raw loadings have their original scale.

Let’s check the scale of the raw indicator variables.

## Load dplyr to get the select() function:
library(dplyr)

ess |>
  select(all_of(lavNames(out_3f))) |>
  tidySEM::descriptives() |>
  select(name, mean, sd, min, max)

Well, there’s your problem! All the variables have an eleven-point scale except pltinvt and pltcare, which only have five-point scales. So, the raw factor loadings for pltinvt and pltcare should be smaller than the raw factor loading for trstplt. One unit of change in pltinvt or pltcare covers a larger share of the response scale, and so implies a larger effect, than one unit of change in trstplt would.
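The link between the raw and standardized loadings can be checked by hand. Because the factor variances are fixed to 1 and each indicator loads on a single factor, the model-implied variance of an indicator is its squared raw loading plus its residual variance, and the standardized loading is the raw loading divided by the implied standard deviation. The sketch below applies this to the estimates printed in the three-factor output; the object names are illustrative.

```r
## Raw loadings and residual variances for the Trust in Politicians
## indicators, copied from the three-factor summary() output:
raw   <- c(pltinvt = 0.646, pltcare = 0.660, trstplt = 1.946)
resid <- c(pltinvt = 0.694, pltcare = 0.668, trstplt = 0.978)

## Model-implied indicator SDs (the factor variance is fixed to 1):
implied_sd <- sqrt(raw^2 + resid)

## Standardized loading = raw loading / implied SD of the indicator:
std <- raw / implied_sd

round(cbind(raw, implied_sd, std), 3)
```

The implied standard deviation of trstplt is roughly twice that of pltinvt and pltcare, which accounts for most of the gap between the raw loadings. After rescaling, the loadings agree with the Std.all column to within rounding of the inputs.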

Now, we will consider an alternative factor structure for the Trust in Politics CFA. We’ll go extremely simple by estimating a one-factor model wherein all Trust items are explained by a single latent variable.

5.3.5

Define the lavaan model syntax for a one-factor model of the Trust items.

Save this syntax as an object in your environment.

Click to show code

mod_1f <- '
political_trust =~ 
  trstlgl +
  trstplc +
  trstun +
  trstep +
  trstprl +
  stfhlth +
  stfedu  +
  stfeco +
  stfgov +
  stfdem +
  pltinvt +
  pltcare +
  trstplt
'
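With thirteen indicators on one factor, typing the syntax by hand is error-prone. As an optional sketch, the same one-factor model string can be assembled in base R with paste(). The names items and mod_1f_alt are hypothetical and are not used elsewhere in these exercises.

```r
## Names of the thirteen Trust in Politics indicators:
items <- c(
  "trstlgl", "trstplc", "trstun", "trstep", "trstprl",
  "stfhlth", "stfedu", "stfeco", "stfgov", "stfdem",
  "pltinvt", "pltcare", "trstplt"
)

## Join the items with '+' and prepend the latent variable definition:
mod_1f_alt <- paste("political_trust =~", paste(items, collapse = " + "))

## Print the resulting lavaan syntax:
cat(mod_1f_alt, "\n")
```

The resulting string defines the same model as mod_1f and can be passed to cfa() in the same way.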

5.3.6

Estimate the one-factor model, and summarize the results.

Identify the scale using the fixed-factor method.
Don’t estimate any mean structure.

Does this model appear to fit better or worse than the three-factor model?

Click to show code

## Estimate the one factor model:
out_1f <- cfa(mod_1f, data = ess, std.lv = TRUE)

## Summarize the results:
summary(out_1f)

## lavaan 0.6-19 ended normally after 21 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        26
## 
##                                                   Used       Total
##   Number of observations                         14778       19690
## 
## Model Test User Model:
##                                                        
##   Test statistic                              17667.304
##   Degrees of freedom                                 65
##   P-value (Chi-square)                            0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                      Estimate  Std.Err  z-value  P(>|z|)
##   political_trust =~                                    
##     trstlgl             1.516    0.018   83.138    0.000
##     trstplc             1.174    0.017   67.425    0.000
##     trstun              1.411    0.018   77.904    0.000
##     trstep              1.378    0.017   80.594    0.000
##     trstprl             1.792    0.016  111.470    0.000
##     stfhlth             0.933    0.019   50.258    0.000
##     stfedu              1.054    0.018   57.700    0.000
##     stfeco              1.358    0.018   74.603    0.000
##     stfgov              1.494    0.017   85.400    0.000
##     stfdem              1.514    0.017   90.900    0.000
##     pltinvt             0.580    0.008   69.442    0.000
##     pltcare             0.600    0.008   72.700    0.000
##     trstplt             1.793    0.015  118.311    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstlgl           3.370    0.042   79.787    0.000
##    .trstplc           3.410    0.041   82.311    0.000
##    .trstun            3.451    0.043   80.749    0.000
##    .trstep            3.019    0.038   80.272    0.000
##    .trstprl           1.938    0.027   70.878    0.000
##    .stfhlth           4.201    0.050   84.093    0.000
##    .stfedu            3.941    0.047   83.419    0.000
##    .stfeco            3.565    0.044   81.289    0.000
##    .stfgov            3.044    0.038   79.326    0.000
##    .stfdem            2.631    0.034   78.072    0.000
##    .pltinvt           0.775    0.009   82.043    0.000
##    .pltcare           0.743    0.009   81.579    0.000
##    .trstplt           1.548    0.023   67.052    0.000
##     political_trst    1.000

## Compare fit statistics:
data.frame(three_factor = fitMeasures(out_3f), one_factor = fitMeasures(out_1f)) |>
  format(scientific = FALSE, digits = 3, drop0trailing = TRUE)

Click for explanation

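The one-factor model definitely seems to fit worse than the three-factor model: its \(\chi^2\) is far larger (17667.30 on 65 \(\textit{df}\) versus 10652.21 on 62 \(\textit{df}\)) in exchange for only three additional degrees of freedom.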
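Because the one-factor model can be obtained from the three-factor model by forcing the three factor correlations to equal 1, the two models are nested and can be compared with a \(\chi^2\) difference test. The sketch below does the arithmetic in base R using the test statistics printed in the two outputs above. Treat the p-value as approximate, since a correlation of 1 lies on the boundary of the parameter space.

```r
## Test statistics copied from the two summary() outputs:
chisq_3f <- 10652.207
df_3f    <- 62
chisq_1f <- 17667.304
df_1f    <- 65

## Chi-square difference and its degrees of freedom:
delta_chisq <- chisq_1f - chisq_3f
delta_df    <- df_1f - df_3f

## Upper-tail p-value of the difference statistic:
p_value <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)

c(delta_chisq = delta_chisq, delta_df = delta_df, p_value = p_value)
```

The three degrees of freedom correspond to the three factor covariances that the one-factor model gives up. The same comparison can be run on the fitted objects with anova(out_1f, out_3f).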

End of At-Home Exercises