Study 2.3: Lexical decision

The selection model for Study 2.3 served a twofold purpose. First, the variable with the largest effect among the five candidates was selected as the language-based predictor of interest (the rationale is given under Study 2.3 in the main text). Second, one of the remaining four variables was selected as a covariate.

Figure 26 shows the zero-order correlations among the lexical covariates considered in the selection.

Code

# Packages providing the pipe, rename() and theme()
library(dplyr)
library(ggplot2)

# Using the following variables...
lexicaldecision[, c('z_word_frequency', 'z_word_length', 'z_number_syllables',
                    'z_phonological_Levenshtein_distance', 
                    'z_orthographic_Levenshtein_distance')] %>%
  
  # rename the variables for the sake of clarity
  rename('Word frequency' = z_word_frequency,
         'Number of letters' = z_word_length,
         'Number of syllables' = z_number_syllables,
         'Phonological Levenshtein distance' = z_phonological_Levenshtein_distance,
         'Orthographic Levenshtein distance' = z_orthographic_Levenshtein_distance) %>%
  
  # make correlation matrix (custom function from 'R_functions' folder)
  correlation_matrix() + 
  theme(plot.margin = unit(c(0, 0, 0.05, -4.26), 'in'))

Figure 26: Zero-order correlations for the lexical covariates pretested in the lexical decision study.
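
Because `correlation_matrix()` is a custom function, the quantities behind a figure of this kind can also be approximated with base R. The following is a minimal sketch on simulated data. The data frame `simulated_lexical` and all of its values are hypothetical and do not reproduce the results of the study.

```r
# Minimal sketch with simulated data (hypothetical values, not the study's data)
set.seed(123)
n_words = 200

# Simulate three z-scored-like variables, two of which are related
z_word_length = rnorm(n_words)
z_number_syllables = 0.8 * z_word_length + rnorm(n_words, sd = 0.6)
z_word_frequency = rnorm(n_words)

simulated_lexical = data.frame(z_word_frequency, z_word_length, z_number_syllables)

# Zero-order (i.e., pairwise Pearson) correlations, rounded to two decimals
zero_order_correlations = round(cor(simulated_lexical, method = 'pearson'), 2)
print(zero_order_correlations)
```

The term zero-order indicates that each correlation is computed between two variables without controlling for any of the others.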

Table 8 shows the results of the selection model.

Code

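# Packages providing the pipe, the string functions and the table formatting
library(dplyr)
library(stringr)
library(kableExtra)

# Read in model and confidence intervals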
KR_summary_lexical_covariates_selection_lexicaldecision_lmerTest = 
  readRDS('lexicaldecision/frequentist_analysis/lexical_covariates_selection/results/KR_summary_lexical_covariates_selection_lexicaldecision_lmerTest.rds')

confint_lexical_covariates_selection_lexicaldecision_lmerTest = 
  readRDS('lexicaldecision/frequentist_analysis/lexical_covariates_selection/results/confint_lexical_covariates_selection_lexicaldecision_lmerTest.rds')

# Rename effects in plain language, using a named vector in which 
# the names are the patterns and the values are the replacements
plain_names = c('z_word_frequency' = 'Word frequency',
                'z_word_length' = 'Number of letters',
                'z_number_syllables' = 'Number of syllables',
                'z_orthographic_Levenshtein_distance' = 'Orthographic Levenshtein distance',
                'z_phonological_Levenshtein_distance' = 'Phonological Levenshtein distance')

# first, in the summary object
rownames(KR_summary_lexical_covariates_selection_lexicaldecision_lmerTest$coefficients) =
  rownames(KR_summary_lexical_covariates_selection_lexicaldecision_lmerTest$coefficients) %>%
  str_replace_all(plain_names)

# next, in the confidence intervals object
rownames(confint_lexical_covariates_selection_lexicaldecision_lmerTest) =
  rownames(confint_lexical_covariates_selection_lexicaldecision_lmerTest) %>%
  str_replace_all(plain_names)


# Create table (using custom function from the 'R_functions' folder)
frequentist_model_table(
  KR_summary_lexical_covariates_selection_lexicaldecision_lmerTest, 
  confidence_intervals = confint_lexical_covariates_selection_lexicaldecision_lmerTest,
  caption = 'Mixed-effects model for the selection of lexical covariates in the lexical decision study.') %>% 
  
  # Format
  kable_classic(full_width = FALSE, html_font = 'Cambria') %>%
  
  # Footnote describing abbreviations, random slopes, etc. 
  # Markdown, HTML and LaTeX markup are used to format the text.
  footnote(escape = FALSE, threeparttable = TRUE, 
           # The <p> below is used to enter a margin above the footnote 
           general_title = '<p style="margin-top: 10px;"></p>', 
           general = paste('*Note*. &beta; = Estimate based on $z$-scored predictors; *SE* = standard error;',
                           'CI = confidence interval. By-participant random slopes were included for every effect.'))

Table 8: Mixed-effects model for the selection of lexical covariates in the lexical decision study.

| | β | SE | 95% CI | t | p |
|---|---|---|---|---|---|
| (Intercept) | 0.00 | 0.01 | [-0.01, 0.01] | -0.02 | .981 |
| Word frequency | -0.12 | 0.01 | [-0.15, -0.10] | -11.60 | <.001 |
| Number of letters | 0.05 | 0.02 | [0.01, 0.09] | 2.73 | .006 |
| Number of syllables | 0.06 | 0.01 | [0.03, 0.09] | 4.43 | <.001 |
| Orthographic Levenshtein distance | 0.10 | 0.02 | [0.05, 0.14] | 4.52 | <.001 |
| Phonological Levenshtein distance | -0.02 | 0.02 | [-0.06, 0.02] | -1.18 | .238 |

Note. β = Estimate based on \(z\)-scored predictors; SE = standard error; CI = confidence interval. By-participant random slopes were included for every effect.
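
The note above refers to \(z\)-scored predictors. As a minimal sketch of what that transformation involves, the following code standardises a hypothetical vector, `word_length`, whose values are invented for illustration. It is not necessarily the procedure used in the study.

```r
# Hypothetical raw values (not taken from the study)
word_length = c(3, 5, 6, 8, 10)

# z-score: subtract the mean and divide by the standard deviation
z_word_length = (word_length - mean(word_length)) / sd(word_length)

# Same transformation using base R's scale(), which returns a one-column matrix
z_word_length_scaled = as.numeric(scale(word_length))

# Check that both approaches agree
all.equal(z_word_length, z_word_length_scaled)
```

With predictors on this scale, each estimate expresses the expected change in the dependent variable per one standard deviation increase in the predictor, which makes the estimates comparable across predictors measured in different units.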

Word frequency had the largest effect in the model (β = -0.12) and is therefore used as the predictor of interest in the main analysis. Considering the maximum correlation allowed (\(r\) = \(\pm\).70) and the results of the model for the remaining four variables, the variable that will be included as a covariate in the main analysis is orthographic Levenshtein distance.
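
The ranking part of this selection can be sketched in a few lines of base R using the estimates from Table 8. This is one possible reading of the selection rule, not the author's own code, and the object names are hypothetical. The check against the maximum correlation allowed is not covered because the correlation values from Figure 26 are not reproduced in the text.

```r
# Estimates copied from Table 8
estimates = c('Word frequency' = -0.12,
              'Number of letters' = 0.05,
              'Number of syllables' = 0.06,
              'Orthographic Levenshtein distance' = 0.10,
              'Phonological Levenshtein distance' = -0.02)

# Rank the variables by the absolute size of their effect
ranked = names(sort(abs(estimates), decreasing = TRUE))

# Largest effect overall: predictor of interest
predictor_of_interest = ranked[1]

# Largest effect among the remaining four: candidate covariate
candidate_covariate = ranked[2]

print(predictor_of_interest)
print(candidate_covariate)
```

Under these assumptions, the ranking returns word frequency first and orthographic Levenshtein distance second, which matches the selection reported above.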




Pablo Bernabeu, 2022. Licence: CC BY 4.0.
Thesis: https://doi.org/10.17635/lancaster/thesis/1795.

Online book created using the R package bookdown.