General discussion of Study 2
In the present study, we have revisited three existing data sets in conceptual processing to investigate the interplay between language-based and vision-based information. Specifically, we have investigated how this interplay is modulated by individual differences in vocabulary size, by the linguistic and visual information contained in words, and by contextual demands such as semantic depth and presentation speed. Although both language and vision played significant roles in some contexts (detailed below), the main effects and the interactions of language-based information were larger than those of vision-based information, consistent with previous research (Banks et al., 2021; Kiela & Bottou, 2014; Lam et al., 2015; Louwerse et al., 2015; Pecher et al., 1998; Petilli et al., 2021).
In our current approach, the sensorimotor domain was represented by a single variable in each study, just as the language domain was represented by a single variable. In the sensorimotor domain, we focussed on vision due to its hegemonic role in the human brain (Reilly et al., 2020) as well as in several languages (Bernabeu, 2018; I.-H. Chen et al., 2019; Lynott et al., 2020; Miceli et al., 2021; Morucci et al., 2019; Roque et al., 2015; Speed & Brysbaert, 2021; Speed & Majid, 2020; Vergallito et al., 2020; Winter et al., 2018; Zhong et al., 2022). Notably, vision was also the domain chosen in a recent study that strongly influenced the present study (Petilli et al., 2021), as well as in previous studies (Bottini et al., 2021; De Deyne et al., 2021; Pearson & Kosslyn, 2015; Yee et al., 2012). In contrast to this parsimonious approach, more comprehensive alternatives could be used in future research to consider more sensorimotor domains. The first of these alternatives is the preselection approach, which incorporates a step prior to the main analysis. In this prior step, a selection is performed among a large variety of word-level information, including visual, auditory and motor information (Bernabeu et al., 2021). Selecting a single variable provides a convenient way to compare the role of sensorimotor information to that of linguistic information, provided the latter is also represented by a single variable. The second approach is to use a variable that aggregates sensorimotor information (Wingfield & Connell, 2022a). The third approach would be to use more than one variable to represent sensorimotor information in the main analysis. This would complicate the analysis of interactions with other variables, as the overall number of terms in the model could quickly exceed the maximum normally encountered in mixed-effects models—that is, around 15.
If random slopes are included for all the effects of interest (see Brauer & Curtin, 2018; Singmann & Kellen, 2019), the model would most likely present convergence warnings. In the face of this challenge, authors could either probe into those warnings (see Appendix B) or opt for a different method, such as linear regression or machine learning. Ultimately, in any selection of variables, there is a trade-off between parsimony and comprehensiveness, and negotiating this trade-off often involves a certain degree of arbitrariness. A time-consuming, stepwise selection can help reduce this arbitrariness (for an example, see Appendix A).
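The combinatorial pressure described above can be illustrated with a brief sketch. The analyses themselves were conducted as mixed-effects models in R, so the Python snippet below is purely illustrative, and the predictor names are hypothetical; it simply counts the fixed-effect terms that result when main effects and two-way interactions are crossed.

```python
from itertools import combinations

def count_fixed_effect_terms(predictors, max_order=2):
    """Count intercept + main effects + interactions up to max_order."""
    terms = 1  # intercept
    for order in range(1, max_order + 1):
        terms += len(list(combinations(predictors, order)))
    return terms

# One language and one sensorimotor variable, plus two moderators
parsimonious = ["language", "vision", "vocabulary", "SOA"]
# Adding auditory and motor variables as well
comprehensive = parsimonious + ["auditory", "motor"]

print(count_fixed_effect_terms(parsimonious))   # 1 + 4 + 6 = 11
print(count_fixed_effect_terms(comprehensive))  # 1 + 6 + 15 = 22
```

Adding just two sensorimotor domains pushes the model past the approximately 15 terms mentioned above, before any random slopes for those terms are even considered.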
Insofar as both ‘language’ and ‘vision’ were present in the models, it is (arguably) valid to draw conclusions based on them (see Louwerse, 2011; Louwerse & Connell, 2011; Santos et al., 2011; Simmons et al., 2008). In contrast, when only one of these variables is analysed, it may contain information from the other variable. If the superiority of language is genuine—rather than a spurious reflection of sensorimotor information—the present results suggest that language is the main source of information in conceptual processing, whereas sensorimotor information provides extra help, especially for higher-vocabulary individuals (see Study 2.2, Semantic decision) and in deeper semantic tasks (refer to the task-relevance advantage above). Ultimately, should sensorimotor simulation be considered smaller but nonetheless important—especially for some individuals and in some contexts—or should it be considered a negligible by-product of conceptual processing (Mahon & Caramazza, 2008)? Although the jury is still out, the present results support the tenet that sensorimotor simulation is smaller yet important, especially for some individuals and in some contexts, whereas language is important across the board.
Furthermore, it is necessary to acknowledge a longstanding caveat in the present topic, which also affects the present study. That is, it is extremely difficult to ascertain whether our variables encode what we intend for them to encode. Specifically, it is possible that the variables for language-based information encode some sensorimotor information, and vice versa. To address this caveat, future research could combine the use of continuous word-level variables—as used in the present study—with the use of brain-level measurements (see Borghesani et al., 2016). Specifically, such an investigation should examine whether language-based information is primarily circumscribed to the brain regions in charge of semantic retrieval—such as the posterior left inferior frontal gyrus, the right posterior inferior frontal gyrus, the left anterior superior temporal gyrus and sulcus, and the left middle and posterior middle temporal gyrus (Hagoort, 2017; Skeide & Friederici, 2016). Conversely, this investigation should also examine whether vision-based information is primarily circumscribed to the brain regions in charge of visual semantic information—such as Brodmann area 17, in the occipital lobe, corresponding to primary visual cortex (Borghesani et al., 2016). Due to the importance of the time course, a method that provides both spatial and temporal resolution, such as magnetoencephalography, would be ideally suited for this research. If both sources of information are largely circumscribed to their regions of interest in the brain, we could conclude that the variables are valid. In contrast, if there are drifts in the processing—whereby language-based information is consistently associated with activation in primary visual cortex, or whereby vision-based information is associated with activation in the language regions of interest—, we would need to question the validity of the variables.
As an alternative to the above design, a thriftier method would be to use two clusters of covariates. One of these clusters would be primarily associated with language-based information, whereas the other cluster would be primarily associated with vision-based information.18 This research should examine whether the variables in each cluster all behave similarly, or whether—instead—there are any drifts between language and vision. As in the above design, the absence of drifts would validate the operationalisation of the two sides of the dichotomy, whereas the presence of drifts would call their validity into question.
The present analysis controlled important sources of variance in the fixed effects and in the random effects. First, in the fixed effects, covariates such as word concreteness and individual differences in general cognition were included in the models. It was important to include these covariates as they were substantially correlated with some of our variables of interest, and research has suggested that these covariates may represent fundamentally different processes from those of our variables of interest. For instance, word concreteness and visual strength were highly correlated. However, whereas visual strength indexes a perceptual component of semantic information, word concreteness might be circumscribed to the lexical level, which does not require the processing of meaning (Bottini et al., 2021; cf. Connell & Lynott, 2012; Pexman & Yap, 2018). Similarly, it was important to control for individual differences in general cognition measures as covariates of vocabulary size (Ratcliff et al., 2010; also see James et al., 2018; Pexman & Yap, 2018). We contend that controlling (or, in other words, statistically adjusting) for important covariates is a valuable asset of our present research. Furthermore, we think that the number of covariates we selected was enough but not excessive. We did not find any signs of overfitting in the models, as the variables that have been consistently influential in the literature were also influential in our current models. To further delve into the role of covariates in conceptual processing, we think that it would be valuable to investigate how the presence and the absence of several covariates in a model can affect the effect sizes and the significance results.19 Indeed, the differences between the results of Study 2.1 (semantic priming) and the results of Petilli et al. (2021) suggested that the influence of covariates can be very important. 
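The value of statistically adjusting for a correlated covariate can be conveyed with a small simulation. The Python sketch below is only illustrative (the variable names and coefficients are hypothetical, and the actual analyses used mixed-effects models in R): when a covariate that is correlated with the predictor of interest is omitted, the predictor absorbs part of the covariate's effect, an instance of omitted-variable bias.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Two correlated word-level predictors (cf. visual strength and
# concreteness) and an outcome driven by both.
visual = rng.normal(size=n)
concreteness = 0.8 * visual + 0.6 * rng.normal(size=n)  # highly correlated
rt = 1.0 * visual + 0.5 * concreteness + rng.normal(size=n)

def ols_slopes(predictors, y):
    """Ordinary least squares; returns the slope estimates."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

with_cov = ols_slopes([visual, concreteness], rt)[0]     # near the true 1.0
without_cov = ols_slopes([visual], rt)[0]                # inflated estimate
print(round(with_cov, 2), round(without_cov, 2))
```

With the covariate included, the estimate for the predictor of interest recovers its true value; without it, the estimate is inflated by the covariate's correlated contribution.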
However, because these analyses differed in other aspects of the models, a study focussed on covariates would be insightful (see Botvinik-Nezer et al., 2020; Perret & Bonin, 2019; E.-J. Wagenmakers et al., 2022).
Second, in the random effects, the models contained a maximal structure that accounted for far more variance than the fixed effects, thus providing a conservative analysis. Indeed, the maximal random-effects structure served to prevent violations of the independence of observations (Barr et al., 2013; Brauer & Curtin, 2018; Singmann & Kellen, 2019). Specifically, random intercepts and slopes ensured that sources of dependence such as participants and stimuli were kept outside of the fixed effects, which are the relevant effects for the conclusions of this (and most other) research in conceptual processing.
The RTs of higher-vocabulary participants were influenced by a smaller number of variables than those of lower-vocabulary participants. This converges with previous findings suggesting that higher and lower-vocabulary participants are affected by different variables. In this regard, some research has suggested that the variables affecting higher-vocabulary participants most are especially relevant to the task (Lim et al., 2020; Pexman & Yap, 2018; Yap et al., 2012, 2017). Our results were consistent with the ‘task-relevance advantage’ associated with greater vocabulary knowledge. Specifically, in lexical decision, higher-vocabulary participants were more sensitive than lower-vocabulary participants to language-based information. In contrast, in semantic decision, higher-vocabulary participants were more sensitive to word concreteness. In summary, the present findings suggest that greater linguistic experience may be associated with greater task adaptability during cognitive performance, with better comprehenders more able than poorer comprehenders to selectively attend to task-relevant features (Lim et al., 2020; Pexman & Yap, 2018).
In addition, the semantic priming paradigm analysed in Study 2.1 revealed that both language and vision were more important with the short SOA (200 ms) than with the long SOA (1,200 ms). This finding replicates some of the previous literature (Petilli et al., 2021) while highlighting the importance of the time course and the level of semantic processing. That is, although the finding seems to be at odds with the theory that perceptual simulation peaks after language-based associations (Barsalou et al., 2008; Louwerse & Connell, 2011), the long SOA may have been too long for perceptual simulation to be maintained in the lexical decision task that was performed by participants, which is semantically shallow (Petilli et al., 2021).
Operationalisation of variables and other analytical choices
We compared two measures of vision-based priming. The first measure—visual-strength difference—was operationalised as the difference in visual strength (Lynott et al., 2020) between the prime word and the target word in each trial. The second measure—vision-based similarity—was created by Petilli et al. (2021) and was based on vector representations trained on images. The results revealed that both measures—including their interactions with other variables—produced similar effect sizes. This underscores the consistency that exists between human ratings and computational approximations to meaning (e.g., Charbonnier & Wartena, 2019, 2020; Günther et al., 2016b; Louwerse et al., 2015; Mandera et al., 2017; Petilli et al., 2021; Solovyev, 2021; Wingfield & Connell, 2022b). However, the effect of the human-based variable was slightly larger, which is consistent with previous comparisons of human-based and computational measures (De Deyne et al., 2016, 2019; Gagné et al., 2016; Schmidtke et al., 2018; cf. Michaelov et al., 2022; Snefjella & Blank, 2020).
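The two operationalisations can be sketched in Python as follows. This is an illustrative reconstruction only: the ratings, the three-dimensional vectors and the exact direction of the difference score are hypothetical stand-ins, not the actual norms or embeddings used in the analyses.

```python
import numpy as np

def visual_strength_difference(prime_rating, target_rating):
    """Human-based measure: difference in rated visual strength
    between the prime word and the target word in a trial."""
    return prime_rating - target_rating

def vision_based_similarity(prime_vec, target_vec):
    """Computational measure: cosine similarity between
    image-derived vector representations of prime and target."""
    prime_vec, target_vec = np.asarray(prime_vec), np.asarray(target_vec)
    return float(prime_vec @ target_vec /
                 (np.linalg.norm(prime_vec) * np.linalg.norm(target_vec)))

# Hypothetical ratings and toy three-dimensional image embeddings
print(round(visual_strength_difference(4.2, 3.1), 1))           # 1.1
print(round(vision_based_similarity([1, 0, 1], [1, 1, 0]), 2))  # 0.5
```

The contrast between the two functions mirrors the broader contrast drawn below: the first rests entirely on human judgements, the second entirely on computational representations.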
In contrast to the results of Petilli et al. (2021), vision-based similarity did not significantly interact with SOA. Furthermore, in contrast to the main analysis, this sub-analysis did not present a significant interaction between language-based similarity and SOA. These two differences demonstrate how the results of our analyses can be critically influenced by analytical choices such as the operationalisation of variables and the degree of complexity of statistical models. In this regard, we must draw attention to an often-overlooked difference between the variables used to operationalise the language system—usually, text-based measures based on large corpora—and the variables used to operationalise the embodiment system—usually, human-based measures based on ratings. Critically, the literature contains many comparisons of text-based variables (De Deyne et al., 2013, 2016; Günther et al., 2016b, 2016a; M. N. Jones et al., 2006; Lund & Burgess, 1996; Mandera et al., 2017; Mikolov et al., 2013; Wingfield & Connell, 2022b), whereas the work on embodiment variables is sparser and tends to compare different modalities—e.g., valence, visual strength, auditory strength, etc. (Lynott et al., 2020; Lynott & Connell, 2009; Newcombe et al., 2012; for an exception, see Vergallito et al., 2020). This accident of history might in part account for the superiority of linguistic information over embodied information (see Banks et al., 2021; Kiela & Bottou, 2014; Lam et al., 2015; Louwerse et al., 2015; Pecher et al., 1998; Petilli et al., 2021). Therefore, it may be important to consider whether engineering work should be devoted to the betterment of embodiment variables. More generally, the present results suggest that research findings are fundamentally dependent on research methods.
Statistical power
Power analyses were performed to estimate the sample sizes required to reliably investigate a range of effects. The results suggested that 300 participants were sufficient to examine the effect of language-based information contained in words, whereas more than 1,000 participants were necessary for the effect of vision-based information and for the interactions of these variables with vocabulary size, gender and presentation speed. The large sample sizes required to investigate some of the effects relevant to embodied cognition and individual differences are not easily attainable with the usual organisation of funding in Psychology and Neuroscience.
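The logic of such a power analysis can be conveyed with a Monte Carlo sketch in Python. This is a deliberately simplified illustration (a single slope in ordinary regression, whereas the actual analyses simulated mixed-effects models with participant- and item-level variance), but it shows how a small effect that is underpowered at 300 participants can approach adequate power at 1,000.

```python
import numpy as np

def simulated_power(n, slope, n_sims=500, seed=1):
    """Monte Carlo power estimate for a standardised slope, using a
    two-sided test at alpha = .05 (large-sample critical value 1.96).
    Simulate data, test the slope each time, and return the
    proportion of significant results."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n)
        y = slope * x + rng.normal(size=n)
        xc, yc = x - x.mean(), y - y.mean()
        sxx = np.sum(xc ** 2)
        b = np.sum(xc * yc) / sxx                  # slope estimate
        resid = yc - b * xc
        se = np.sqrt(np.sum(resid ** 2) / (n - 2) / sxx)
        if abs(b / se) > 1.96:
            hits += 1
    return hits / n_sims

# A small, hypothetical effect (cf. vision-based information)
# evaluated at two sample sizes
print(simulated_power(300, 0.1), simulated_power(1000, 0.1))
```

With a standardised slope of 0.1, power is well below the conventional .80 threshold at 300 participants but rises markedly at 1,000, in line with the sample-size estimates reported above.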
References
Thesis: https://doi.org/10.17635/lancaster/thesis/1795.