The present studies
We revisit three larger-than-average studies to investigate the interplay between language and embodiment in conceptual processing. We devote one study to each of the original data sets. Thus, Study 2.1 is centred on Hutchison et al. (2013) and uses the semantic priming paradigm. Study 2.2 is centred on Pexman et al. (2017) and uses the semantic decision paradigm. Study 2.3 is centred on Balota et al. (2007) and uses the lexical decision paradigm. Each of these central studies contained measures of participants’ vocabulary size and gender. Furthermore, the core data sets were expanded by adding variables that captured the language-based information in words (Mandera et al., 2017; Wingfield & Connell, 2022b) and the vision-based information in words (Lynott et al., 2020; Petilli et al., 2021)—the latter being used to represent the embodiment system. One of the key questions we investigated using this array of variables was whether individual differences in vocabulary and gender modulated participants’ sensitivity to the language-based and vision-based information in words. Alongside the effects of interest, several covariates were included in the models to allow a rigorous analysis (Sassenhagen & Alday, 2016). These covariates comprised measures of general cognition and lexical characteristics of the stimulus words. Last, in each study, we performed a statistical power analysis to help estimate the sample size needed to investigate a variety of effects in future studies.
Below, we delve into the language and the embodiment components of these studies.
Language
Studies have operationalised the language system at the word level using measures that capture the relationships among words without explicitly drawing on any sensory or affective modalities. Two main types of linguistic measures exist: those based on text corpora—dubbed word co-occurrence measures (Bullinaria & Levy, 2007; Petilli et al., 2021; Wingfield & Connell, 2022b)—and those based on associations collected from human participants—dubbed word association measures (De Deyne et al., 2016, 2019). Notwithstanding the interrelation between word co-occurrence and word association (Planchuelo et al., 2022), co-occurrence is more purely linguistic, whereas association indirectly captures more of the sensory and affective meaning of words (De Deyne et al., 2021).
Operationalisation and hypotheses
In Study 2.1 (semantic priming) and Study 2.2 (semantic decision), co-occurrence measures were used to represent the language system at the word level. Specifically, in Study 2.1, this measure was called language-based similarity, and it was based on the degree of text-based co-occurrence between the prime word and the target word in each trial (Mandera et al., 2017). In Study 2.2, the measure was called word co-occurrence, and it was based on the degree of text-based co-occurrence between each stimulus word and the words ‘abstract’ and ‘concrete’ (Wingfield & Connell, 2022b). In Study 2.3 (lexical decision), a co-occurrence measure could not be used, as the co-occurrence of words in consecutive trials could not be calculated due to the high frequency of nonword trials throughout the lexical decision task. Therefore, a single-word measure had to be used instead. Word frequency was chosen because, among the five lexical variables considered, it had the largest effect (see Appendix A).
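As a minimal illustration of how a co-occurrence-based similarity can be derived, the sketch below computes the cosine similarity between toy co-occurrence vectors. This is not the actual pipeline of Mandera et al. (2017) or Wingfield and Connell (2022b), which rely on large corpora and high-dimensional vectors; the words, counts and function name are invented for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two co-occurrence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy co-occurrence counts across four hypothetical context words.
vectors = {
    "dog": [10, 2, 0, 4],
    "cat": [8, 3, 1, 5],
    "idea": [0, 7, 9, 1],
}

# Prime-target similarity, in the spirit of language-based similarity in Study 2.1.
print(cosine_similarity(vectors["dog"], vectors["cat"]))   # related pair: higher
print(cosine_similarity(vectors["dog"], vectors["idea"]))  # unrelated pair: lower
```

In practice, such similarities are computed over vectors with thousands of dimensions, typically after dimensionality reduction.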
At the individual level, language was represented by participants’ vocabulary size in Studies 2.1 and 2.2, and by participants’ vocabulary age in Study 2.3. The difference between these two measures was not consequential for our purposes: both captured the amount of vocabulary knowledge of each participant by testing their knowledge of a small sample of pre-normed words, and thereby inferring their overall knowledge.
We hypothesised that word co-occurrence, word frequency and vocabulary size would all have facilitatory effects on participants’ performance, with higher values leading to shorter RTs (Pexman & Yap, 2018; Wingfield & Connell, 2022b; Yap et al., 2009).
Embodiment represented by vision-based information
In previous studies, the embodiment system has been represented at the word level by perceptual, motor, affective or social variables (Fernandino et al., 2022; Vigliocco et al., 2009; X. Wang et al., 2021). For instance, the perceptual modalities have often corresponded to the five Aristotelian senses—vision, hearing, touch, taste and smell (Bernabeu et al., 2017, 2021; Louwerse & Connell, 2011)—and, less often, to interoception (Connell et al., 2018). Yet, out of all these domains, vision has been most frequently used in research (e.g., Bottini et al., 2021; De Deyne et al., 2021; Pearson & Kosslyn, 2015; Petilli et al., 2021; Yee et al., 2012). The hegemony of vision is likely due to the central position of vision in the human brain (Reilly et al., 2020) as well as in several languages (Bernabeu, 2018; I.-H. Chen et al., 2019; Lynott et al., 2020; Miceli et al., 2021; Morucci et al., 2019; Roque et al., 2015; Speed & Brysbaert, 2021; Speed & Majid, 2020; Vergallito et al., 2020; Winter et al., 2018; Zhong et al., 2022). In the present study, we focussed on vision alone for three reasons. First, we wanted to use a single variable to represent sensorimotor information, just as a single variable would be used to represent linguistic information. Using a single variable for each system facilitates the analysis of interactions with other variables. Second, vision is very prominent in cognition, as we just reviewed. Third, we had planned to use the present research to determine the sample size of a subsequent study that focusses on vision (indeed, the present study grew out of a statistical power analysis).
Operationalisation and hypotheses
At the word level, we operationalised visual information using the visual strength variable from the Lancaster Sensorimotor Norms (Lynott et al., 2020). This variable measures the degree of visual experience associated with concepts. In Study 2.1, we created the variable visual-strength difference by subtracting the visual strength of the prime word from that of the target word in each trial. Thus, visual-strength difference measured—in each trial—how much the prime word and the target word differed in their degrees of vision-based information. Even though we could not find any previous studies that reported the effect of visual strength (or visual-strength difference) on RT, we hypothesised a priming effect underpinned by this variable, consistent with related research (Petilli et al., 2021). Specifically, we hypothesised that visual-strength difference would have an inhibitory effect on participants’ performance, with higher values leading to longer RTs.
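In code, the derivation of this variable amounts to a per-trial subtraction. The sketch below uses invented norm values and word pairs; real values come from the Lancaster Sensorimotor Norms (Lynott et al., 2020).

```python
# Minimal sketch: deriving visual-strength difference per trial.
# Norm values below are invented for illustration.
visual_strength = {"apple": 4.5, "thunder": 2.1, "sparkle": 4.8}

trials = [
    {"prime": "apple", "target": "sparkle"},
    {"prime": "thunder", "target": "apple"},
]

for trial in trials:
    # Target minus prime, as defined for Study 2.1.
    trial["visual_strength_difference"] = (
        visual_strength[trial["target"]] - visual_strength[trial["prime"]]
    )

print(round(trials[0]["visual_strength_difference"], 2))  # 0.3
print(round(trials[1]["visual_strength_difference"], 2))  # 2.4
```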
In Studies 2.2 and 2.3, we used the visual strength score per stimulus word. We hypothesised that this variable would have a facilitatory effect on participants’ performance—i.e., higher values leading to shorter RTs—consistent with related research (Petilli et al., 2021).
Unlike language, vision was not examined at the individual level because the available variables were based on one self-reported value per participant (Balota et al., 2007; Hutchison et al., 2013), contrasting with the greater precision of the vocabulary measures, which consisted of multiple trials. Nonetheless, we recognise the need to investigate the role of perceptual experience (Muraki & Pexman, 2021; Plaut & Booth, 2000) alongside that of linguistic experience in the future.
Levels of analysis
Experimental data in psycholinguistics can be divided into various levels, such as individuals, words and tasks. The simultaneous examination of a theory across several levels is expected to enhance our understanding of the theory (Ostarek & Bottini, 2021)—for instance, by revealing the distribution of explanatory power (that is, effect size) within and across these levels. Several studies have probed more than one level—for instance, word level and individual level (Aujla, 2021; Lim et al., 2020; Pexman & Yap, 2018; Yap et al., 2009), or word level and task level (Al-Azary et al., 2022; Connell & Lynott, 2013, 2014a; Ostarek & Huettig, 2019; Petilli et al., 2021). This multilevel approach is complementary to a different line of research that aims to test the causality of various sources of information in conceptual processing, such as language (Ponari, Norbury, Rotaru, et al., 2018), perception (Stasenko et al., 2014) and action (Speed et al., 2017).
The three levels considered in this study—individual, word and task—are described below.
Individual level
The individual level is concerned with the role of individual differences in domains such as language, perception, mental imagery and physical experience (e.g., Daidone & Darcy, 2021; Davies et al., 2017; Dils & Boroditsky, 2010; Fetterman et al., 2018; Holt & Beilock, 2006; Mak & Willems, 2019; Miceli et al., 2022; Pexman & Yap, 2018; Vukovic & Williams, 2015; Yap et al., 2009, 2012, 2017).5 Recent studies have revealed important roles of participant-specific variables in topics where these variables have not traditionally been considered (DeLuca et al., 2019; Kos et al., 2012; Montero-Melis, 2021).
Vocabulary size was used to represent the language system at the individual level. It measured the number of words a participant could recognise out of a sample. Furthermore, covariates akin to general cognition—where available—were included in the models (see Covariates section below).
Word level
The word level is concerned with the lexical and semantic information in words (e.g., De Deyne et al., 2021; Lam et al., 2015; Lund et al., 1995; Lund & Burgess, 1996; Lynott et al., 2020; Mandera et al., 2017; Petilli et al., 2021; Pexman et al., 2017; Santos et al., 2011; Wingfield & Connell, 2022b). The word-level variables of interest in this study are language-based and vision-based information (both described above). The covariates are lexical variables and word concreteness. The lexical covariates were selected in each study out of the same five variables (see Covariates section below).
Task level
The task level is concerned with experimental conditions affecting, for instance, processing speed. In Study 2.1 (semantic priming), there is one task-level factor, namely, stimulus onset asynchrony (SOA), which measures the temporal interval between the onset of the prime word and the onset of the target word.6 In Studies 2.2 and 2.3, there are no task-level variables.
Beyond task-level variables, there is an additional source of task-related information across the three studies—namely, the experimental paradigm used in each study (i.e., semantic priming, semantic decision and lexical decision). Indeed, it is possible to examine how the effects vary across these paradigms (see Wingfield & Connell, 2022b). This comparison, however, must be considered cautiously due to the existence of other non-trivial differences across these studies, such as the numbers of observations. With this caveat noted, the tasks used across these studies likely elicit varying degrees of semantic depth, as ordered below (see Balota & Lorch, 1986; Barsalou et al., 2008; Becker et al., 1997; de Wit & Kinoshita, 2015; Joordens & Becker, 1997; Lam et al., 2015; Muraki & Pexman, 2021; Ostarek & Huettig, 2017; Versace et al., 2021; Wingfield & Connell, 2022b).
Semantic decision (Study 2.2) likely elicits the deepest semantic processing, as the instructions of this task ask for a concreteness judgement. In this task, participants are asked to classify words as abstract or concrete, which elicits deeper semantic processing than the task of identifying word forms—i.e., lexical decision (de Wit & Kinoshita, 2015).
Semantic priming (Study 2.1). The task administered to participants in semantic priming studies is often lexical decision, as in Study 2.1 below. The fundamental characteristic of semantic priming is that, in each trial, a prime word is briefly presented before the target word. The prime word is not directly relevant to the task, as participants respond to the target word. Nonetheless, participants normally process both the prime word and the target word in each trial, and this combination allows researchers to analyse responses based on the prime–target relationship. In this regard, this paradigm could be considered more deeply semantic than lexical decision. Indeed, slower responses in semantic priming studies—reflecting difficult lexical decisions—have been linked to larger priming effects (Balota et al., 2008; Hoedemaker & Gordon, 2014; Yap et al., 2013), revealing a degree of semantic association that has not been identified in the lexical decision task.
Lexical decision (Study 2.3) is likely the semantically-shallowest task of these three, as it focusses solely on the identification of word forms.
Hypotheses
The central objective of the present studies is the simultaneous investigation of language-based and vision-based information, along with the interactions between each of those and vocabulary size, gender and presentation speed (i.e., SOA). Previous studies have examined subsets of these effects using the same data sets we are using (Balota et al., 2007; Petilli et al., 2021; Pexman et al., 2017; Pexman & Yap, 2018; Wingfield & Connell, 2022b; Yap et al., 2012, 2017). Out of these studies, only Petilli et al. (2021) investigated both language and vision. However, in contrast to our present study, Petilli et al. did not examine the role of vocabulary size or any other individual differences, instead collapsing the data across participants.
In addition to main effects of the aforementioned variables, our three studies have four interactions in common: (1a) language-based information × vocabulary size, (1b) vision-based information × vocabulary size, (2a) language-based information × participants’ gender, and (2b) vision-based information × participants’ gender. In addition, Study 2.1 contained two further interactions: (3a) language-based information × SOA, (3b) vision-based information × SOA (note that the names of some predictors vary across studies, as detailed in the present studies section above). Each interaction and the corresponding hypotheses are addressed below.
1a. Language-based information × vocabulary size
We outline three hypotheses supported by literature regarding the interaction between language-based information and participants’ vocabulary size.
Larger vocabulary, larger effects. Higher-vocabulary participants might be more sensitive to linguistic features than lower-vocabulary participants, thanks to a larger number of semantic associations (Connell, 2019; Landauer et al., 1998; Louwerse et al., 2015; Paivio, 1990; Pylyshyn, 1973). For instance, Yap et al. (2017) revisited the semantic priming study of Hutchison et al. (2013) and observed a larger semantic priming effect in higher-vocabulary participants.
Larger vocabulary, smaller effects. Higher-vocabulary participants might be less sensitive to linguistic features, thanks to a more automated language processing (Perfetti & Hart, 2002). Some of the evidence aligned with this hypothesis was obtained by Yap et al. (2009), who observed a smaller semantic priming effect in higher-vocabulary participants. Similarly, Yap et al. (2012) found that higher-vocabulary participants in a lexical decision task (Balota et al., 2007) were less sensitive to a cluster of lexical and semantic features (i.e., word frequency, semantic neighborhood density and number of senses).
Larger vocabulary, more task-relevant effects. Higher-vocabulary participants might present a greater sensitivity to task-relevant variables, borne out of their greater linguistic experience, relative to lower-vocabulary participants. This would be consistent with the findings of Pexman and Yap (2018), who revisited the semantic decision study of Pexman et al. (2017). The semantic decision task of Pexman et al. consisted of classifying words as abstract or concrete. Pexman and Yap found that word concreteness—a very relevant source of information for this task—was more influential in higher-vocabulary participants than in lower-vocabulary ones. In contrast, word frequency and age of acquisition—not as relevant to the task—were more influential in lower-vocabulary participants (also see Lim et al., 2020). In our present studies, we set our hypotheses regarding the ‘task-relevance advantage’ by working under the assumption that the language-based information in words—represented by one variable in each study—is important for the three tasks, given the large effects of language across tasks (Banks et al., 2021; Kiela & Bottou, 2014; Lam et al., 2015; Louwerse et al., 2015; Pecher et al., 1998; Petilli et al., 2021). Therefore, the relevance hypothesis predicts that higher-vocabulary participants—compared to lower-vocabulary ones—will be more sensitive to language-based information (as represented by language-based similarity in Study 2.1, word co-occurrence in Study 2.2, and word frequency in Study 2.3).
1b. Vision-based information × vocabulary size
To our knowledge, no previous studies have investigated the interaction between vision-based information and participants’ vocabulary size. We entertained two hypotheses. First, lower-vocabulary participants might be more sensitive to visual strength than higher-vocabulary participants. In this way, lower-vocabulary participants might compensate for their disadvantage on the language side. Second, we considered the possibility that there would be no interaction effect.
2a. Language-based information × gender
We entertained two hypotheses regarding the interaction between language-based information and participants’ gender: (a) that the language system would be more important in female participants than in males (Burman et al., 2008; Hutchinson & Louwerse, 2013; Jung et al., 2019; Ullman et al., 2008), and (b) that this interaction effect would be absent, as a recent review suggested that gender differences are negligible in the general population (Wallentin, 2020).
2b. Vision-based information × gender
To our knowledge, no previous studies have investigated the interaction between vision-based information and participants’ gender. We entertained two hypotheses. Our first hypothesis was that this interaction would stand opposite to the interaction between language and gender. That is, if female participants were to present a greater role of language-based information, male participants would present a greater role of vision-based information, thereby compensating for the disadvantage on the language side. Our second hypothesis was the absence of this interaction effect (see Wallentin, 2020).
3a. Language-based information × SOA
Previous research predicts that language-based information will have a larger effect with the short SOA than with the long one (Lam et al., 2015; Petilli et al., 2021), which also aligns with research demonstrating the fast activation of language-based information (Louwerse & Connell, 2011; Santos et al., 2011; Simmons et al., 2008).
3b. Vision-based information × SOA
The interaction between vision-based information and SOA allows three hypotheses. First, some previous research predicts that the role of vision-based information will be more prevalent with the long SOA than with the short one (Louwerse & Connell, 2011; Santos et al., 2011; Simmons et al., 2008; also see Barsalou et al., 2008). Second, in contrast, other research (Petilli et al., 2021) based on the same data that we are analysing (Hutchison et al., 2013) predicts vision-based priming only with the short SOA (200 ms), and not with the long one (1,200 ms). Third, other research does not predict any vision-based priming effect (Hutchison, 2003; Ostarek & Huettig, 2017; Pecher et al., 1998; Yee et al., 2012). In this regard, some studies have observed vision-based priming when the task was preceded by another task that required attention to visual features of concepts (Pecher et al., 1998; Yee et al., 2012), but the present data (Hutchison et al., 2013) does not contain such a prior task.
Language and vision across studies
Next, we consider our hypotheses regarding the role of language and vision across studies. Yet, before addressing those, we reiterate that caution is required due to the existence of other differences across these studies, such as the number of observations. First, we hypothesise that language-based information will be relevant in the three studies due to the consistent importance of language observed in past studies (Banks et al., 2021; Kiela & Bottou, 2014; Lam et al., 2015; Louwerse et al., 2015; Pecher et al., 1998; Petilli et al., 2021). Second, the extant evidence regarding vision-based information is less conclusive. Some studies have observed effects of vision-based information (Connell & Lynott, 2014a; Flores d’Arcais et al., 1985; Petilli et al., 2021; Schreuder et al., 1984), whereas others have not (Hutchison, 2003; Ostarek & Huettig, 2017), and a third set of studies have only observed them when the critical task was preceded by a task that required attention to visual features of concepts (Pecher et al., 1998; Yee et al., 2012). Based on these precedents, we hypothesise that vision-based information will be relevant in semantic decision, whereas it might or might not be relevant in semantic priming and in lexical decision.
Statistical power analysis
Statistical power depends on the following factors: (1) sample size—comprising the number of participants, items, trials, etc.—, (2) effect size, (3) measurement variability and (4) number of comparisons being performed. Out of these, sample size is the factor that can best be controlled by researchers (Kumle et al., 2021). The three studies we present below, containing larger-than-average sample sizes, offer an opportunity to perform an a priori power analysis to help determine the sample size of future studies (Albers & Lakens, 2018).
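As a rough illustration of how power relates to sample size, the following Monte Carlo sketch estimates power for a simple two-group comparison of means. It is a simplified stand-in for the mixed-effects simulations used in power analyses of this kind (see Kumle et al., 2021); the function name, effect size and thresholds are illustrative assumptions, not values from our studies.

```python
import random
import statistics

def estimate_power(n_per_group, effect=0.2, sd=1.0, n_sims=1000, z_crit=1.96, seed=1):
    """Monte Carlo power estimate for a two-group comparison of means."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_sims):
        a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        se = ((statistics.variance(a) + statistics.variance(b)) / n_per_group) ** 0.5
        z = (statistics.mean(b) - statistics.mean(a)) / se
        if abs(z) > z_crit:
            significant += 1
    return significant / n_sims

# Power grows with sample size, holding the effect size constant.
for n in (25, 100, 400):
    print(n, estimate_power(n))
```

Holding the effect size constant, the estimated power rises as the per-group sample size grows, which is why larger-than-average data sets are especially useful for planning future sample sizes.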
Motivations
Insufficient statistical power lowers the reliability of effect sizes, and increases the likelihood of false positive results—i.e., Type I errors—as well as the likelihood of false negative results—i.e., Type II errors (Gelman & Carlin, 2014; Loken & Gelman, 2017; Tversky & Kahneman, 1971; von der Malsburg & Angele, 2017). For instance, Vasishth and Gelman (2021) illustrate how, in low-powered studies, effect sizes associated with significant results tend to be overestimated (also see Vasishth, Mertzen, et al., 2018).
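The overestimation described by Vasishth and Gelman (2021) can be reproduced in a few lines. The simulation below (a sketch with invented parameters, not their analysis) repeatedly samples from a population with a small true effect and averages only the 'significant' estimates.

```python
import random
import statistics

# Minimal sketch of effect-size overestimation under low power
# ("Type M error"; Gelman & Carlin, 2014). All numbers are illustrative.
rng = random.Random(42)
true_effect, sd, n = 0.2, 1.0, 25

significant_estimates = []
for _ in range(5000):
    sample = [rng.gauss(true_effect, sd) for _ in range(n)]
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    if abs(mean / se) > 1.96:  # 'significant' at roughly alpha = .05
        significant_estimates.append(mean)

# Among significant results, the average estimate exceeds the true effect.
print(round(statistics.mean(significant_estimates), 2))  # well above 0.2
```

Because only unusually large sample means cross the significance threshold at this sample size, the significant estimates systematically exceed the true effect.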
Over the past decade, replication studies and power analyses have uncovered insufficient sample sizes in psychology (Brysbaert, 2019; Heyman et al., 2018; Lynott et al., 2014; Montero-Melis et al., 2017, 2022; Rodríguez-Ferreiro et al., 2020; Vasishth, Mertzen, et al., 2018). In one of these studies, Heyman et al. (2018) demonstrated that increasing the sample size increased the reliability of the estimates, which in turn lowered the Type I and Type II error rates—i.e., false positive and false negative results, respectively. Calls for larger sample sizes have also been voiced in the field of neuroscience. For instance, Marek et al. (2022) estimated the sample size that would be required to reliably study the mapping between individual differences—such as general cognition—and brain structures. The authors found that the current median of 25 participants in each of these studies contrasted with the thousands of participants—around 10,000—that would be needed for a well-powered study (also see Button et al., 2013).
More topic-specific power analyses are necessary for several reasons. First, power analyses provide greater certainty about the reasons behind non-replications (e.g., Open Science Collaboration, 2015), and behind non-significant results at large. Non-replications are not solely explained by methodological differences across studies, questionable research practices and publication bias (C. J. Anderson et al., 2016; Barsalou, 2019; Corker et al., 2014; Gilbert et al., 2016; Williams, 2014; Zwaan, 2014; also see Tiokhin et al., 2021). In addition to these factors, a lack of statistical power can cause non-replications and non-significant results (see Loken & Gelman, 2017; Vasishth & Gelman, 2021).
Regarding non-significant results, it is worthwhile to consider some examples from research on individual differences. In this literature, there is a body of non-significant results, both in behavioural studies (Daidone & Darcy, 2021; Hedge et al., 2018; Muraki & Pexman, 2021; Ponari, Norbury, Rotaru, et al., 2018; Rodríguez-Ferreiro et al., 2020; for a Bayes factor analysis, see Rouder & Haaf, 2019) and in neuroscientific studies (Diaz et al., 2021). A greater availability of power analyses within this topic area and others will at least shed light on the influence of statistical power on the results. Furthermore, power analyses facilitate the identification of sensible sample sizes for future studies. Last, it should be noted that although increasing the statistical power comes at a cost in the short term, power analyses will help maximise the use of research funding in the long term by fostering more replicable research (see Vasishth & Gelman, 2021; Open Science Collaboration, 2015).
Footnotes
According to Lamiell (2019), ‘individual differences’ is a misnomer in that the analyses used to examine those (e.g., regression) are not participant-specific. While this may partly hold for the current study too, the use of by-participant random effects increases the role of individuals in the analysis.↩︎
The names of all variables used in the analyses were slightly adjusted for this text to facilitate their understanding—for instance, by replacing underscores with spaces (conversions reflected in the scripts available at http://doi.org/10.17605/OSF.IO/UERYQ). One specific case deserves further comment. We use the formula of the SOA in this paper, instead of the ‘interstimulus interval’ (ISI)—which we used in the analysis—, as the SOA has been more commonly used in previous papers (e.g., Hutchison et al., 2013; Pecher et al., 1998; Petilli et al., 2021; Yap et al., 2017). In our analysis, we used the ISI formula as it was the one present in the data set of Hutchison et al. (2013)—retrieved from https://www.montana.edu/attmemlab/documents/all%20ldt%20subs_all%20trials3.xlsx. The only difference between these formulas is that the ISI does not count the presentation of the prime word. In the current study (Hutchison et al., 2013), the presentation of the prime word lasted 150 ms. Therefore, the 50 ms ISI is equivalent to a 200 ms SOA, and the 1,050 ms ISI is equivalent to a 1,200 ms SOA. The use of either formula in the analysis would not affect our results, as the ISI conditions were recoded as -0.5 and 0.5 (Brauer & Curtin, 2018).↩︎
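The conversion described in this footnote can be written out as follows (a trivial sketch; the function and variable names are ours):

```python
# Converting ISI to SOA in the Hutchison et al. (2013) data.
# The SOA counts from prime onset to target onset, so it adds the prime duration.
PRIME_DURATION_MS = 150

def isi_to_soa(isi_ms):
    return isi_ms + PRIME_DURATION_MS

print(isi_to_soa(50))    # 200 ms SOA
print(isi_to_soa(1050))  # 1200 ms SOA

# In the analysis, the two conditions were recoded as -0.5 and 0.5
# (Brauer & Curtin, 2018), so either formula yields identical results.
soa_coding = {200: -0.5, 1200: 0.5}
```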
Thesis: https://doi.org/10.17635/lancaster/thesis/1795.