Chapter 6 Discussion

Each of the disciplines that contribute to STEM learning (science, technology and computer science, engineering, and mathematics) involves work with data. In this study, engagement was used as a lens to understand the experience of youth working with data during summer STEM programs. In particular, five aspects of work with data, a) asking questions, b) observing phenomena, c) constructing measures and generating data, d) data modeling, and e) interpreting and communicating findings, occurred regularly in the programs. There were some examples of ambitious activities centered on working with real-world data, as well as others that highlight substantial heterogeneity in how work with data was enacted.

I identified six profiles of engagement using LPA. These profiles represented different configurations of how youth were working hard, learning, enjoying themselves, and feeling challenged and competent at the time they were signaled as part of the ESM approach. The five aspects of work with data and youth characteristics (pre-program interest in STEM and youths’ gender and status in terms of being a member of under-represented groups in STEM) were, overall, not strongly related to the profiles of engagement, though some key findings were identified. Generating and modeling data were both related to the most potentially beneficial profile (full engagement), one characterized by high levels of all five of the engagement variables.

This study suggests that work with data and contemporary engagement theory as interpreted in this study can serve as a frame to understand what youth do in summer STEM programs. These findings also show the value of an innovative method, ESM, and an analytic approach designed to identify engagement holistically, LPA, that together provide some access to youths’ in-the-moment experience of the activities they were involved in during the program. Data, and how youth and students in K-12 settings can themselves work with data, are an important, yet perhaps under-emphasized, part of STEM learning. In the remainder of this section, I discuss key findings with respect to a) work with data, b) youths’ engagement, and c) what relates to youths’ engagement. Some limitations and recommendations for future research, as well as implications for practice, are also identified and described.

6.4 Limitations to the Present Study and Recommendations for Research

To summarize the previous sections, work with data was frequent but varied in how it was enacted, and profiles of engagement representing different and interpretable configurations of five engagement-related variables were found. However, work with data and youths’ characteristics were not found to be strongly related to any of the profiles. This section details some limitations of the study that may provide insight into why such minimal relations to the profiles were found, and into other findings.

First, the programs participating in this study were not designed especially to support youth in work with data. Instead, the programs were designed around best practices for summer STEM programs, to support youth in engaging in a wide variety of STEM-related practices and in other activities, such as those intended to build a sense of camaraderie among the youth in the programs. In this study, aspects of work with data were identified and were found to be common, but some of the heterogeneity in the nature of working with data may be due to this reason: Planning and instruction for the programs did not aim to foster rich work with data any more than the other activities (STEM and otherwise) that made up their programming. In addition to the varied ways in which youth worked with data, some of the relations of the variables for the five aspects of work with data to youths’ engagement may be due to the fact that each of these variables captured many different ways of working with data. Some of these ways of working with data, particularly those that were highly specific with respect to how the data was involved and to how focused and sustained the data-related activity was, may be more engaging to youth than others, such as those that were more general, instructor-focused, or brief. These two types of working with data were treated as the same in the variables used to predict youths’ engagement. Future research can examine youths’ engagement in settings that are focused more on work with data, such as outside-of-school data science programs and K-12 units, to understand better how work with data engages youth. Nevertheless, this study does provide insight into how work with data took place during model (i.e., designed around best practices for such programs) summer STEM programs and how such work relates to youths’ engagement.

On a related point, it is important to note that while outside-of-school STEM programs have affordances, they also have some distinct features and some limitations. One feature is the substantial, but still limited, period of time over which they take place, which was around four weeks. Another feature concerns the nature and quality of the teaching and learning that is afforded. The contexts (including field settings) in which youth were engaged could spark their engagement and could support work with data better than some K-12 classrooms. These programs also have limitations, such as the chance that youth considered their time in them to be fun and social, rather than educational, in nature. Of course, this is not unreasonable or unexpected on the part of youth, but it may mean that the ways that youth engaged in the programs as documented in this study could be unique to outside-of-school STEM programs. In particular, engagement as reflected in the engaged and competent but not challenged profile may be unique to the experiences of youth in summer STEM programs: It may not be common in K-12 classrooms. This limitation is in addition to, and in the context of, those documented in earlier parts of this section; in particular, the limited variability at the instructional episode level may also be due to the lower stakes that learners in these contexts may perceive.

Learning environments that deliberately support work with data over an extended period may demonstrate different patterns of engagement. One key reason is the importance of work with data being part of a cycle, a cycle that often did not take place in these outside-of-school STEM programs. Nevertheless, in addition to illustrating the nature and frequency of work with data, the open-ended, qualitative coding carried out for research question #1 also provided a lens into how work with data was (or was not) sequenced. There were instances of youth activity leaders linking earlier to later activities. For instance, in the mathematics-focused programs, such as the Adventures in Mathematics program, the youth activity leaders, recognizing that youth had difficulty solving equations, used a line of duct tape and, building on an earlier activity in which youth considered what constituted a rate, asked youth to count how many “hops” it would take someone to move from one end of the line to the other. The youth activity leader then asked youth to consider how far they could move in one hop and how they could find out how many hops it would take, using a mathematical equation. In this activity, youth were supported in their attempts to approach mathematics problem-solving by linking data modeling to an earlier activity that involved generating data about the number of hops.

Other instructional episodes evidenced fewer connections between earlier and later activities, and fewer opportunities for sustained involvement in work with data. For example, during some instructional episodes, youth generated data, but they did not use the data they generated in subsequent activities. In the engineering-focused programs (Uptown Architecture, Crazy Machines, and Dorchester House, particularly), youth often generated data that resulted from their engineering designs (and communicated and interpreted their findings), but did not model this data as a regular part of their activities. In one particular example, in the Ecosphere program, youth collected water samples in the field. They then brought these samples to the classroom and tested the water, which involved youth in both collecting and, to a degree, generating data (by noting the pH levels of the water). However, later in the day, youth created a small-scale model of an ecosystem (with inclined trays of dirt, rocks, and plants), to which they added food coloring to determine the impacts of chemicals and acid rain. Youth then interpreted and discussed these findings, but did not connect the discussion to the water samples they had collected and tested earlier. While these specimens were collected to serve as data for future activity, no generating of data was observed during the episode. In other instances, youth were involved in observing phenomena but were never asked to use those data in subsequent activities. How this sequencing of work with data may impact youths’ engagement was not considered in this study, though past research suggests that this factor may make work with data more (or less) engaging and impactful to learners. As McNeill and Berland (2017) argue, it is not just about engaging in these practices by rote, but about integrating them, as they overlap and interconnect. They argue that a view of work with data focused on “making sense of” data generated from real-world phenomena, as well as sustained engagement in work with data involving the revision of earlier, intermediate ideas, are important considerations regarding the enactment of work with data.

In addition to limitations related to the focus of the programs and how work with data was enacted as part of a cycle, there were also some general measurement-related limitations. Work with data can be difficult to measure because, as the qualitative analysis revealed, there are a variety of ways in which youth can be involved in work with data. McNeill and Berland (2017) describe a similar type of disagreement across science education settings. While this is a limitation, the coding frame did represent agreement across a range of studies across STEM contexts for the aspects of work with data. In terms of the alignment of the measure with the conceptual framework for work with data, the dimensions of the STEM-PQA measure aligned closely with the aspects of work with data. However, there were some divergences that may have had an impact on some of the findings. For example, for the interpreting and communicating findings code, the STEM-PQA codes for Analyze (“Staff support youth in analyzing data to draw conclusions”) and Use symbols or models (“Staff support youth in conveying STEM concepts through symbols, models, or other nonverbal language”) were used. In the case of the latter STEM-PQA code, conveying STEM concepts through symbols, models, or other nonverbal language could have reflected instructional episodes in which youth used, for example, mathematical equations or formulas, but did not do so as part of modeling data about a phenomenon in the world: They could have simply been using an equation outside of the context of any particular phenomenon. Future research may consider the usefulness of coding for this aspect of work with data (and this aspect of science curricular standards in particular; see NGSS Lead States, 2013).

As another example of this limitation related to how work with data was measured, generating data was an aspect of work with data that the open-ended qualitative analysis revealed to be associated with less systematic groups of practices, or themes, than the other aspects. The STEM-PQA codes corresponding to this aspect of work with data were Collect data or measure (“Staff support youth in collecting data or measuring”) and Highlight precision and accuracy (“Staff highlight value of precision and accuracy in measuring, observing, recording, or calculating”). Particularly in the case of the latter code, the emphasis on precision and accuracy may have occurred outside of activities focused on recording data or creating coding frames. Considerations of precision and accuracy are key aspects of generating data, so separating the act of generating data from the considerations that are important to keep in mind while doing it, through a coding frame that is more focused on generating data, may be a promising direction for future research. While these divergences in measures were not large, they suggest that the coding frame for work with data is a limitation of the present study.

It is possible that the somewhat minimal findings are, in part, a result of the analytic approach. A similar mixed effects modeling approach has been used in only one other study (Strati et al., 2017), and that approach did not use profiles (as in this study) as the outcome. In this study, little variability at the instructional episode level was found, and so minimal relations between factors at this (instructional episode) level and the profiles of engagement were expected. Might profiles, but not the variables used to create them, be less variable at the instructional episode level? One way to consider such an alternate explanation is to use the data from this study in correlational analyses, or in other analyses that use the variables used to create the profiles of engagement but do not use the profiles themselves. An analysis in this spirit was reflected in the correlations including the aspects of work with data (presented in Table 4.2). These indicated very modest relations with engagement, suggesting that work with data and the individual variables used to create the profiles are not strongly related. Because of this, it is not surprising that the (more complex) mixed effects models used to explore the relations between work with data and engagement showed minimal relations. Related to pursuing a different approach to the data analysis, other outcomes from working with data may also show different (and more strongly positive or negative) relations. Such outcomes may be at the instructional episode level, like engagement, or may be longer-term, like youths’ future goals and plans after the conclusion of programs.
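
The question of how much variability lies at the instructional episode level can be made concrete with an intraclass correlation (ICC), the share of total variance that lies between episodes. The following is a minimal sketch and not the analysis carried out in this study: it assumes a single continuous engagement rating (rather than profile membership), uses a one-way ANOVA estimator in place of a mixed effects model, and the function name `episode_level_icc` and all ratings shown are hypothetical illustrations rather than study data.

```python
from statistics import mean


def episode_level_icc(groups):
    """Estimate ICC(1) with a one-way ANOVA estimator.

    groups: list of lists; each inner list holds the (hypothetical)
    engagement ratings collected during one instructional episode.
    Returns the estimated share of variance lying between episodes.
    """
    k = len(groups)                       # number of episodes
    n_total = sum(len(g) for g in groups)  # number of responses
    grand = sum(sum(g) for g in groups) / n_total

    # Between-episode and within-episode sums of squares
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective group size, which allows for unequal numbers of responses
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    # Negative variance estimates are truncated at zero
    var_between = max((ms_between - ms_within) / n0, 0.0)
    return var_between / (var_between + ms_within)


# Hypothetical ratings: all variance lies between the two episodes
print(episode_level_icc([[1, 1], [5, 5]]))  # 1.0
# Hypothetical ratings: episode means are identical, so no episode-level variance
print(episode_level_icc([[1, 3], [1, 3]]))  # 0.0
```

A value near zero for a given engagement variable would be consistent with the pattern described above, in which factors measured at the instructional episode level have little variance to explain.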

6.5 Implications for Practice

A few implications for practice can be drawn from this study, though these are somewhat restricted given the minimal findings. First, generating data and modeling data, in particular, may be beneficial in terms of engaging youth. Youth activity leaders (in summer STEM and other STEM enrichment contexts) and teachers (in formal learning environments) can best include the beneficial practices of generating and modeling data not in isolation, but rather through involving youth and learners in complete cycles of an investigation. This aligns with both foundational and contemporary research on work with data in education (Berland et al., 2018; McNeill & Berland, 2017; Hancock et al., 1992; Lee & Wilkerson, 2018).

Another implication concerns how work with data was enacted. As found in this study, work with data (and even specific aspects of work with data, such as asking questions) does not involve activities that are enacted in a universal way. For example, an instructor (instead of youth) interpreting and communicating findings is different from youth working to interpret findings themselves; likewise, learners asking general, conceptual questions about work with data is different from learners figuring out how to ask a question that can be answered with data. This heterogeneity suggests that those involved in planning and enacting engaging activities that involve data should consider carefully who works with data, how they do so, and how much time and sustained focus is required for such activities to be carried out. This implication aligns with recent curricular reform efforts, some of which suggest that involving learners in STEM-related practices is most effective when it involves learner-driven (but instructor-supported) iterative processes of identifying a question or problem, marshaling sources of data that can be used to figure out what is happening, and developing model-based explanations that are shared with the learning community (National Governors Association, 2013; National Research Council, 2012; NGSS Lead States, 2013). While these are just two implications, youth activity leaders, teachers, and those designing data-rich activities and evaluating the impacts of instruction based on such activities can use the findings from this study as a starting point to consider how engaging in work with data may also prepare learners to think about, understand, and take action based on data in education and in other areas of their lives.