2.3 Monte Carlo Simulation
Monte Carlo simulation is one method that statisticians use to understand real-world phenomena. In Monte Carlo simulation, a model is used to generate multiple (sometimes millions) of data sets. By examining the data sets produced (or summaries of the data sets produced), researchers can draw insight about and predict what might happen in the real-world under a given set of circumstances. You can read about the fascinating origins of Monte Carlo simulation in the following article:
2.3.1 Example of a Monte Carlo Simulation Study
In 1978, China introduced the “one-child” policy in order to alleviate social, economic, and environmental problems in China. According to Wikipedia,8
The policy officially restricts the number of children married urban couples can have to one, although it allows exemptions for several cases, including rural couples, ethnic minorities, and parents without any siblings themselves. A spokesperson of the Committee on the One-Child Policy has said that approximately 35.9% of China’s population is currently subject to the one-child restriction.
Although the Chinese government has suggested that the policy has prevented more than 250 million births from its implementation to 2000, the policy is controversial both within and outside of China because of the manner in which the policy has been implemented. There have also been concerns raised about potential negative economic and social consequences, in part because many families were determined to have a son. Scholars have wondered how things would change if instead of a one-child policy, a country adopted a “one son” policy. A “one son” policy would allow families to keep having children until they had a son. If a family’s first child is a boy, they would be restricted from having more children. If, however, the first child was a daughter, the family could continue having children until a son was born. For example, they might ask the question,
If China adopted a “one son” policy, how would the policy affect the average number of children per family, which is currently 1.6?9
One way in which this question could be studied (without actually implementing the policy) would be to conduct a simulation study by modeling this situation and generating many data sets from the model. Consider for a minute how you might model the number of children a particular family would have.
One way to model this is to write the word boy on one index card and the word girl on another index card and to place those two index cards in a hat. After mixing up the index cards, you could draw a single card from the hat. If the card has the word boy written on it, the simulated “family” would be reported to have one child. If the card has the word girl written on it, a tally mark could be recorded and the index card would be replaced in the hat. The cards could then be remixed and another card would be drawn. If the second card drawn has the word boy written on it, the simulated “family” would be reported to have two children. If the second card has the word girl written on it, another tally mark could be recorded and the index card would again be replaced in the hat. This process would continue until the boy card was drawn. The table below shows the results after carrying out this process for three simulated families.
We could carry out this simulation for many families, say 500 families, and use the results to provide an answer to the research question. You can imagine that carrying out even this simple simulation would quickly become quite tedious. Simulation studies, such as this, are typically carried out using computer programs. In this unit, you will learn to use a computer program called TinkerPlots™ to model processes in the real-world and carry out simulation studies.
2.3.2 Monte Carlo Simulation Assumptions
“Wait,” you say. “Even if I carried out this simulation, I still would not be able to provide an answer to the research question! It doesn’t reflect reality! Some families may not want to have any children, while others might be happy to stop after a girl was born. What about multiple births?”
Maybe you are even questioning whether the probability of having a boy or having a girl is really 50:50. Moreover, biological sex is not binary, so it is inappropriate to have only two categories. These are all valid points, and all would likely affect the results of the simulation, which in turn affects the inferences and conclusions that are drawn.
While the model used in the “one son” example is overly simplistic for drawing any sorts of meaningful conclusions about implementing such a policy in China, it could, however, provide a useful starting point for introducing additional complexity. Even in the most enormously complicated modeling problem, researchers often make many simplifying assumptions. (Remember that all models—even those that seem quite complex—are simplifications of reality and get many things wrong.) With enough simplification, a model can be constructed and studied. The model is evaluated and often revised or updated as certain assumptions are deemed tenable and others are not. Because of this process, simulation studies are generally iterative in their development. This iteration process continues until an adequate level of understanding is developed and the research question can be answered.
2.3.3 Monte Carlo Simulation in Practice
In practice, statisticians often use incredibly complex models to generate their data. As an example, Electronic Arts, the video game company behind titles such as Madden, NHL and FIFA, uses game telemetry (the transmission of data from a game executable for recording and analysis) to model the gameplay patterns of players and identify the elements of their games that are highly correlated with player retention.10
By understanding the behavior of players and the common patterns that are used, Electronic Arts game developers can focus their attention on more relevant features in future iterations of the game and ultimately reduce production costs. For example, in their examination of Madden NFL 11, Electronic Arts used 46 features to model players’ preferences, including their control usage, performance, and play-calling style. This is but one example of using simulation in video games.