6 Conclusion and Future Work
Overall, we have found that Premier League goal scoring fits the characteristics of a Poisson process. First, a Poisson distribution can be used to predict the number of matches with each number of goals scored. Second, the time between consecutive goals in a season can be described by an exponential distribution. Third, we have evidence that goal scoring times, after being standardized, are uniformly distributed.
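As an illustration of the first result, the expected number of matches with k goals under a Poisson model is the number of matches multiplied by the Poisson probability of k. The sketch below assumes a hypothetical scoring rate of 2.7 goals per match, which is not one of our fitted values, and uses the 380 matches of a Premier League season. The function names are ours and are for illustration only.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def expected_match_counts(rate, n_matches, max_goals):
    """Expected number of matches with 0, 1, ..., max_goals goals."""
    return [n_matches * poisson_pmf(k, rate) for k in range(max_goals + 1)]


RATE = 2.7        # hypothetical goals per match, for illustration only
N_MATCHES = 380   # 20 teams, each playing 38 matches

counts = expected_match_counts(RATE, N_MATCHES, max_goals=8)
for goals, expected in enumerate(counts):
    print(f"{goals} goals: {expected:.1f} matches expected")
```

Comparing these expected counts with the observed counts is one way to judge how well the Poisson distribution describes the data.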
We also used three different sets of data from before the 2018-19 Premier League season to predict what would happen in 2018-19: data from all previous seasons, data from only the 2010s, and data from all previous seasons with more weight assigned to recent competitions. We estimated each team’s goal scoring rate at home and away from home using Poisson regression, and then performed simulations using those rate parameters. From the simulations we kept track of team metrics such as how many points each team earned and what place each team finished, and we then used those variables to analyze and compare the models fitted to the different data sets.
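The simulation step can be sketched as follows. This is a minimal sketch, not our exact procedure: the three teams and their home and away rates are hypothetical, goals for the two sides are assumed independent, and Poisson draws use Knuth's multiplication method so that only the standard library is needed.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_season(rates, rng):
    """Play every home/away pairing once and return league points per team."""
    points = {team: 0 for team in rates}
    for home in rates:
        for away in rates:
            if home == away:
                continue
            home_goals = sample_poisson(rates[home]["home"], rng)
            away_goals = sample_poisson(rates[away]["away"], rng)
            if home_goals > away_goals:
                points[home] += 3
            elif home_goals < away_goals:
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
    return points


def average_points(rates, n_sims, seed=0):
    """Average each team's points over n_sims simulated seasons."""
    rng = random.Random(seed)
    totals = {team: 0 for team in rates}
    for _ in range(n_sims):
        for team, pts in simulate_season(rates, rng).items():
            totals[team] += pts
    return {team: total / n_sims for team, total in totals.items()}


# Hypothetical goals-per-match rates, for illustration only.
RATES = {
    "Team A": {"home": 2.1, "away": 1.6},
    "Team B": {"home": 1.4, "away": 1.0},
    "Team C": {"home": 1.0, "away": 0.8},
}

print(average_points(RATES, n_sims=1000))
```

Finishing positions can be tracked the same way by sorting each simulated season's points before accumulating.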
In the future, there are some other topics we could explore, including:
Besides the number of goals scored, many other factors can be used to determine the outcomes of football matches. For our Poisson model, we used only the response variable, the number of goals scored, and fit it with the Poisson distribution. In future research, we could add other factors as predictors of goal scoring and find out whether they improve on using the number of goals alone. We could look into variables that are likely to contribute to the result of an individual Premier League game, such as clean sheets, possession time, pass accuracy, shots on target, and other match statistics. On top of that, we could compare models with different predictors and evaluate them to find out which set of variables best predicts league outcomes, and then use that set to predict match results.
In football and many other sports, a team’s performance tends to vary within a season and across seasons. Some Premier League teams tend to get hot in the early months, some clubs reach their peak during the middle of the season (commonly known as the Christmas marathon period, or the festive fixtures), and a few others are more likely to do better at the end of the season. Winning and losing streaks are also important factors in sports: some clubs are streaky, while others tend to be more consistent. Thus, in future research, we could use a team’s results from earlier games within the season, and perhaps find a way to emphasize winning and losing streaks, to predict the outcomes of later matches. As a follow-up, we could investigate model performance throughout the season, since some models may predict more accurately than others at certain times of the year.
In addition to predicting match results, another popular application of statistical modeling in sports analytics is determining betting odds. We could use the Poisson probabilities from our model to calculate the odds of possible game outcomes for different team matchups. We could also look into and compare different types of bets, such as over/under, moneyline, and point spread wagers. Accordingly, we could determine whether it is a good idea to bet on a match, and if so, how much profit we could make.
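A minimal sketch of how Poisson probabilities could be turned into odds is given below. It assumes that home and away goals are independent Poisson variables, truncates the score grid at a hypothetical cap of 15 goals per team, and uses invented rates for a single matchup. The odds computed are "fair" decimal odds (the reciprocal of the probability) and ignore any bookmaker margin.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def outcome_probabilities(home_rate, away_rate, max_goals=15):
    """Return (home win, draw, away win) probabilities for one match."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    total = home_win + draw + away_win  # renormalize after truncation
    return home_win / total, draw / total, away_win / total


def prob_over(line, home_rate, away_rate):
    """Probability that total goals exceed a line such as 2.5.

    Uses the fact that a sum of independent Poisson variables is Poisson.
    """
    total_rate = home_rate + away_rate
    at_most = sum(poisson_pmf(k, total_rate) for k in range(math.floor(line) + 1))
    return 1.0 - at_most


def fair_decimal_odds(probability):
    """Decimal odds with no bookmaker margin."""
    return 1.0 / probability


# Hypothetical rates for one matchup, for illustration only.
home, tie, away = outcome_probabilities(home_rate=1.6, away_rate=1.1)
for label, p in [("home win", home), ("draw", tie), ("away win", away)]:
    print(f"{label}: p = {p:.3f}, fair odds = {fair_decimal_odds(p):.2f}")
print(f"over 2.5 goals: p = {prob_over(2.5, 1.6, 1.1):.3f}")
```

Comparing these fair odds with the odds a bookmaker offers would indicate whether the model considers a bet favorable.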