Data Analytics Module
Lecturer: Hans van der Zwan
Research wk 02 (continued from wk 01)
Topic: summary statistics/ graphical data analysis/ visualization

Research assignment 2

In the first assignment graphs are based on the official published covid-19 figures - cases and deaths - in The Netherlands and in Belgium. The difficulty in comparing these numbers is that the way covid-19 cases and deaths are counted differs between the two countries.
To compare the effect of the pandemic, probably the most objective metric is excess mortality.
Excess mortality is defined as the difference between the expected mortality and the real mortality in the chosen time period. The expected mortality is based on historical mortality figures.

Historical mortality numbers per week for 25 countries can be found on this website. The file stmf.xlsx can be doewnloaded here. Use this file for Assignment 1-2.

Assignment 1-2

Choose two countries.
Prepare a presentation in which you:

  • present and comment on the difference between reported number of covid-19 deaths in the period week 10 till week 34 in 2020
  • present and comment on the difference between excess mortality between the two countries in the period week 10 till week 34 in 2020

Elucidation
Expected weekly mortality, the expected number of deaths in a given calenderweek, plays an important role in comparing the number of covid-19 deaths in two countries. There is not one unique definition for expected mortality. Usually it is based on the historical number of deaths in the last 5 or 10 years.
A good choice is to use the median of the mortality rates from 2009 till 2019 as the expected mortality rate in 2020.

In the presented data in the mortality.xlsx file, mortality rate is defined as yearly mortality rate. The mortality rate for a certain week is calculated as 52 times the number of deaths in a certain week, divided by the population size at January 1 of the specific year.

In this analysis the two countries are compared; the reported numbers of deaths caused by covid-19 are compared as are the excess mortality numbers in the two countries.

The analysis has to be performed for the total population; analysis per gender or age group is not asked for.

  1. Visualize the weekly mortality rates for these two countries from 2009 till 2020; in case the numbers for the chosen countries are not available from 2009 on, choose the years which are available.
  2. Create a table with columns: COUNTRY, WEEK, DEATHS, EXPECTED-DEATHS, EXCESS_DEATHS, COVID_DEATHS, with: COUNTRY: the name of the country
    WEEK: the week number in 2020; values 10, 11, …., 34 DEATHS: the absolute number of deaths EXPECTED_DEATHS: the expected absolute number of deaths EXCESS_DEATHS: the difference between DEATHS and EXPECTED_DEATHS COVID_DEATHS: the reported number of deaths caused by covid-19
  3. Create a graph with the weeks mapped on the X-axes and EXCESS_DEATHS and CIVID_DEATHS on the Y-axes.