Part C

In this part, use the same five countries as in Part B and analyse the number of new COVID-19 cases per month. This time you will need to write some of the code yourself.

C1 Table

The code chunk below has been prepared for you. It should produce a table of new COVID-19 cases per million people per month for your five countries.


\newpage

# --- PART C ---

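```{r}
# Assumes `data`, Country1 to Country5, and the packages dplyr, tidyr,
# knitr and kableExtra were set up earlier in this document.

# Keep only the rows for the five chosen countries.
data <- data[data$location %in% c(Country1, Country2, Country3,
                                  Country4, Country5), ]

# Build a numeric year.month key, e.g. March 2020 becomes 2020.03.
data$date1 <- as.numeric(format(as.Date(data$date), "%m"))  # month number
data$date2 <- as.numeric(format(as.Date(data$date), "%Y"))  # year
data$date3 <- data$date2 + data$date1 * 0.01

country <- data$location
month   <- format(data$date3, digits = 2, nsmall = 2)  # text label such as "2020.03"
cases   <- round(data$new_cases_per_million, 0)

df <- data.frame(country, month, cases)

# Sum the values within each country and month.
df <- df %>%
  group_by(country, month) %>%
  summarise(cases = sum(cases))

# Reshape to wide format: one row per country, one column per month.
df <- spread(df, month, cases)

# Sort the countries in reverse alphabetical order.
df <- df[order(df$country, decreasing = TRUE), ]

kable(df,
      format = "latex",
      caption = "New Cases per million per Month",
      align = rep('r', 5),
      booktabs = TRUE) %>%
  kable_styling(latex_options =
                  c("striped", "hold_position", "scale_down"))
```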
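
To see how the chunk above labels each month, here is a minimal base-R sketch of the year.month key applied to two made-up dates. The names toy_dates, toy_m, toy_y, toy_key and toy_label are illustrative only and are not part of the assignment code.

```r
# Two made-up dates (illustrative only, not taken from the data set).
toy_dates <- as.Date(c("2020-03-15", "2021-10-01"))

toy_m <- as.numeric(format(toy_dates, "%m"))   # month number: 3, 10
toy_y <- as.numeric(format(toy_dates, "%Y"))   # year: 2020, 2021

# Year plus month/100 gives a key that increases with time.
toy_key <- toy_y + toy_m * 0.01

# nsmall = 2 keeps the trailing zero, so October shows as ".10" rather than ".1".
toy_label <- format(toy_key, digits = 2, nsmall = 2)
toy_label   # "2020.03" "2021.10"
```

Because every label has the same width, sorting the labels as text also sorts them by date.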

C2 Heat Map

In this task you will create a heat map, that is, a color-coded representation of the table from question C1. Knit your file and count the number of months in the table. The steps below assume that there are 31 months; if your table has a different number, adjust the steps accordingly. Then insert a new code chunk and follow the five steps below.

First, use the rep() function to create a vector that repeats the sequence 1 to 31 five times (1, 2, ..., 31, 1, 2, ..., 31, and so on), and save it as a variable called Month.

Second, use the rep() function to create a vector in which each of your five country names is repeated 31 times in a row, and save it as a variable called Country.

Third, use the c() function to create five separate vectors, one per country, each of length 31 and containing that country's number of new COVID-19 cases per month. Simply read the values from the table in question C1 and type them in by hand. Name each vector after its country. Then combine the five vectors, again using c(), into a variable called Cases, listing the countries in the same order as in Country.

Fourth, use the data.frame() function to combine your three vectors (Month, Country, Cases) into a data frame and name your data frame df. Your data frame will then have 3 columns and 5 × 31 = 155 rows.

Tip: Use exact spelling for your variable names. The names are case sensitive.

You can also find a simplified example of how to create the data frame in this video. Apply the same idea to your five countries instead of two.
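
As a rough illustration of steps one to four, the sketch below builds the same kind of data frame for two hypothetical countries and three months. The names CountryA and CountryB, the toy_ prefix and all numbers are invented for this example. Your own chunk should use Month, Country, Cases and df, with your five countries and the values from your table.

```r
# Toy example: two hypothetical countries and three months (not real data).
n_months <- 3

# times repeats the whole sequence: 1 2 3 1 2 3
toy_month <- rep(1:n_months, times = 2)

# each repeats every element in place: A A A B B B
toy_country <- rep(c("CountryA", "CountryB"), each = n_months)

# One hand-typed vector per country; NA marks a month without a value.
CountryA <- c(5, 40, 12)
CountryB <- c(NA, 7, 90)

# Combine the vectors in the same order as the names in toy_country.
toy_cases <- c(CountryA, CountryB)

toy_df <- data.frame(Month = toy_month,
                     Country = toy_country,
                     Cases = toy_cases)

nrow(toy_df)   # 2 countries x 3 months = 6 rows
```

The detail to watch is the ordering: times repeats the whole sequence while each repeats every element in place, so row i of all three columns refers to the same country and month.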

Fifth, use the code below to create a heat map of the spread of COVID-19.

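```{r}
# Assumes ggplot2 and RColorBrewer (for brewer.pal) are loaded, along with
# a package that provides %>%, such as dplyr.
df %>% ggplot(aes(Month, Country, fill = Cases)) +
  geom_tile(color = "grey50") +             # one tile per country and month
  scale_x_continuous(expand = c(0, 0)) +    # no padding beside the tiles
  scale_fill_gradientn(colors = brewer.pal(9, "Reds"),
                       trans = "sqrt") +    # square-root color scale
  theme_minimal() +
  theme(panel.grid = element_blank(),
        legend.position = "bottom",
        text = element_text(size = 8)) +
  ggtitle("COVID-19") +
  ylab("") + xlab("")
```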
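
The argument trans = "sqrt" in the heat map code puts the fill colors on a square-root scale. The base-R sketch below uses invented numbers (toy_cases is not real data) to show roughly why this helps when a few months are far larger than the rest.

```r
# Invented monthly values: three modest months and one extreme month.
toy_cases <- c(0, 100, 400, 10000)

# Position of each value along the color range, from 0 (lightest) to 1 (darkest).
linear_pos <- toy_cases / max(toy_cases)
sqrt_pos   <- sqrt(toy_cases) / sqrt(max(toy_cases))

linear_pos   # 0.00 0.01 0.04 1.00
sqrt_pos     # 0.0 0.1 0.2 1.0
```

On the linear scale the two middle values are almost indistinguishable from zero, while on the square-root scale they fall on visibly different shades.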

Comment briefly on the spread of COVID-19.

C3 Mean and Standard Deviation

Use the mean() and sd() functions to find the mean and standard deviation of the number of new COVID-19 cases per month for Sweden and for one other country, using the vectors you created in C2. Comment briefly on the results.

Tip: If either vector contains NA values, you can exclude them from the calculation by adding the argument na.rm = TRUE to the function call. For example: mean(Sweden, na.rm=TRUE)
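
A short sketch of how NA values affect these functions, using an invented vector (toy_cases is hypothetical and does not come from the table):

```r
# Invented monthly values with one missing month.
toy_cases <- c(10, NA, 30, 20)

mean(toy_cases)                 # NA, because one value is missing
mean(toy_cases, na.rm = TRUE)   # 20
sd(toy_cases, na.rm = TRUE)     # 10
```

With na.rm = TRUE, the mean and standard deviation are computed from the non-missing months only, which is worth keeping in mind when comparing two countries with different numbers of missing months.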