Solutions for example tasks
This is where you'll find solutions for all of the tutorials (mostly after we have discussed them in the seminar).
Solutions for Tutorial 3
Solutions for Tutorial 4
Task 4.1
Create a data frame called data. The data frame should contain the following variables (in this order):
- a vector called food. It should contain 5 elements, namely the names of your five favourite dishes.
- a vector called description. For every dish mentioned in food, please describe the dish in a single sentence (for instance, if the first food you describe is "pizza", you could write: "This is an Italian dish, which I prefer with a lot of cheese.")
- a vector called rating. Rate every dish mentioned in food with 1-5 (using every number only once), i.e., by rating your absolute favorite dish out of all five with a 1 and your least favorite dish out of all five with a 5.
Solution:
data <- data.frame("food" = c("pizza", "pasta", "ice cream", "crisps", "passion fruit"),
                   "description" = c("Italian dish, I actually prefer mine with little cheese",
                                     "Another Italian dish",
                                     "The perfect snack in summer",
                                     "Potatoes and oil - a luxurious combination",
                                     "A fruit that makes me think about vacation"),
                   "Rating" = c(3,1,2,4,5))
data
## food description Rating
## 1 pizza Italian dish, I actually prefer mine with little cheese 3
## 2 pasta Another Italian dish 1
## 3 ice cream The perfect snack in summer 2
## 4 crisps Potatoes and oil - a luxurious combination 4
## 5 passion fruit A fruit that makes me think about vacation 5
Task 4.2
Can you sort the data in your data set by rating - with your favorite dish (i.e., the one rated "1") on top of the list and your least favourite dish (i.e., the one rated "5") on the bottom?
Important: You do not yet know this command - you'll have to google for the right solution. Please do and note down the exact search terms you used for googling, so we can discuss them next week.
Solution:
## food description Rating
## 1 pasta Another Italian dish 1
## 2 ice cream The perfect snack in summer 2
## 3 pizza Italian dish, I actually prefer mine with little cheese 3
## 4 crisps Potatoes and oil - a luxurious combination 4
## 5 passion fruit A fruit that makes me think about vacation 5
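The code behind this output is not shown above. A minimal sketch of one way to get this ordering in base R, assuming the data frame from Task 4.1 (a shortened stand-in without the description column is rebuilt here so the example runs on its own):

```r
# Shortened stand-in for the data frame from Task 4.1 (descriptions omitted)
data <- data.frame("food" = c("pizza", "pasta", "ice cream", "crisps", "passion fruit"),
                   "Rating" = c(3, 1, 2, 4, 5))

# order() returns the row positions that sort Rating in ascending order
data_sorted <- data[order(data$Rating), ]

# Reset the row names so they run from 1 to 5 again, as in the output above
rownames(data_sorted) <- NULL
data_sorted
```

With the dplyr package loaded, arrange(data, Rating) should give the same ordering.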
Solutions for Tutorial 5
Task 5.1
Read the data set into R. Writing the corresponding R code, find out
- how many observations and how many variables the data set contains.
Solution:
## [1] 85
## [1] 13
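The code behind this output is not shown. A sketch of how the two numbers could be obtained, assuming the data set has already been read into an object called data (the name used in the later solutions). The small data frame below is a hypothetical stand-in, so its counts differ from the 85 and 13 above:

```r
# Hypothetical stand-in; in the tutorial, `data` is the candy data set read from file
data <- data.frame(competitorname = c("Candy A", "Candy B", "Candy C"),
                   chocolate = c(1, 0, 1),
                   fruity = c(0, 1, 1))

# Number of observations (rows)
n_obs <- nrow(data)

# Number of variables (columns)
n_vars <- ncol(data)

n_obs
n_vars
```

dim(data) returns both numbers in a single call.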
Task 5.2
Writing the corresponding R code, find out
- how many candy bars contain chocolate.
- how many candy bars contain fruit flavor.
Solution:
##
## 0 1
## 48 37
##
## 0 1
## 47 38
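The code behind these two tables is not shown. Output of this shape is what table() prints for a 0/1 variable, so one plausible sketch is the following, again using a hypothetical stand-in for data with the variable names chocolate and fruity from Task 5.3:

```r
# Hypothetical stand-in; in the tutorial, `data` is the candy data set
data <- data.frame(competitorname = c("Candy A", "Candy B", "Candy C", "Candy D"),
                   chocolate = c(1, 0, 1, 0),
                   fruity = c(0, 1, 1, 0))

# Count how often each value (0 = no, 1 = yes) occurs
chocolate_counts <- table(data$chocolate)
fruity_counts <- table(data$fruity)

chocolate_counts
fruity_counts
```

The count under 1 is the number of candy bars with that property.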
Task 5.3
Writing the corresponding R code, find out
- the name(s) of candy bars containing both chocolate and fruit flavor.
Solution:
#requires the dplyr package, e.g., via library(dplyr)
data %>%
  #keep only candy bars containing both flavors
  filter(chocolate == 1 & fruity == 1) %>%
  #choose only the variable including the name of the candy bar
  select(competitorname)
## competitorname
## 1 Tootsie Pop
Task 5.4
Create a new data frame called data_new. Writing the corresponding R code,
- reduce the data set only to observations containing chocolate but not caramel. The data set should also only include the variables competitorname and pricepercent.
- round the variable pricepercent to two decimals.
- sort the data by pricepercent in descending order, i.e., make sure that candy bars with the highest price are on top of the data frame and those with the lowest price on the bottom.
Solution:
data_new <- data %>%
  #reduce to observations containing chocolate but not caramel
  filter(chocolate == 1 & caramel == 0) %>%
  #only include variables "competitorname" and "pricepercent"
  select(competitorname, pricepercent) %>%
  #round to two decimals
  mutate(pricepercent = as.numeric(pricepercent)) %>%
  mutate(pricepercent = round(pricepercent, 2)) %>%
  #sort by price
  arrange(desc(pricepercent))
Solutions for Tutorial 7
Task 7.1
Go to the Washington Post website. Using R code, download the content of the website.
Solution: (uploaded after the session)
#bow() comes from the polite package
library(polite)
session <- bow(url = "https://www.washingtonpost.com/",
user_agent = "Teaching project,
Valerie Hase,
Department of Media and Communication,
LMU Munich")
#Result
session
## <polite session> https://www.washingtonpost.com/
## User-agent: Teaching project,
## Valerie Hase,
## Department of Media and Communication,
## LMU Munich
## robots.txt: 113 rules are defined for 8 bots
## Crawl delay: 5 sec
## The path is scrapable for this user-agent
Task 7.2
Next, identify the headlines of all articles.
Solution: (uploaded after the session)
## [1] "In the next presidential election, some votes may matter more than others"
## [2] "U.N. warns humanitarian efforts in 'tatters'; Blinken criticizes civilian toll"
## [3] "Where should you live as you age? We asked 11 American seniors."
## [4] "Will going outside in the cold with wet hair make you sick?"
## [5] "8 mindful practices to celebrate during Hanukkah "
## [6] "A father fears he'll pass his body image issues on to his son"
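The code that produced these headlines is not shown. As a rough sketch of the general approach only, the example below extracts headline text from a small inline HTML page with the rvest package. The HTML snippet, the h2 tag, and the class name headline are assumptions made for illustration; the real Washington Post markup and the selector used in the seminar may look different.

```r
library(rvest)

# Hypothetical stand-in for a downloaded page; the real site's markup differs
page <- minimal_html('
  <h2 class="headline">First example headline</h2>
  <p>Some teaser text</p>
  <h2 class="headline">Second example headline</h2>
')

headlines <- page %>%
  # select all elements matching the (assumed) CSS selector
  html_elements("h2.headline") %>%
  # extract their text content
  html_text2()

headlines
```

In the tutorial setting, the page object would presumably come from the polite session created in Task 7.1 (for instance via scrape(session)) rather than from an inline snippet.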