18.5 Chapter 8: Matrices and Dataframes
The following table shows the results of a survey of 10 pirates. In addition to some basic demographic information, the survey asked each pirate “What is your favorite superhero?” and “How many tattoos do you have?”
Name | Sex | Age | Superhero | Tattoos |
---|---|---|---|---|
Astrid | F | 30 | Batman | 11 |
Lea | F | 25 | Superman | 15 |
Sarina | F | 25 | Batman | 12 |
Remon | M | 29 | Spiderman | 5 |
Letizia | F | 22 | Batman | 65 |
Babice | F | 22 | Antman | 3 |
Jonas | M | 35 | Batman | 9 |
Wendy | F | 19 | Superman | 13 |
Niveditha | F | 32 | Maggott | 900 |
Gioia | F | 21 | Superman | 0 |
- Combine the data into a single dataframe. Complete all the following exercises from the dataframe!
piratesurvey <- data.frame(
  name = c("Astrid", "Lea", "Sarina", "Remon", "Letizia", "Babice", "Jonas", "Wendy", "Niveditha", "Gioia"),
  sex = c("F", "F", "F", "M", "F", "F", "M", "F", "F", "F"),
  age = c(30, 25, 25, 29, 22, 22, 35, 19, 32, 21),
  superhero = c("Batman", "Superman", "Batman", "Spiderman", "Batman",
                "Antman", "Batman", "Superman", "Maggott", "Superman"),
  tattoos = c(11, 15, 12, 5, 65, 3, 9, 13, 900, 0),
  stringsAsFactors = FALSE
)
- What is the median age of the 10 pirates?
median(piratesurvey$age)
## [1] 25
- What was the mean age of female and male pirates separately?
mean(piratesurvey$age[piratesurvey$sex == "F"])
## [1] 24.5
mean(piratesurvey$age[piratesurvey$sex == "M"])
## [1] 32
## OR
with(piratesurvey,
     mean(age[sex == "F"]))
## [1] 24.5
with(piratesurvey,
     mean(age[sex == "M"]))
## [1] 32
## OR
mean(subset(piratesurvey,
            subset = sex == "F")$age)
## [1] 24.5
mean(subset(piratesurvey,
            subset = sex == "M")$age)
## [1] 32
- What was the largest number of tattoos owned by a male pirate?
with(piratesurvey,
     max(tattoos[sex == "M"]))
## [1] 9
## OR
max(subset(piratesurvey,
           subset = sex == "M")$tattoos)
## [1] 9
- What percent of pirates under the age of 25 were female?
with(piratesurvey,
     mean(sex[age < 25] == "F"))
## [1] 1
- What percent of female pirates are under the age of 25?
with(piratesurvey,
     mean(age[sex == "F"] < 25))
## [1] 0.5
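The two answers above are proportions (1 and 0.5), because the mean of a logical vector is the share of TRUE values. As a minimal sketch, multiplying by 100 turns them into percentages. The dataframe ages and the two result names are hypothetical, introduced only so that the snippet runs on its own.

```r
# `ages` is a hypothetical stand-in for the sex and age columns of piratesurvey.
ages <- data.frame(
  sex = c("F", "F", "F", "M", "F", "F", "M", "F", "F", "F"),
  age = c(30, 25, 25, 29, 22, 22, 35, 19, 32, 21),
  stringsAsFactors = FALSE
)

# Percent of pirates under 25 who are female
pct_female_given_under25 <- 100 * mean(ages$sex[ages$age < 25] == "F")
pct_female_given_under25

# Percent of female pirates who are under 25
pct_under25_given_female <- 100 * mean(ages$age[ages$sex == "F"] < 25)
pct_under25_given_female
```

The two questions condition on different groups, which is why the answers differ: all four pirates under 25 are female (100 percent), but only four of the eight female pirates are under 25 (50 percent).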
- Add a new column to the dataframe called tattoos.per.year, which shows how many tattoos each pirate has for each year in their life.
piratesurvey$tattoos.per.year <- with(piratesurvey, tattoos / age)
- Which pirate had the most tattoos per year?
piratesurvey$name[piratesurvey$tattoos.per.year == max(piratesurvey$tattoos.per.year)]
## [1] "Niveditha"
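An alternative sketch uses which.max(), which returns the position of the first maximum. The dataframe tats and the variable top.pirate are hypothetical names, used here so that the snippet runs on its own.

```r
# `tats` is a hypothetical stand-in for the name, age, and tattoos columns
# of piratesurvey.
tats <- data.frame(
  name = c("Astrid", "Lea", "Sarina", "Remon", "Letizia",
           "Babice", "Jonas", "Wendy", "Niveditha", "Gioia"),
  age = c(30, 25, 25, 29, 22, 22, 35, 19, 32, 21),
  tattoos = c(11, 15, 12, 5, 65, 3, 9, 13, 900, 0),
  stringsAsFactors = FALSE
)
tats$tattoos.per.year <- tats$tattoos / tats$age

# which.max() gives the index of the first largest value
top.pirate <- tats$name[which.max(tats$tattoos.per.year)]
top.pirate
```

If several pirates tied for the maximum, which.max() would report only the first of them, whereas the comparison against max() above would return every tied name. With this data there is no tie, so both approaches give "Niveditha".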
- What are the names of the female pirates whose favorite superhero is Superman?
piratesurvey$name[with(piratesurvey, sex == "F" & superhero == "Superman")]
## [1] "Lea" "Wendy" "Gioia"
- What was the median number of tattoos of pirates over the age of 20 whose favorite superhero is Spiderman?
with(piratesurvey, median(tattoos[age > 20 & superhero == "Spiderman"]))
## [1] 5