6 Base R Practice Problems
These 10 problems are practice exercises to strengthen your understanding of base R. You don’t need to submit anything for them; just work through each one and run the code yourself.
- Calculate the sum of squared deviations of the observations in the vector `a` from their mean.
# deviations from the mean
# a is a vector and mean(a) is a single number
# notice that R uses the recycling rule to match the different lengths
a-mean(a)
## [1] -1 0 1
## [1] 1 0 1
## [1] 2
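The full calculation can be sketched as below. The definition of `a` is not shown in this section, so `a <- c(1, 2, 3)` is an assumption, chosen only because it reproduces the deviations printed above. The names `deviations` and `sum_sq` are illustrative.

```r
# assumption: a is not defined in the text; c(1, 2, 3) reproduces the printed output
a <- c(1, 2, 3)

# mean(a) has length 1, so R recycles it across every element of a
deviations <- a - mean(a)

# square each deviation, then add the squares up
sum_sq <- sum(deviations^2)
sum_sq
```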
- Without using R, calculate the variance of `a`.
## [1] 1
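For reference, the hand calculation can be written out as follows. It assumes the usual sample variance with divisor n - 1 (the definition used by R's `var()`), and takes n = 3 and the sum of squared deviations of 2 from the previous problem:

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^2 = \frac{2}{3-1} = 1
$$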
- Without using R, what would be the fourth element of `v1+v2`?
# 9 since R recycled the shorter vector v2
# This is what happened: c(4,5,6,7) + c(10,2,10,2)
v1 <- c(4,5,6,7)
v2 <- c(10,2)
v1 + v2
## [1] 14 7 16 9
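To make the recycling step visible, here is a small sketch that builds the recycled vector explicitly with `rep_len()` and checks that it gives the same sum. The names `v2_recycled` and `result` are illustrative and not part of the original problem.

```r
v1 <- c(4, 5, 6, 7)
v2 <- c(10, 2)

# repeat v2 until it is as long as v1: 10 2 10 2
v2_recycled <- rep_len(v2, length(v1))

# element-wise addition of two vectors of equal length
result <- v1 + v2_recycled

# TRUE: this is what R does implicitly for v1 + v2
identical(result, v1 + v2)

# fourth element: 7 + 2
result[4]
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.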
- Without using R, what would be the result?
# sum() requires a numeric vector
# for that, TRUE is converted into 1, and FALSE is converted into 0
# essentially, this expression counts the number of TRUEs
sum(c(TRUE, TRUE, FALSE, TRUE, FALSE))
## [1] 3
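The coercion described in the comments can be inspected directly. This is a minimal sketch; the variable names `x` and `n_true` are illustrative.

```r
x <- c(TRUE, TRUE, FALSE, TRUE, FALSE)

# explicit coercion: TRUE becomes 1 and FALSE becomes 0
as.integer(x)

# sum() applies the same coercion, so it counts the TRUEs
n_true <- sum(x)
n_true

# mean() gives the proportion of TRUEs: 3 out of 5
mean(x)
```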
- In the `mtcars` data, what is the mean (rounded to 2 decimal places) of `mpg` for cars with 6 cylinders?
## [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
## [16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
## [31] 15.0 21.4
## [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
## [1] TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE TRUE TRUE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
## [1] 21.0 21.0 21.4 18.1 19.2 17.8 19.7
## [1] 19.74286
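The code that produced the output above is not shown in this section. One way to reproduce the steps is sketched below, using logical subsetting on the built-in `mtcars` data. The intermediate names `is_six`, `mpg_six`, and `mean_six` are illustrative.

```r
# logical vector: TRUE for cars with 6 cylinders
is_six <- mtcars$cyl == 6

# keep only the mpg values where is_six is TRUE
mpg_six <- mtcars$mpg[is_six]

# mean of those values, rounded to 2 decimal places
mean_six <- round(mean(mpg_six), 2)
mean_six
```

Rounding the unrounded mean of 19.74286 shown above gives 19.74.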
- Given 1,000,000 random numbers from a standard normal distribution (a normal distribution with mean = 0 and sd = 1), what would be the proportion of random numbers greater than 0?
# the standard normal distribution is symmetric around 0, so expect a proportion close to 0.5
set.seed(777)
a <- rnorm(1000000) # equivalent to rnorm(1000000, 0, 1)
length(a[a>0])/length(a)
## [1] 0.500623
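Because logical values are coerced to 0 and 1, the same proportion can also be computed with `mean()` on the comparison itself. The sketch below checks that both approaches agree; the names `prop_by_length` and `prop_by_mean` are illustrative.

```r
set.seed(777)
a <- rnorm(1000000)

# approach used above: count the selected elements and divide by the total
prop_by_length <- length(a[a > 0]) / length(a)

# alternative: the mean of a logical vector is the proportion of TRUEs
prop_by_mean <- mean(a > 0)

# TRUE: both give the same proportion
all.equal(prop_by_length, prop_by_mean)
```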
- Given 1,000,000 random numbers from a standard normal distribution (a normal distribution with mean = 0 and sd = 1), what would be the proportion of random numbers greater than 1.96?
# have you seen 1.96 before?
set.seed(777)
a <- rnorm(1000000) # equivalent to rnorm(1000000, 0, 1)
length(a[a>1.96])/length(a)
## [1] 0.025198
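The simulated proportion can be compared with the theoretical upper-tail probability from `pnorm()`. This sketch also shows why 1.96 is familiar: it is approximately the 97.5th percentile of the standard normal distribution, which leaves about 5% in the two tails combined. The names `upper_tail` and `q975` are illustrative.

```r
# theoretical P(Z > 1.96) for a standard normal Z: about 0.025
upper_tail <- pnorm(1.96, lower.tail = FALSE)
upper_tail

# both tails together: about 0.05, the usual 5% significance level
2 * upper_tail

# the 97.5th percentile of the standard normal: about 1.96
q975 <- qnorm(0.975)
q975
```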
- Without using R, what would be the first element of the following expression?
# %in% operator
# v1 %in% v2 returns a logical vector indicating
# whether the elements of v1 are included in v2.
c(1,2,3) %in% c(2,3,4,5,6)
## [1] FALSE TRUE TRUE
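A short sketch of how the result lines up with the left-hand vector, and how `%in%` can be used for subsetting. The names `x`, `y`, `x_in_y`, and `y_in_x` are illustrative.

```r
x <- c(1, 2, 3)
y <- c(2, 3, 4, 5, 6)

# one logical value per element of x (the left-hand side)
x_in_y <- x %in% y

# first element is FALSE because 1 does not appear in y
x_in_y[1]

# swapping the operands gives one value per element of y instead
y_in_x <- y %in% x
length(y_in_x)

# use the logical vector to keep only the elements of x found in y
x[x_in_y]
```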
- Without using R, which level would be displayed first when displaying the levels of the following factor?
# "high" will be displayed first because levels are sorted alphabetically by default
# notice that this order does not represent any intrinsic order of the categories
# it only affects how the levels are displayed and sorted
# use ordered() to create an ordered factor
a <- factor(c("high", "high", "medium", "low"))
a
## [1] high high medium low
## Levels: high low medium
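To contrast the default alphabetical levels with an intrinsic ordering, the sketch below creates an ordered factor from the same values. The ordering low < medium < high and the name `b` are assumptions made for illustration.

```r
a <- factor(c("high", "high", "medium", "low"))

# default levels are sorted alphabetically: "high" "low" "medium"
levels(a)

# assumption: the intended order is low < medium < high
b <- factor(c("high", "high", "medium", "low"),
            levels = c("low", "medium", "high"),
            ordered = TRUE)

# levels now follow the order given in the levels argument
levels(b)

# ordered factors support comparisons: is "medium" below "high"?
b[3] < b[1]
```

Calling `ordered()` with the same `levels` argument should give an equivalent result to `factor(..., ordered = TRUE)`.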
- The following code will produce an error (TRUE or FALSE)?