# Practice 4 Bar Plots with R

## 4.1 Directions

In this practice exercise, you will load data into R and create a bar plot. Watch the VoiceThread and then work through the practice.

## 4.2 A closer look at the code

In this practice, we will be looking into a few slightly more advanced commands. We will use the mtcars data set to calculate average miles per gallon by the number of cylinders. Then we will make a bar plot of the averages.

### 4.2.1 Let’s make a simple bar plot

We are going to be working with the mtcars dataset to create a nice looking bar plot. In the code window below, I have the code necessary to make a simple bar plot. Press the Run button to see the plot.

To make this plot, we need to first create a table,

`table1 <- table(mtcars$cyl)`

then we use the `barplot()` command to draw the bar plot,

`barplot(table1)`
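To see what `barplot()` is drawing, it can help to print the table first. The following is a minimal sketch that uses only the two commands above plus `print()`; the counts come from the built-in `mtcars` data.

```r
# Count how many cars have each number of cylinders
table1 <- table(mtcars$cyl)

# Print the counts: 11 cars with 4 cylinders, 7 with 6, and 14 with 8
print(table1)

# Each bar's height is one of these counts
barplot(table1)
```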

Click the Run button below the code window to see the R output.

### 4.2.2 Let’s add a title and label the x-axis

The plot we just made is OK, but unless you know that this is the number of cars in the data set with 4, 6, and 8 cylinders, respectively, it does not tell you very much. So let’s add a title and label the x-axis.

To add a title, we add the argument `main` to the `barplot()` command. `main` holds a string that will be used as the title of the plot (note: a string is a piece of text, written inside quotation marks).

`barplot(table1,main="Car Frequency by Number of Cylinders")`

Next, let’s add a label to the x-axis. This is done with the `xlab` argument.

`barplot(table1,main="Car Frequency by Number of Cylinders",xlab="Number of Cylinders")`

Click the Run button below the code window to see the R output.

### 4.2.3 Let’s calculate average miles per gallon by the number of cylinders

A frequency plot of cars by the number of cylinders is somewhat interesting. Still, it doesn’t tell us much about the relationship between variables in the data set. Let’s see how the number of cylinders impacts miles per gallon by calculating average miles per gallon for cars with 4, 6 and 8 cylinders.

To do this, we need the `aggregate()` command. The `aggregate()` command combines observations by group using some function, such as `mean`. So we can find the average `mpg` for cars with 4, 6, and 8 cylinders by aggregating by `mtcars$cyl`.

`aggregate(x=mtcars$mpg, by=list(mtcars$cyl), FUN=mean)`

The `aggregate()` command has three arguments we need to be concerned about:

- `x`, which is the data to aggregate.
- `by`, which is the variable indicating the groups. We use the `list()` command because the `by` argument requires a list.
- `FUN`, which is the aggregation function, in this case `mean`.

Click the Run button below the code window to see the R output. Notice that the output does not look like the output of the `table()` command. For now, think of the result, which we will store in `mpg.avg`, as a new data frame.
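The sketch below stores the result of `aggregate()` in `mpg.avg` and inspects it. The name `mean.4cyl` is only a helper introduced here for illustration; it recomputes one group average by hand to show what `aggregate()` is doing for each group.

```r
# Store the result so it can be reused when plotting
mpg.avg <- aggregate(x = mtcars$mpg, by = list(mtcars$cyl), FUN = mean)

# The result is a data frame with one row per group
print(mpg.avg)
print(class(mpg.avg))   # "data.frame"
print(names(mpg.avg))   # default column names: "Group.1" and "x"

# Check one group by hand: the mean mpg of the 4-cylinder cars
mean.4cyl <- mean(mtcars$mpg[mtcars$cyl == 4])
print(all.equal(mpg.avg[1, 2], mean.4cyl))   # TRUE
```

Because the columns get the default names `Group.1` and `x`, the rest of this practice refers to them by position, as `mpg.avg[,1]` and `mpg.avg[,2]`.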

### 4.2.4 Let’s make a bar plot of average miles per gallon by the number of cylinders

Now let’s use what we have learned to make a super cool bar plot that shows how `mpg` relates to the number of cylinders in an engine.

`mpg.avg` is not a table, so we need to tell R which variable should be used to set the height of each bar and which variable should be used to label the bars. We do this by passing `barplot()` the arguments `height` and `names.arg`. The `height` of the bars should be the average `mpg`,

`height = mpg.avg[,2]`

and the names should be the number of cylinders,

`names.arg = mpg.avg[,1]`

Put it all together and we have,

`barplot(height = mpg.avg[,2], names.arg = mpg.avg[,1])`
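Since `mpg.avg[,1]` and `mpg.avg[,2]` pick columns by position, it is worth checking what each one holds before plotting. This is a minimal sketch, assuming `mpg.avg` is created with the `aggregate()` call from section 4.2.3.

```r
mpg.avg <- aggregate(x = mtcars$mpg, by = list(mtcars$cyl), FUN = mean)

# Column 1 holds the group labels (the number of cylinders)
print(mpg.avg[, 1])   # 4 6 8

# Column 2 holds the average mpg for each group
print(mpg.avg[, 2])

# Use column 2 for the bar heights and column 1 for the bar labels
barplot(height = mpg.avg[, 2], names.arg = mpg.avg[, 1])
```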

### 4.2.5 Let’s make the bar plot better

Let’s fix the y-axis. We can do this by setting the `ylim` argument. `ylim` is a vector (an ordered set of numbers), with the first number being the lower limit for the y-axis and the second number being the upper limit. The combine function, `c()`, is used to build it. Let’s set the lower and upper limits of the y-axis to 0 and 30, respectively.

`ylim = c(0,30)`
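One way to convince yourself that 30 is a sensible upper limit is to compare it with the tallest bar. The sketch below does that; `y.limits` is just an illustrative name chosen here, and `mpg.avg` is assumed to come from the `aggregate()` call in section 4.2.3.

```r
mpg.avg <- aggregate(x = mtcars$mpg, by = list(mtcars$cyl), FUN = mean)

# ylim is a numeric vector: lower limit first, upper limit second
y.limits <- c(0, 30)

# The tallest bar must stay below the upper limit
print(max(mpg.avg[, 2]) < y.limits[2])   # TRUE

barplot(height = mpg.avg[, 2], names.arg = mpg.avg[, 1], ylim = y.limits)
```

The tallest bar is a little under 27, so an upper limit of 30 leaves some room above the bars.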

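The final feature we can add to this plot is data labels. We do this with the `text()` command. The first step is to store our plot in an object. `barplot()` returns the x-coordinates of the middle of each bar, so the stored object tells `text()` where the bars are.

`mpg.avg.barplot <- barplot(height = mpg.avg[,2] ...`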

Next we need to pass the plot, the data labels, and where we want the labels to appear to the `text()` command. The three arguments are:

- `x`, in which we pass the plot object, i.e. `mpg.avg.barplot`.
- `y`, in which we pass how high on the plot we want the labels. We can use the height of each bar, i.e. `mpg.avg[,2]`, plus a little to put the labels above the bars.
- `labels`, in which we pass the label for each bar.

The labels should be the average miles per gallon by the number of cylinders, but there are two problems: these are numbers, not text, and they have a lot of digits past the decimal point. To solve this, we use `round(mpg.avg[,2],2)` to round to the hundredths place and `as.character()` to make R treat these numbers as text. Put it all together and we have

`text(x=mpg.avg.barplot, y=mpg.avg[,2]+2, labels=as.character(round(mpg.avg[,2],2)))`

WARNING! Keeping the parentheses paired correctly in a compound statement like this can be difficult.
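One way to keep the parentheses manageable is to build the labels in separate steps. The sketch below is a possible breakdown, not the only one; `rounded` and `bar.labels` are helper names introduced here for illustration, and `mpg.avg` is assumed to come from the `aggregate()` call in section 4.2.3.

```r
mpg.avg <- aggregate(x = mtcars$mpg, by = list(mtcars$cyl), FUN = mean)

# barplot() draws the plot and returns the x-coordinate of each bar's midpoint
mpg.avg.barplot <- barplot(height = mpg.avg[, 2], names.arg = mpg.avg[, 1],
                           ylim = c(0, 30))
print(mpg.avg.barplot)

# Build the labels step by step instead of in one nested call
rounded <- round(mpg.avg[, 2], 2)      # two digits past the decimal point
bar.labels <- as.character(rounded)    # numbers converted to text
print(bar.labels)                      # "26.66" "19.74" "15.1"

# Draw each label 2 units above the top of its bar
text(x = mpg.avg.barplot, y = mpg.avg[, 2] + 2, labels = bar.labels)
```

Note that `as.character()` drops trailing zeros, so the 8-cylinder label reads 15.1 rather than 15.10. If you want every label to show two decimal places, `sprintf("%.2f", mpg.avg[, 2])` is one alternative.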

Now add in all the arguments we have discussed previously, and you get a pretty cool bar chart. Click the Run button below the code window to see the R output.

## 4.3 R code used in the VoiceThread

```
# Load the data
data("mtcars")
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Bar plot in base R
barplot(table(data$cyl))

# Calculate avg mpg by number of cylinders
mpg.avg <- aggregate(x=mtcars$mpg, by=list(mtcars$cyl), FUN=mean)

# Plot the data and store the bar midpoints returned by barplot()
mpg.avg.barplot <- barplot(height = mpg.avg[,2],
                           names.arg = mpg.avg[,1],
                           main="Average MPG by Number of Cylinders",
                           xlab="Number of Cylinders",
                           ylab="Average MPG",
                           ylim = c(0,30))

# Add a data label just above each bar
text(x=mpg.avg.barplot,
     y=mpg.avg[,2]+2,
     labels=as.character(round(mpg.avg[,2],2)))
```

## 4.4 Now you try

Use R to complete the following activities (this is just for practice; you do not need to turn anything in).

- Use the mtcars dataset to calculate the average horsepower, `hp`, by number of cylinders.
- Make a bar plot of the average horsepower by the number of cylinders.
- Add data labels to each bar in the plot.