Chapter 4 Graphing / Formatting Graphs

4.1 Box Plot

library(ggplot2)

ggplot(dataset, aes(x = Variable1, y = Variable2)) +
  geom_boxplot(outlier.colour = "blue", outlier.shape = 8, outlier.size = 2) +
  labs(title = "Title", x = "X Label", y = "Y Label")
  • Replace dataset with the name of your data set.
  • Replace Variable1 with the column name of your x variable (the grouping variable).
  • Replace Variable2 with the column name of your y variable.
  • Replace the title and axis labels inside labs( ) with your own!
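
As a filled-in sketch of the template above, here is the same code using the built-in mtcars data set. The choice of data set and columns, the label text, and the name p_box are only for illustration. Because cyl is stored as a number, it is wrapped in factor( ) so that each cylinder count gets its own box.

```r
library(ggplot2)

# mtcars ships with R; cyl is numeric, so factor() turns it into groups
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(outlier.colour = "blue", outlier.shape = 8, outlier.size = 2) +
  labs(title = "Fuel Economy by Cylinder Count",
       x = "Cylinders", y = "Miles per Gallon")

p_box  # printing the object draws the plot
```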

4.2 Histogram

library(ggplot2)

ggplot(data, aes(x = Variable1)) +
  geom_histogram()
  • Replace data with the name of your data set.
  • Replace Variable1 with the column name of your x variable (it should be numeric).
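
A minimal filled-in version, assuming the built-in mtcars data set and its mpg column (both chosen only for illustration, as is the name p_hist). The bins argument is optional; it is set here just to show where you would control the number of bars.

```r
library(ggplot2)

# Histogram of a numeric column; bins sets how many bars are drawn
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 8) +
  labs(title = "Distribution of Fuel Economy", x = "Miles per Gallon", y = "Count")

p_hist
```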

4.3 Bar Plot

library(ggplot2)

ggplot(data, aes(x = Variable1)) +
  geom_bar()
  • Replace data with the name of your data set.
  • Replace Variable1 with the column name of your x variable (it should be categorical).
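
A short sketch of the bar plot template, again assuming the built-in mtcars data set (the column and the name p_bar are illustrative choices). geom_bar( ) counts the rows in each category, so only an x variable is needed; gear is numeric, so it is wrapped in factor( ) to be treated as categories.

```r
library(ggplot2)

# One bar per gear count; bar height is the number of rows in that group
p_bar <- ggplot(mtcars, aes(x = factor(gear))) +
  geom_bar() +
  labs(title = "Cars by Number of Gears", x = "Gears", y = "Count")

p_bar
```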

4.4 Scatter Plot

library(ggplot2)

ggplot(data, aes(x = Variable1, y = Variable2)) +
  geom_point()
  • Replace data with the name of your data set.
  • Replace Variable1 with the column name of your x variable.
  • Replace Variable2 with the column name of your y variable.
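
For a concrete version of the scatter plot template, this sketch uses the built-in cars data set, which has the two numeric columns speed and dist. The labels and the name p_scatter are assumptions made for the example.

```r
library(ggplot2)

# Each row of cars becomes one point: speed on x, stopping distance on y
p_scatter <- ggplot(cars, aes(x = speed, y = dist)) +
  geom_point() +
  labs(title = "Stopping Distance vs. Speed", x = "Speed", y = "Distance")

p_scatter
```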

4.5 Stacking Graphs

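To show several graphs in one figure, save each graph as an object and pass the objects to gridExtra::grid.arrange( ).

  • Example:
    • g1 <- ggplot(cars, aes(x = speed)) + geom_histogram(bins = 10)
    • g2 <- ggplot(cars, aes(x = dist)) + geom_histogram(bins = 10)
    • gridExtra::grid.arrange(g1, g2, ncol = 2)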

Change ncol to the number of columns you want in your output.

  • If you want two graphs side by side, use ncol = 2.

  • If you want them one on top of the other, use ncol = 1.
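
The two layouts can be compared in one runnable sketch. It assumes the gridExtra package is installed; the object names side_by_side and stacked are made up for the example. grid.arrange( ) draws the figure and also returns the layout, which is why the result can be saved.

```r
library(ggplot2)

# Two histograms from the built-in cars data set
g1 <- ggplot(cars, aes(x = speed)) + geom_histogram(bins = 10)
g2 <- ggplot(cars, aes(x = dist)) + geom_histogram(bins = 10)

# ncol = 2 puts the graphs side by side
side_by_side <- gridExtra::grid.arrange(g1, g2, ncol = 2)

# ncol = 1 puts one graph on top of the other
stacked <- gridExtra::grid.arrange(g1, g2, ncol = 1)
```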

4.6 Shrinking Graphs

Sometimes, plots take up a lot of space on a page. To change that, you can shrink graphs.

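To do this, add fig.asp to the header of your code chunk: {r, fig.asp = 0.6}

fig.asp is the ratio of the graph's height to its width, so smaller values give shorter graphs. You can change the 0.6 to any number between 0.1 and 0.9.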

Make sure the graph can still be read after shrinking!!

4.7 Formatting a Table

Often in R you end up with loose output, such as MSE values, AIC values, or predictions. Instead of printing them out with no indication of what they correspond to, it is better to put them in a table.

The table( ) command in R works well for this, but its output is not formatted nicely. To fix that, use

  • knitr::kable( )

Inside the ( ) put your table, for instance

  • knitr::kable( table(cars$speed) )
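
kable( ) also accepts a data frame, which is one way to label loose values such as AIC. The sketch below is only an illustration: the two models, their names, and the column headings are assumptions, not part of the original example.

```r
library(knitr)

# Two illustrative models fit to the built-in cars data set
fit_linear <- lm(dist ~ speed, data = cars)
fit_quadratic <- lm(dist ~ speed + I(speed^2), data = cars)

# Collect the loose AIC values in a data frame so each one is labelled
aic_table <- data.frame(
  Model = c("Linear", "Quadratic"),
  AIC = c(AIC(fit_linear), AIC(fit_quadratic))
)

# digits rounds the numbers; caption adds a title to the table
kable(aic_table, digits = 2, caption = "AIC by model")
```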

4.8 Formatting the Output from a Regression Model

The default output we get from using the summary command in R is useful, but not very pretty. To format the output, you can use the following (just replace model1 with the name of your model).

  • Example: knitr::kable( summary(model1)$coefficients )
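
To see the whole workflow in one place, here is a small sketch that fits a model first and then formats its coefficient table. The model formula and the cars data set are assumptions chosen for illustration; digits = 3 is optional and only rounds the displayed values.

```r
# Fit a simple linear regression on the built-in cars data set
model1 <- lm(dist ~ speed, data = cars)

# The coefficient table is a matrix: one row per term,
# with columns for estimate, standard error, t value, and p-value
coef_table <- summary(model1)$coefficients

knitr::kable(coef_table, digits = 3)
```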