3.6 Troubleshooting Error Messages

In this section, we will go over how to troubleshoot errors, the most time-consuming part of learning and coding in R. Some of the examples contain functions that we will review in detail later in this guide; for now, we do not need to know what these functions do. Follow along with the examples by executing the code in each one.

3.6.1 Common Mistakes

Most likely, your problem was caused by a very simple typo or misunderstanding (read: user error ☹).

Here are some of the most common mistakes that even seasoned R users make:

1. Capitalization

You typed an uppercase letter when you should have typed a lowercase letter (or vice versa). Remember that capitalization matters! Library() and library() are not the same: Library() will result in an error.
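As a small illustration (a sketch, not one of the guide’s original examples), the code below calls the capitalized Library() and captures the resulting error with tryCatch(), so the message can be inspected without stopping the script. The stats package is used only because it ships with R.

```r
# R is case-sensitive: Library() is not defined, library() is.
msg <- tryCatch(
  Library(stats),                          # wrong: capital L
  error = function(e) conditionMessage(e)  # keep the error text instead of stopping
)
print(msg)

library(stats)                             # right: lowercase l, runs silently
```

The captured message should report that R could not find a function named Library.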

2. Mis-spelling

Unlike Microsoft Office, R will not highlight misspelled words. Many functions and arguments will, however, accept both British and American spellings (e.g., color/colour, summarize/summarise, gray/grey).
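A brief sketch of both points, using only functions that ship with R. The misspelled name maen is made up for illustration, and gray()/grey() are chosen here simply as one base R pair that accepts both spellings.

```r
# A misspelled function name is simply an unknown function to R.
typo <- tryCatch(
  maen(c(1, 2, 3)),                        # "mean" misspelled
  error = function(e) conditionMessage(e)
)
print(typo)

# Both spellings of this base R helper exist and return the same value.
same.shade <- identical(gray(0.5), grey(0.5))
print(same.shade)
```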

3. Closing Punctuation

You forgot a closing parenthesis, bracket, or quotation mark. All too often, I have forgotten to add the final parenthesis at the end of a line. You’ll know that you’ve done this if you see a red X on the left side of the script. The red X can appear while you are still typing, so wait until you’ve finished the line before assessing these warnings. Be wary of copying and pasting from other applications (your internet browser, Microsoft Office, etc.). Occasionally, a copied quotation mark (for example, a curly “smart quote”) will not be read properly by R.
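One way to see these problems outside the editor is to ask R whether a piece of text is complete, valid code. The helper below, parses.ok(), is a hypothetical name introduced only for this sketch; it wraps base R’s parse() and reports whether parsing succeeded.

```r
# Hypothetical helper: TRUE if `code` is complete, valid R syntax, FALSE otherwise.
parses.ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

parses.ok("mean(c(1, 2, 3))")            # balanced parentheses
parses.ok("mean(c(1, 2, 3)")             # missing closing parenthesis
parses.ok("x <- \u201chello\u201d")      # curly quotes pasted from another application
```

The first call should return TRUE and the other two FALSE, since R only recognizes the straight quotation marks " and ' as string delimiters.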

4. Continuing Punctuation

You forgot to add a comma (,) or pipe (%>%) after your phrase before moving on to the next indented line of continued code. Indenting your code provides better readability for you (and others), but it can be easy to miss a comma or pipe:

Missing Punctuation

diamonds %>% 
  mutate(price200 = price - 200 # missing comma
         price300 = price - 300) # missing pipe
  select(cut, price) %>%
  arrange(price)

Corrected Punctuation

diamonds %>% 
  mutate(price200 = price - 200, # added comma
         price300 = price - 300) %>% # added pipe
  select(cut, price) %>%
  arrange(price)

5. Conflicting Code

Perhaps you accidentally redefined an object when you didn’t mean to. Try clearing the environment (click on the broom icon in the Environment pane) and running a few lines of code at a time to see where the problem is.

Figure 3.27: Clearing the environment history.
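The sketch below shows how an accidental redefinition can cause trouble later on, and how to remove objects from code rather than with the broom icon. The object name exam.scores is made up for illustration.

```r
exam.scores <- c(90, 85, 77)
exam.scores <- "oops"                  # accidental redefinition: now text, not numbers

# mean() can no longer average the values; it warns and returns NA.
avg <- suppressWarnings(mean(exam.scores))
print(avg)

rm(exam.scores)                        # remove just this object
exists("exam.scores")                  # FALSE once the object is gone
# rm(list = ls())                      # removes every object in the environment
```

The last line is left commented out so that running the example does not wipe objects you still need.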

6. Libraries Are Not Loaded

It is important to make sure that all of the pertinent libraries have been loaded during your R session. Each time you exit RStudio, the libraries are unloaded. They are also unloaded whenever the R session restarts (for example, after a crash), which can make it look as though R unloaded them spontaneously. If an error message reads that a function could not be found, the usual solution is to reload your libraries.
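To illustrate with a package that ships with R, the sketch below uses file_ext() from the tools package, which is installed with R but not loaded by default. The file name data.csv is just a placeholder string; no file is read.

```r
# file_ext() lives in the tools package, which is not attached by default.
before <- tryCatch(
  file_ext("data.csv"),
  error = function(e) conditionMessage(e)
)
print(before)                          # an error message in a fresh session

library(tools)                         # load the package
after <- file_ext("data.csv")
print(after)

"package:tools" %in% search()          # TRUE while the package is loaded
```

search() lists everything currently loaded, so it is a quick way to check whether a library() call is still in effect.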

7. The Unsaved Object

As a refresher, the <- symbol means “is defined as”. The syntax is typically structured as follows:

name.of.object <- "the definition of the object goes here"

It is often wise to “test” the definition before officially defining it as an object. That is, you should know what the definition is before you define it as an object. What if you accidentally defined an object poorly? If you later use this ill-defined object, additional errors are sure to follow. Let’s see an example:

Let’s say I want to define a new object called one.to.five that is defined as a sequence of numbers (in R terms, a vector): 1, 2, 3, 4, and 5. Before officially defining this object, I’m going to type out the definition and execute it (in the script).

Here, we see that I’ve concatenated (hence the c()) the numbers 1 through 5. I’ve typed this in the script.

c(1,2,3,4,5)

In the console, I see my executed code (in purple) and the output (in white) that the code produces (Figure 3.28):

Figure 3.28: Concatenating 1,2,3,4,5.

Notice that the environment has not added a new object – it’s still empty! (Figure 3.29)

Figure 3.29: Empty environment.

Now, if I deem that this definition for my new object is correct, I will go ahead and name this as an official object. Notice that what I’ve done is provide the object name, the definition symbol (<-), and then the definition; see Figure ??:

one.to.five <- c(1,2,3,4,5)

Unlike executing the definition alone, executing code that defines an object will not display the object’s definition (i.e., there’s no output that follows the executed code); see Figure 3.30:

Figure 3.30: Saving an object.

In order to see the definition/contents of the object, you must execute the object name by either typing out the name again and executing this code, or by highlighting the name in the script and executing the code (Figure ??):
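The whole sequence from this example can be sketched as follows; the length() check at the end is an extra step added here for illustration.

```r
c(1, 2, 3, 4, 5)                       # runs and prints, but nothing is saved

one.to.five <- c(1, 2, 3, 4, 5)        # saves the object; prints nothing
one.to.five                            # prints the saved contents
length(one.to.five)                    # quick check that all five values are there
```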

3.6.2 One Line at a Time

A good rule of thumb is to run a few lines at a time to identify the problematic code. The example below utilizes coding techniques that will be introduced much later in the guide. Here, it is important that you grasp the concept of how to run one line at a time, not so much the code content.

If executing the entire code below produces an error…

diamonds %>% 
  mutate(price200 = price - 200, 
         price20perc = price * .80) %>%
  group_by(cut, color) %>% 
  summarize(total = count(), 
            m1 = mean(price), 
            m2 = mean(price200), 
            m3 = mean(price20perc)) %>% 
  ungroup()

I will highlight and execute one portion of the code at a time to detect the location of the error:

  1. Execute
diamonds
  2. Execute
diamonds %>% 
  mutate(price200 = price - 200, 
         price20perc = price * .80)
  3. Execute
diamonds %>% 
  mutate(price200 = price - 200, 
         price20perc = price * .80) %>% 
  group_by(cut, color)
  4. Execute
diamonds %>% 
  mutate(price200 = price - 200, 
         price20perc = price * .80) %>% 
  group_by(cut, color) %>% 
  summarize(total = count(), 
            m1 = mean(price),
            m2 = mean(price200), 
            m3 = mean(price20perc))
  5. Finally, execute the entire code again if no error has occurred

Typically, code in the lower lines relies on code from the upper lines. That is, you cannot highlight lines 5-8 in the example without also highlighting lines 1-4, because lines 5-8 use the output of lines 1-4.

Notice that I only highlight up to, but not including, the next pipe (%>%) (Figure ??). Recall that the pipe symbol represents a continuation. If you conclude on a pipe, R will think that you are not finished typing. The pipe can be conceptualized as the phrase “and then…”:

“I am going to select diamonds as my dataset and then I’m going to mutate it and then I’m going to group the data by cut and color and then I’m going to summarize a few things and then I’m going to ungroup the data.”

In the above code, you’ll find that the error occurs when the summarize() function is executed (lines 1-8), but not before. I now know that the problem lies somewhere in the summarize() call (lines 5-8) and can troubleshoot further. The same concept applies to plus signs (+): do not end on a plus sign!

If you have accidentally executed code concluding on a pipe or a plus sign, the console will show a + prompt while it waits for the rest of the code. Simply click in the console region (bottom left panel in RStudio) and press the Esc key. R will reset your console, indicated by a new, empty line (>), and you can try executing the code again.
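As for the summarize() error located above, one likely fix is sketched here. It rests on an assumption about the intent, namely that total was meant to be the number of rows in each group. In dplyr, count() expects a data frame as its first argument, whereas n() returns the size of the current group inside summarize(). The sketch also assumes that the dplyr and ggplot2 packages are installed; ggplot2 supplies the diamonds dataset, and the object name result is made up for illustration.

```r
library(dplyr)
library(ggplot2)   # provides the diamonds dataset

result <- diamonds %>%
  mutate(price200 = price - 200,
         price20perc = price * .80) %>%
  group_by(cut, color) %>%
  summarize(total = n(),               # n() replaces count()
            m1 = mean(price),
            m2 = mean(price200),
            m3 = mean(price20perc)) %>%
  ungroup()

result
```

Because every row belongs to exactly one cut/color group, the total column should add up to the number of rows in diamonds, which is a handy sanity check.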

3.6.3 Problems with Package Loading

Sometimes, for inexplicable reasons, a package will uninstall spontaneously. Perhaps it’s because the package needs to be updated (newer versions have come out). Perhaps the R goblin stole it. I don’t have all the answers. R will sometimes produce helpful error messages to let you know that your packages are the problem:

Error: package or namespace load failed for ‘PACKAGE.NAME.HERE’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 there is no package called ‘PACKAGE.NAME.HERE’
In addition: Warning message:
package ‘PACKAGE.NAME.HERE’ was built under R version 3.5.3

Notice how R told us that PACKAGE.NAME.HERE was the specific package that had a problem? (Obviously, PACKAGE.NAME.HERE is not a real R package; it’s just a placeholder name for this example.)

The solution may be to:

  1. Try loading the package with library() again. If the error message again states that the package doesn’t exist, manually install the package with install.packages("PACKAGE.NAME.HERE"), replacing PACKAGE.NAME.HERE with the actual package name. Then, load the package with library() again.

  2. Reinstall/update your base R.

    • There are a few ways to do this, but the most straightforward way to update your R version is through the CRAN website (see page 6 of this guide). Make sure that all of your R-related windows are closed as you are downloading the new R version.

    • After installing the new version, RStudio should indicate in the console that the version has changed. You can also execute getRversion() to check. After installing a new R version, you will typically need to reinstall your packages with install.packages().
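The checks described in this list can also be run from code. The sketch below is one possible approach rather than the only one; is.installed() is a hypothetical helper name introduced here, built on base R’s requireNamespace().

```r
# Hypothetical helper: TRUE if a package is installed, FALSE otherwise.
is.installed <- function(pkg) {
  isTRUE(requireNamespace(pkg, quietly = TRUE))
}

is.installed("stats")                  # ships with R
is.installed("PACKAGE.NAME.HERE")      # placeholder name, not a real package

# Install only when missing, then load.
# (Left commented out because installing needs internet access.)
# if (!is.installed("dplyr")) install.packages("dplyr")
# library(dplyr)

getRversion()                          # the R version currently running
getRversion() >= "3.5.3"               # compare against the version named in a warning
```

Comparing getRversion() with the version mentioned in a “was built under R version” warning shows whether your base R is older than the one the package was built with.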

3.6.4 Using the Internet to Your Advantage

My best advice for when you encounter error messages is to Google them. If searching does not turn up an answer, I recommend posting the question on Stack Overflow (an online forum where the community posts and answers coding problems).

Everybody – beginner and expert alike – will Google how to code something at some point. The real skill that experts have over beginners is that experts know how to strategically Google. For that, you want to master these skills:

  • Make sure that your question is specific and reproducible.

    • I recommend using a built-in dataset to illustrate your problem.

    • Don’t forget to include “R” as a keyword during the search.

  • Make sure you understand the terminology well.

    • Because this book is designed with the tidyverse package in mind, I recommend adding phrases such as “with dplyr” or “with tidyverse” in your search.

    • dplyr is a package inside the tidyverse that provides “tidyverse-style” coding.

For example, let’s say that I want to know how to rename a column in my dataset. I could Google: “How to rename a column in R with dplyr/tidyverse” and read the answers posted on Stack Overflow (www.stackoverflow.com).

Notice how I covered the following in my Google search:

  • The specific action (how to rename a column)

  • The programming language (R statistics)

  • The specific style/technique for coding (dplyr or tidyverse package)

When reading the answers to these questions posted on Stack Overflow, consider the parent question that the original poster asked. A good Stack Overflow question contains a reproducible example that you can follow along with directly to see if the answers listed make sense. Not all answers will be useful and not all users will utilize “tidyverse-style” coding. If you cannot find the answer to your question, you may choose to post it yourself on Stack Overflow. Just be sure to follow the guidelines mentioned above for what to include in your question!
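To make the advice concrete, here is a sketch of what a small reproducible example might look like for the column-renaming question. It assumes dplyr is installed; the object names cars.small and renamed and the new column name are made up for illustration.

```r
library(dplyr)

# Built-in dataset, trimmed to a few rows and columns so the example stays small.
cars.small <- head(mtcars[, c("mpg", "cyl")], 3)

# rename() takes new_name = old_name.
renamed <- cars.small %>%
  rename(miles.per.gallon = mpg)

names(renamed)

# Including the output of sessionInfo() tells readers your R and package versions.
```

Because mtcars ships with R, anyone can paste this code into their own session and get the same result, which is what makes the example reproducible.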

3.6.5 When Google Fails You

Sometimes you just can’t find an answer.

In rare (but real) circumstances, I have found that RStudio can run into some issues that have nothing to do with your code. If your code is running strangely, the age-old trick of turning it off and on again (i.e., RStudio or your computer) occasionally does work. Make sure you save the information you need to save before you close RStudio or restart your computer. Upon reopening RStudio, make sure the Global Environment is clean before attempting to execute code.

Another option is to take a break. During that period, little elves come and fix your computer for you. If you don’t leave, they’ll never show up. In all seriousness, error messages can be anger-inducing and in your cloud of frustration, you may not be able to think clearly. Step away for a bit and look at it from a different emotional state.

Finally, ask someone for help. An experienced R user should be able to figure out what went wrong pretty quickly. This is because experts have made (and hopefully fixed) 100 times more errors than beginners; many people struggle with the same problems at the beginning. Problem solving is (in my opinion) the best way to learn. However, it’s also the most time-consuming. Ask for help when you need it. Time is an invaluable, nonrenewable resource.