Chapter 4 Reading Datasets

Hello!

This tutorial is for those who want to learn how to read .csv files into R. The csv is probably the most common file type you’ll use.

csv stands for “comma-separated values.” A csv file is an uber-simple spreadsheet (Microsoft Word is to Notepad as Excel is to csv). A lot of the data you will use comes as a csv or as a file you will want to convert into a csv.
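To make this concrete, here is a minimal sketch that writes a tiny, made-up csv to a temporary file and prints its raw text. The file contents and the names toy_path, raw_lines, and toy_data are invented for illustration and are not part of the tutorial data.

```r
# Write a tiny, made-up csv to a temporary file (toy data, not the tutorial file)
toy_path <- tempfile(fileext = ".csv")
writeLines(c("page_name,likes",
             "Page A,10",
             "Page B,25"),
           toy_path)

# A csv is just plain text: one row per line, with commas between columns
raw_lines <- readLines(toy_path)
print(raw_lines)

# read.csv() turns that text into a data frame with rows and columns
toy_data <- read.csv(toy_path)
print(toy_data)
```

The first line of the file becomes the column names, and every later line becomes one row.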

To read a csv file into R, you will need (1) a csv file and (2) some knowledge about where your csv file is relative to your code (the simplest setup keeps them in the same folder). In this tutorial, we’ll work with some sample (“toy”) data: a sample of Facebook posts about “Barbenheimer” from 2023, exported from CrowdTangle.

First, let’s figure out what folder we are in! As we discussed in an earlier tutorial, you can use the getwd() function to check what directory (computer folder) you are in and setwd() to change your directory.

getwd()
## [1] "C:/Users/comm-lukito/Documents/J381M/2023_J381M_bookdown"

You want to make sure that R can find the file you are trying to bring in. File paths are read relative to your working directory, which is normally the folder that holds the .rmd file. You can either move the csv file to the appropriate folder, or change your working directory to the folder that contains the .csv file. In this example, the file (crowdtangle_barbenheimer_2023.csv) is in a folder called “data” inside the working directory, so the path we give to R starts with “data/”.
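Before reading a file, it can help to confirm that R can actually see it. The sketch below builds a throwaway folder with a “data” subfolder so that it runs anywhere; the folder name toy_project and the file toy.csv are made up for illustration. With the tutorial file, you would check "data/crowdtangle_barbenheimer_2023.csv" instead.

```r
# Build a throwaway project folder with a "data" subfolder (illustrative only)
project_dir <- file.path(tempdir(), "toy_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("id,text", "1,hello"), file.path(project_dir, "data", "toy.csv"))

# Temporarily make it the working directory; setwd() returns the previous one
old_wd <- setwd(project_dir)

# Relative paths are resolved from the working directory
print(list.files("data"))                  # which files are in the data folder?
file_found <- file.exists("data/toy.csv")  # TRUE if R can see the file
print(file_found)

# Go back to where we started
setwd(old_wd)
```

If file.exists() returns FALSE, the usual culprits are a typo in the file name or a working directory that is not the one you expected.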

Now that you’ve made sure R can find your .csv file, we can use the read.csv() function! We want to read the csv AND assign it to an object so that it is saved in our environment. We’ll call the object “social_media_file”.

social_media_file <- read.csv("data/crowdtangle_barbenheimer_2023.csv")

You should now see an object called social_media_file in your global environment. The csv has 16,794 observations (rows) and 41 variables (columns).
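To check the size and structure of a data frame yourself, a few base R functions are handy. This is a minimal sketch on a small made-up data frame (toy_data stands in for social_media_file, and its columns are invented); the same calls should work on the real object.

```r
# A small made-up data frame standing in for social_media_file
toy_data <- data.frame(
  Message = c("first post", "second post", "third post"),
  Likes = c(10, 0, 25),
  stringsAsFactors = FALSE
)

print(dim(toy_data))    # number of rows, then number of columns
print(nrow(toy_data))   # rows (observations)
print(ncol(toy_data))   # columns (variables)
print(names(toy_data))  # column (variable) names
str(toy_data)           # compact overview of each column's type and first values
```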

You can use head() and tail() to look at the top n rows and bottom n rows.
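As a quick sketch of how the n argument works, here is a made-up data frame with 10 rows (toy_data is not the tutorial's data). Without n, head() and tail() return 6 rows.

```r
# Ten rows of toy data so the effect of n is easy to see
toy_data <- data.frame(id = 1:10, letter = letters[1:10],
                       stringsAsFactors = FALSE)

first_rows  <- head(toy_data)         # default: first 6 rows
first_three <- head(toy_data, n = 3)  # first 3 rows
last_three  <- tail(toy_data, n = 3)  # last 3 rows

print(first_three)
print(last_three)
```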

#head(social_media_file) #I "commented" these two lines so they would not run. To run them yourself, just remove the # at the front of the line
#tail(social_media_file)

tail(social_media_file, n = 3) #in this line, I am only looking at the last 3 rows
##                 ï..Page.Name    User.Name  Facebook.Id
## 16792 WKNO - 91.1 FM Memphis       WKNOFM 1.000463e+14
## 16793            KUNC 91.5fm      KUNC915 1.000641e+14
## 16794          Tagg Magazine taggmagazine 1.000635e+14
##                       Page.Category Page.Admin.Top.Country
## 16792 BROADCASTING_MEDIA_PRODUCTION                     US
## 16793            MEDIA_NEWS_COMPANY                     US
## 16794               TOPIC_PUBLISHER                     US
##                                                                                              Page.Description
## 16792 WKNO-FM is the Mid-South's source of NPR News, local information and classical music.\n\nWKNO-FM | WKNO
## 16793   KUNC brings news and storytelling to Colorado’s Front Range, mountain and Eastern Plains communities.
## 16794                                   For everyone lesbian, queer, and under the rainbow. Tagg...you're it!
##              Page.Created Likes.at.Posting Followers.at.Posting
## 16792 2008-07-29 18:46:35             5845                 6071
## 16793 2008-11-25 16:59:09             9762                10131
## 16794 2012-08-09 01:47:22             9410                 9751
##                  Post.Created Post.Created.Date Post.Created.Time Type
## 16792 2023-07-21 12:45:00 CDT        2023-07-21          12:45:00 Link
## 16793 2023-07-19 19:27:03 CDT        2023-07-19          19:27:03 Link
## 16794 2023-07-18 11:05:04 CDT        2023-07-18          11:05:04 Link
##       Total.Interactions Likes Comments Shares Love Wow Haha Sad Angry Care
## 16792                  0     0        0      0    0   0    0   0     0    0
## 16793                  0     0        0      0    0   0    0   0     0    0
## 16794                  0     0        0      0    0   0    0   0     0    0
##       Video.Share.Status Is.Video.Owner. Post.Views Total.Views
## 16792                                  -          0           0
## 16793                                  -          0           0
## 16794                                  -          0           0
##       Total.Views.For.All.Crossposts Video.Length
## 16792                              0          N/A
## 16793                              0          N/A
## 16794                              0          N/A
##                                                                  URL
## 16792 https://www.facebook.com/100046315040252/posts/874742670746226
## 16793 https://www.facebook.com/100064101066456/posts/683186593828037
## 16794 https://www.facebook.com/100063465255924/posts/759796136145888
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Message
## 16792                                                                                                                                                                                                                                                                                                                                                      After months of inescapable marketing, viral memes and crossover merch, two of the year's most anticipated movies hit theaters on Friday. Here's why so many people want to see both â\200” and how to prep.
## 16793 It started out as a meme mashing together two highly anticipated films being released on the same day, July 21: Greta Gerwig's "Barbie" and Christopher Nolan's "Oppenheimer". But the "Barbenheimer" is also a double-feature juggernaut that many people plan to see, even if it seems like an odd pairing. That's music to the ears of some theaters, like The Lyric in Fort Collins, which expects presales of "Barbie" alone to set an all-time record for the indie theater. KUNC's Natalie Skowlund and Alex Hager look at the Barbenheimer phenomenon.
## 16794                                                                                                                                                                                                                                                                                                                                                                                                                                          Barbenheimer: The Great Bisexual Reckoning A good problem to have and why you should be excited about it! Read more. 
##                                                                                              Link
## 16792 https://www.wknofm.org/2023-07-21/what-to-know-about-the-barbenheimer-double-feature-frenzy
## 16793                                                                     https://buff.ly/3Q1BQG0
## 16794                                                                      https://shar.es/afVNm1
##                                                                                                                    Final.Link
## 16792                                                                                                                        
## 16793 https://www.kunc.org/arts-life/2023-07-19/local-theaters-anticipate-post-covid-salvation-with-barbenheimer-dual-release
## 16794                                                                                  https://taggmagazine.com/barbenheimer/
##       Image.Text
## 16792           
## 16793           
## 16794           
##                                                                             Link.Text
## 16792                     What to know about the 'Barbenheimer' double feature frenzy
## 16793 Local theaters anticipate post-COVID salvation with 'Barbenheimer' dual release
## 16794                                      Barbenheimer: The Great Bisexual Reckoning
##                                                                                                                                                                                                                                                                Description
## 16792                                                              After months of inescapable marketing, viral memes and crossover merch, two of the year's most anticipated movies hit theaters on Friday. Here's why so many people want to see both — and how to prep.
## 16793 A dual release of both Greta Gerwig’s Barbie and Christopher Nolan’s Oppenheimer on Friday have local theaters in Fort Collins and beyond hoping to bring moviegoers back into their seats. Will ‘Barbenheimer,’ as the double feature has been termed, be the cure?
## 16794                                                                                                                                                                                                                            Barbie/Oppenheimer is giving us bi panic.
##       Sponsor.Id Sponsor.Name Sponsor.Category
## 16792         NA                              
## 16793         NA                              
## 16794         NA                              
##       Total.Interactions..weighted..â....Likes.1x.Shares.1x.Comments.1x.Love.1x.Wow.1x.Haha.1x.Sad.1x.Angry.1x.Care.1x..
## 16792                                                                                                                  0
## 16793                                                                                                                  0
## 16794                                                                                                                  0
##       Overperforming.Score
## 16792                  -28
## 16793                  -36
## 16794                  -38
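Notice the ï.. in front of Page.Name in the first column name above. That prefix is typically a sign that the file starts with a byte-order mark (BOM), which read.csv() treated as part of the first column name. A minimal sketch of one common workaround, using a tiny made-up file (toy_path and its contents are invented; whether the tutorial file needs this depends on how it was exported and on your R version and operating system):

```r
# Create a tiny UTF-8 csv that starts with a byte-order mark (toy data only)
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
csv_text <- "Page Name,Likes\nToy Page,10\n"
toy_path <- tempfile(fileext = ".csv")
writeBin(c(bom, charToRaw(csv_text)), toy_path)

# Telling read.csv() about the BOM keeps it out of the first column name
toy_data <- read.csv(toy_path, fileEncoding = "UTF-8-BOM")
print(names(toy_data))
```

Declaring the file's encoding this way can also help when accented letters, curly quotes, or emoji show up as garbled symbols, although results vary by platform.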

With so many variables, this can be a little hard to read! But you can also use View() to look at the dataset. Either type View(social_media_file) into the console or click the object name in the Global Environment.

View(social_media_file)

4.1 Variables

If you want to look at a variable, you can use the $ operator. For example, social_media_file$Message refers to the text variable (a variable is also known as a data “vector” or “field”). Let’s look at the first 6 values of this column.

head(social_media_file$Message)
## [1] "Check out our Emo spotify playlist! http://bit.ly/EmoNeverDies:=:https://open.spotify.com/playlist/3czha2tGePoOcVI4xwa3j3"                                                                                                                   
## [2] "¡QUIERE SER KEN! 🩷 Cillian Murphy (Oppenheimer) comentó que esta abierto a interpretar a Ken en una futura secuela de 'Barbie' 🩷"                                                                                                    
## [3] "The true Barbenheimer experience"                                                                                                                                                                                                            
## [4] "¡BARBENHEIMER ES REAL! En un cine de Estados Unidos durante una función de 'Oppenheimer' hubo un fallo en el proyector y en los últimos 20 minutos la mitad de la pantalla se volvió rosa 😆 El Barbie x Oppenheimer se volvió canon."
## [5] "Barbenheimer สมัยตอนอยู่ Gotham 😅"
## [6] "Barbenheimer 2023. 🤝🏼 🎨: @justralphy"

How would you look at the first 10 rows in the text variable?
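One way to approach this, sketched on a small made-up data frame (toy_data and its columns are invented for illustration): pull the column out with $ and pass n to head().

```r
# Twelve rows of toy data standing in for social_media_file
toy_data <- data.frame(
  Message = paste("post number", 1:12),
  Likes = 12:1,
  stringsAsFactors = FALSE
)

messages  <- toy_data$Message         # pull one column out as a vector
first_ten <- head(messages, n = 10)   # keep only the first 10 values
print(first_ten)
```

With the tutorial data, the same pattern would be head(social_media_file$Message, n = 10).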