Chapter 5 Reading Datasets

Hello!

This tutorial shows you how to read .csv files into R. The csv is probably the file type you’ll use most often.

csv stands for “comma-separated values.” It’s an uber-simple spreadsheet format (Microsoft Word is to Notepad as Excel is to csv). A lot of the data you will use either comes as a csv or is a file you will want to convert into a csv.

To read a csv file into R, you will need (1) a csv file and (2) some knowledge about where your csv file and code are (ideally in the same folder, or with the csv in a subfolder of it). In this tutorial, we’ll work with some sample (“toy”) data: a sample of Facebook posts about “Barbenheimer” from July 2023, exported from CrowdTangle.

First, let’s figure out what folder we are in! As we discussed in an earlier tutorial, you can use the getwd() function to check what directory (computer folder) you are in and setwd() to change it.

getwd()
## [1] "C:/Users/comm-lukito/Documents/J381M/j381m_bookdown"

You want to make sure that the file you are trying to bring into R is in the same folder as the .rmd file, or in a subfolder of it. You can either move the csv file to the appropriate folder, or change your working directory to the folder where the .csv file is. In this example, the file (crowdtangle_barbenheimer_2023.csv) is in a subfolder called “data”, so the path we give to R starts with data/.
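If you are not sure whether R can see your file, you can ask it directly. Here is a minimal sketch that builds a throwaway folder so it can run anywhere; the folder name my_project and the file example.csv are made up for illustration and are not part of this tutorial's data.

```r
# Hypothetical setup: a temporary project folder with a "data" subfolder
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(id = 1:3),
          file.path(project_dir, "data", "example.csv"),
          row.names = FALSE)

old_dir <- setwd(project_dir)  # setwd() returns the previous directory, so we can go back later
getwd()                        # confirm where we are now
list.files("data")             # list what is inside the data subfolder
csv_found <- file.exists("data/example.csv")  # TRUE if R can find the file from here
csv_found

setwd(old_dir)                 # return to the original working directory
```

With your own project, you would skip the setup lines and just run list.files() and file.exists() on your own path.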

Now that you’ve made sure R can reach your .csv file from the working directory, you can use the read.csv() function! We want to read the csv AND assign it to an object so that it is saved in our environment. We’ll call the object “social_media_file”.

social_media_file <- read.csv("data/crowdtangle_barbenheimer_2023.csv")

You should now see an object called social_media_file in your global environment. The csv has 16,794 observations (rows) and 41 variables (columns), which matches the row numbers and column names in the output below.
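If you'd rather check the size of a dataset in code than in the environment pane, dim(), nrow(), ncol(), and names() are one way to do it. Here is a minimal sketch using a tiny made-up csv (toy_path and toy_file are hypothetical names, not part of the tutorial's data).

```r
# Write a small made-up csv to a temporary file
toy_path <- tempfile(fileext = ".csv")
writeLines(c("Page Name,Likes,Message",
             "Page A,10,Hello world",
             "Page B,3,Second post"),
           toy_path)

# Read it in and assign it to an object
toy_file <- read.csv(toy_path)

dim(toy_file)    # number of rows, then number of columns
nrow(toy_file)   # rows (observations) only
ncol(toy_file)   # columns (variables) only
names(toy_file)  # the column names as R stored them
```

Notice that read.csv() turns the space in “Page Name” into a dot by default, which is why the real dataset has column names like Page.Name.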

You can use head() and tail() to look at the top n rows and bottom n rows (both show 6 rows by default).

#head(social_media_file) #I "commented" these two lines so they would not run. To run them yourself, just remove the # at the front of the line
#tail(social_media_file)

tail(social_media_file, n = 3) #in this line, I am only looking at the last 3 rows
##                X...Page.Name    User.Name     Facebook.Id                 Page.Category Page.Admin.Top.Country
## 16792 WKNO - 91.1 FM Memphis       WKNOFM 100046315040252 BROADCASTING_MEDIA_PRODUCTION                     US
## 16793            KUNC 91.5fm      KUNC915 100064101066456            MEDIA_NEWS_COMPANY                     US
## 16794          Tagg Magazine taggmagazine 100063465255924               TOPIC_PUBLISHER                     US
##                                                                                              Page.Description        Page.Created Likes.at.Posting
## 16792 WKNO-FM is the Mid-South's source of NPR News, local information and classical music.\n\nWKNO-FM | WKNO 2008-07-29 18:46:35             5845
## 16793 KUNC brings news and storytelling to Colorado\342\200\231s Front Range, mountain and Eastern Plains communities. 2008-11-25 16:59:09             9762
## 16794                                   For everyone lesbian, queer, and under the rainbow. Tagg...you're it! 2012-08-09 01:47:22             9410
##       Followers.at.Posting            Post.Created Post.Created.Date Post.Created.Time Type Total.Interactions Likes Comments Shares Love Wow Haha Sad Angry Care
## 16792                 6071 2023-07-21 12:45:00 CDT        2023-07-21          12:45:00 Link                  0     0        0      0    0   0    0   0     0    0
## 16793                10131 2023-07-19 19:27:03 CDT        2023-07-19          19:27:03 Link                  0     0        0      0    0   0    0   0     0    0
## 16794                 9751 2023-07-18 11:05:04 CDT        2023-07-18          11:05:04 Link                  0     0        0      0    0   0    0   0     0    0
##       Video.Share.Status Is.Video.Owner. Post.Views Total.Views Total.Views.For.All.Crossposts Video.Length
## 16792                                  -          0           0                              0          N/A
## 16793                                  -          0           0                              0          N/A
## 16794                                  -          0           0                              0          N/A
##                                                                  URL
## 16792 https://www.facebook.com/100046315040252/posts/874742670746226
## 16793 https://www.facebook.com/100064101066456/posts/683186593828037
## 16794 https://www.facebook.com/100063465255924/posts/759796136145888
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Message
## 16792                                                                                                                                                                                                                                                                                                                                                      After months of inescapable marketing, viral memes and crossover merch, two of the year's most anticipated movies hit theaters on Friday. Here's why so many people want to see both \342\200\224 and how to prep.
## 16793 It started out as a meme mashing together two highly anticipated films being released on the same day, July 21: Greta Gerwig's "Barbie" and Christopher Nolan's "Oppenheimer". But the "Barbenheimer" is also a double-feature juggernaut that many people plan to see, even if it seems like an odd pairing. That's music to the ears of some theaters, like The Lyric in Fort Collins, which expects presales of "Barbie" alone to set an all-time record for the indie theater. KUNC's Natalie Skowlund and Alex Hager look at the Barbenheimer phenomenon.
## 16794                                                                                                                                                                                                                                                                                                                                                                                                                                          Barbenheimer: The Great Bisexual Reckoning A good problem to have and why you should be excited about it! Read more. 
##                                                                                              Link
## 16792 https://www.wknofm.org/2023-07-21/what-to-know-about-the-barbenheimer-double-feature-frenzy
## 16793                                                                     https://buff.ly/3Q1BQG0
## 16794                                                                      https://shar.es/afVNm1
##                                                                                                                    Final.Link Image.Text
## 16792                                                                                                                                   
## 16793 https://www.kunc.org/arts-life/2023-07-19/local-theaters-anticipate-post-covid-salvation-with-barbenheimer-dual-release           
## 16794                                                                                  https://taggmagazine.com/barbenheimer/           
##                                                                             Link.Text
## 16792                     What to know about the 'Barbenheimer' double feature frenzy
## 16793 Local theaters anticipate post-COVID salvation with 'Barbenheimer' dual release
## 16794                                      Barbenheimer: The Great Bisexual Reckoning
##                                                                                                                                                                                                                                                                        Description
## 16792                                                                    After months of inescapable marketing, viral memes and crossover merch, two of the year's most anticipated movies hit theaters on Friday. Here's why so many people want to see both \342\200\224 and how to prep.
## 16793 A dual release of both Greta Gerwig\342\200\231s Barbie and Christopher Nolan\342\200\231s Oppenheimer on Friday have local theaters in Fort Collins and beyond hoping to bring moviegoers back into their seats. Will \342\200\230Barbenheimer,\342\200\231 as the double feature has been termed, be the cure?
## 16794                                                                                                                                                                                                                                    Barbie/Oppenheimer is giving us bi panic.
##       Sponsor.Id Sponsor.Name Sponsor.Category Total.Interactions..weighted.......Likes.1x.Shares.1x.Comments.1x.Love.1x.Wow.1x.Haha.1x.Sad.1x.Angry.1x.Care.1x..
## 16792         NA                                                                                                                                                0
## 16793         NA                                                                                                                                                0
## 16794         NA                                                                                                                                                0
##       Overperforming.Score
## 16792                  -28
## 16793                  -36
## 16794                  -38
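A quick aside on two oddities in the output above. The first column name, X...Page.Name, may be a sign that the csv starts with a byte order mark (BOM), an invisible marker some programs write at the start of UTF-8 files. That is an assumption about this file, not something the output proves. Sequences such as \342\200\231 are the UTF-8 bytes of a single character (here, a curly apostrophe) printed as escapes. The sketch below builds a small hypothetical csv with a BOM and reads it with the fileEncoding argument of read.csv(); bom_path and bom_file are made-up names.

```r
# Build a tiny csv that starts with the three BOM bytes (an assumed scenario)
bom_path <- tempfile(fileext = ".csv")
con <- file(bom_path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)                 # the BOM itself
writeBin(charToRaw("Page Name,Likes\nPage A,10\n"), con)   # the actual csv text
close(con)

# fileEncoding = "UTF-8-BOM" asks R to drop the BOM and treat the rest as UTF-8
bom_file <- read.csv(bom_path, fileEncoding = "UTF-8-BOM")
names(bom_file)  # the first column name should no longer carry stray characters
```

Whether this also changes how accented characters and emoji print depends on your R version and operating system, so treat it as something to try rather than a guaranteed fix.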

With so many variables, the tail() output above can be a little hard to read! Instead, you can use View() to look at the dataset in a spreadsheet-style viewer. Either type View(social_media_file) into the console or click the object name in the Global Environment.

View(social_media_file)
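Another option for wide datasets is to pick just a few columns before calling head() or tail(). Here is a minimal sketch on a made-up data frame (toy_posts and its values are hypothetical; only the column names are borrowed from the dataset above).

```r
# A small made-up data frame standing in for the real dataset
toy_posts <- data.frame(
  User.Name = c("page_a", "page_b", "page_c", "page_d"),
  Likes     = c(5L, 0L, 12L, 3L),
  Comments  = c(1L, 0L, 4L, 0L),
  Message   = c("First post", "Second post", "Third post", "Fourth post"),
  stringsAsFactors = FALSE
)

# Keep only two columns, then look at the last 2 rows
last_two <- tail(toy_posts[, c("User.Name", "Likes")], n = 2)
last_two
```

With the real data, you could swap in social_media_file and whichever column names you care about.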

5.1 Variables

If you want to look at a single variable, you can use the $ operator. For example, social_media_file$Message refers to the text variable (a variable is also known as a data “vector” or “field”). Let’s look at the first 6 rows of this column.

head(social_media_file$Message)
## [1] "Check out our Emo spotify playlist! http://bit.ly/EmoNeverDies:=:https://open.spotify.com/playlist/3czha2tGePoOcVI4xwa3j3"
## [2] "¡QUIERE SER KEN! 🩷 Cillian Murphy (Oppenheimer) comentó que esta abierto a interpretar a Ken en una futura secuela de 'Barbie' 🩷"
## [3] "The true Barbenheimer experience"
## [4] "¡BARBENHEIMER ES REAL! En un cine de Estados Unidos durante una función de 'Oppenheimer' hubo un fallo en el proyector y en los últimos 20 minutos la mitad de la pantalla se volvió rosa 😆 El Barbie x Oppenheimer se volvió canon."
## [5] "Barbenheimer สมัยตอนอยู่ Gotham 😅"
## [6] "Barbenheimer 2023. 🤝🏼 🎨: @justralphy"
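To see what the $ operator gives you, it can help to try it on something small. Here is a minimal sketch on a made-up data frame (toy_posts is hypothetical); it also shows the double-bracket form, which pulls out the same column.

```r
# A small made-up data frame with a text column
toy_posts <- data.frame(
  User.Name = c("page_a", "page_b", "page_c"),
  Message   = c("First post", "Second post", "Third post"),
  stringsAsFactors = FALSE
)

messages      <- toy_posts$Message       # pull out one column by name with $
same_messages <- toy_posts[["Message"]]  # the same column, using double brackets

class(messages)    # the type of data stored in the column
length(messages)   # one element per row of the data frame

first_two <- head(messages, n = 2)  # head() works on a single column too
first_two
```

The n argument works the same way here as it does for a whole data frame.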

How would you look at the first 10 rows in the text variable?