6.5 Lab: Twitter API

Below we will…

  • …download data from the Twitter Academic API.
  • …create a simple graph visualizing the number of tweets across time.

Please refer to the previous chapter for authentication. We start by loading the relevant packages:

# install.packages('pacman')
library(pacman)
p_load('academictwitteR', 'tidyverse', 'lubridate')

Given current events, we are interested in tweets that mention Russia and were posted in February 2022. First, we count how many tweets per day contain a reference to Russia.

# Count tweets
data_tweet_count <- count_all_tweets(query = "Russia", # append " lang:de" to restrict to German tweets
                           start_tweets = "2022-02-01T00:00:00Z",
                           end_tweets = "2022-02-28T00:00:00Z",
                           granularity = "day",
                           n = 28) # upper limit on rows returned: at least the number of days
head(data_tweet_count) # the data is in reverse chronological order
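
Because the counts come back newest first, it can help to sort them chronologically before summarising. The sketch below is only an illustration: it uses a small made-up data frame with the same `start` and `tweet_count` columns used in this lab (the numbers are invented, not real API output) and base R only, so it runs without a bearer token.

```r
# Mock data mimicking the two columns used in this lab; the counts are invented
mock_counts <- data.frame(
  start = c("2022-02-03T00:00:00.000Z",
            "2022-02-02T00:00:00.000Z",
            "2022-02-01T00:00:00.000Z"),
  tweet_count = c(120L, 95L, 80L),
  stringsAsFactors = FALSE
)

# The first 10 characters of the timestamp are the date (YYYY-MM-DD)
mock_counts$day <- as.Date(substr(mock_counts$start, 1, 10))

# Sort from oldest to newest
counts_sorted <- mock_counts[order(mock_counts$day), ]

# Total number of tweets over the whole period
total_tweets <- sum(counts_sorted$tweet_count)
print(counts_sorted[, c("day", "tweet_count")])
print(total_tweets)
```

With real data, the same steps should apply to `data_tweet_count` directly, assuming its `start` column has the timestamp format shown in the mock data.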

Let’s visualize the number of tweets across time.

ggplot(data = data_tweet_count,
       aes(x = as_date(start),
           y = tweet_count)) +
  geom_point() +
  geom_line() + 
  geom_vline(xintercept = as_date("2022-02-24"), 
             color = "red", lty = 2) +
  scale_x_date(date_breaks = "1 day", 
               date_labels =  "%d %b %Y") +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))

We can also count on an hourly basis.

# Count tweets per hour
data_tweet_count <- count_all_tweets(query = "Russia", # append " lang:de" to restrict to German tweets
                           start_tweets = "2022-02-23T00:00:00Z",
                           end_tweets = "2022-02-24T23:50:00Z",
                           granularity = "hour",
                           n = 48) # upper limit on rows returned: 48 hours in two days
head(data_tweet_count) # the data is in reverse chronological order
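
With hourly counts, a natural first question is when activity peaked. The following is a minimal sketch on invented data (the column names `start` and `tweet_count` follow this lab, the values are made up) that parses the hourly timestamps and picks the busiest hour using base R.

```r
# Mock hourly counts in reverse chronological order; the values are invented
mock_hourly <- data.frame(
  start = c("2022-02-24T05:00:00.000Z",
            "2022-02-24T04:00:00.000Z",
            "2022-02-24T03:00:00.000Z",
            "2022-02-24T02:00:00.000Z"),
  tweet_count = c(900L, 1500L, 700L, 300L),
  stringsAsFactors = FALSE
)

# Keep "YYYY-MM-DDTHH:MM:SS" and parse it as a UTC date-time
mock_hourly$start_time <- as.POSIXct(substr(mock_hourly$start, 1, 19),
                                     format = "%Y-%m-%dT%H:%M:%S",
                                     tz = "UTC")

# Row with the highest hourly count
peak_row <- mock_hourly[which.max(mock_hourly$tweet_count), ]
peak_hour <- format(peak_row$start_time, "%Y-%m-%d %H:%M", tz = "UTC")
print(peak_hour)
```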

Let’s visualize the hourly number of tweets across time.

ggplot(data = data_tweet_count,
       aes(x = ymd_hms(start),
           y = tweet_count)) +
  geom_point() +
  geom_line() + 
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1)) +
  xlab("Time") +
  ylab("Number of tweets")

Then we can start collecting tweets. We collect up to 5000 tweets that contain the word Russia. When we give a time period, collection starts with the most recent tweet in that period. Subsequently, we could analyze these tweets in some way.

# Search for tweets containing "Russia" within the time period
data <- get_all_tweets(query = "Russia", 
               start_tweets = "2022-02-24T02:00:00Z", 
               end_tweets = "2022-02-24T04:00:00Z", 
               n = 5000)

# Check on which date(s) those tweets were posted
table(as_date(data$created_at))
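
Since collection starts with the most recent tweet and stops once the limit is reached, the collected tweets may cover only the end of the requested window. The sketch below shows one way to check this, assuming `created_at` holds ISO 8601 strings like the ones in the mock data (the four timestamps are invented for illustration).

```r
# Mock tweets; the timestamps are invented and only mimic the assumed format
mock_tweets <- data.frame(
  created_at = c("2022-02-24T03:59:58.000Z",
                 "2022-02-24T03:59:41.000Z",
                 "2022-02-24T03:58:07.000Z",
                 "2022-02-24T02:15:30.000Z"),
  stringsAsFactors = FALSE
)

# First 10 characters = date, first 13 characters = date plus hour
tweets_per_date <- table(substr(mock_tweets$created_at, 1, 10))
tweets_per_hour <- table(substr(mock_tweets$created_at, 1, 13))

# Earliest and latest timestamp (same-format ISO strings sort chronologically)
time_span <- range(mock_tweets$created_at)

print(tweets_per_hour)
print(time_span)
```

If the earliest timestamp is much later than `start_tweets`, the limit `n` was probably reached before the whole period was covered.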

Often we may be interested in particular Twitter users, e.g., elites such as politicians or journalists.

Below we collect tweets of the German foreign minister Annalena Baerbock and have a look at her profile (you can do the same for several IDs).

# Get ID number of twitter account
# The ID stays the same even if account name changes
get_user_id("ABaerbock")

# Get timeline for that ID
data_baerbock <- get_user_timeline("1469264387512979461",
                  start_tweets = "2022-02-01T00:00:00Z", 
                  end_tweets = "2022-02-28T00:00:00Z",
                  n = 200)

# Get the profile for that ID
data_profile <- get_user_profile("1469264387512979461")