6.5 Lab: Twitter API

Below we will…

  • …download data from the Twitter Academic API.
  • …create a simple graph visualizing the number of tweets across time.

Please refer to the previous chapter for authentication. We start by loading the relevant packages:

# install.packages('pacman')
library(pacman) # provides p_load(), which installs (if needed) and loads packages
p_load('academictwitteR', 'tidyverse', 'lubridate')

Given current events, we are interested in tweets that mention Russia and were posted in February 2022. First, we count how many tweets per day contain a reference to Russia.

# Count tweets
data_tweet_count <- count_all_tweets(query = "Russia", #  lang:DE
                           start_tweets = "2022-02-01T00:00:00Z",
                           end_tweets = "2022-02-28T00:00:00Z",
                           granularity = "day",
                           n = 28) # same as number of days
head(data_tweet_count) # the data is in reverse chronological order
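
Because the counts come back in reverse chronological order, it can help to sort them before inspecting or tabulating them. The following is a minimal sketch on a toy data frame: `toy_counts` and its numbers are made up for illustration, and the column names `start`, `end`, and `tweet_count` are assumed to mirror the output of count_all_tweets().

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the output of count_all_tweets(); the counts are invented
toy_counts <- tibble(
  start = c("2022-02-03T00:00:00.000Z",
            "2022-02-02T00:00:00.000Z",
            "2022-02-01T00:00:00.000Z"),
  end = c("2022-02-04T00:00:00.000Z",
          "2022-02-03T00:00:00.000Z",
          "2022-02-02T00:00:00.000Z"),
  tweet_count = c(120L, 95L, 100L)
)

# Parse the ISO 8601 timestamps and sort from oldest to newest
toy_sorted <- toy_counts %>%
  mutate(date = as_date(ymd_hms(start))) %>%
  arrange(date)

toy_sorted
```

The same two steps should carry over to data_tweet_count, provided its columns are named as assumed here.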

Let’s visualize the number of tweets across time. The dashed red line marks 24 February 2022, the day the Russian invasion of Ukraine began.

ggplot(data = data_tweet_count,
       aes(x = as_date(start),
           y = tweet_count)) +
  geom_point() +
  geom_line() + 
  geom_vline(xintercept = as_date("2022-02-24"), 
             color = "red", lty = 2) +
  scale_x_date(date_breaks = "1 day", 
               date_labels =  "%d %b %Y") +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))

We can also count on an hourly basis.

# Count tweets
data_tweet_count <- count_all_tweets(query = "Russia", #  lang:DE
                           start_tweets = "2022-02-23T00:00:00Z",
                           end_tweets = "2022-02-24T23:50:00Z",
                           granularity = "hour",
                           n = 48) # same as number of hours
head(data_tweet_count) # the data is in reverse chronological order

Let’s visualize the hourly number of tweets across time.

ggplot(data = data_tweet_count,
       aes(x = ymd_hms(start),
           y = tweet_count)) +
  geom_point() +
  geom_line() + 
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1)) +
  xlab("Time") +
  ylab("Number of tweets")
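
Besides reading the peak off the plot, we can pull it out of the data directly. This is a small sketch on invented numbers: `toy_hourly` is a hypothetical stand-in for the hourly output of count_all_tweets(), with the column names `start` and `tweet_count` assumed to match.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for hourly counts; the numbers are invented
toy_hourly <- tibble(
  start = c("2022-02-24T05:00:00.000Z",
            "2022-02-24T04:00:00.000Z",
            "2022-02-24T03:00:00.000Z"),
  tweet_count = c(300L, 900L, 450L)
)

# Parse the timestamps (UTC) and keep the hour with the most tweets
peak_hour <- toy_hourly %>%
  mutate(hour_start = ymd_hms(start)) %>%
  slice_max(tweet_count, n = 1)

peak_hour
```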

Then we can start collecting tweets. We collect up to 5000 tweets that contain the word Russia (when we give a time period, collection starts with the most recent tweet in that period). Subsequently, we could analyze these tweets in some way.

# Search for tweets containing "Russia" within the time period
data <- get_all_tweets(query = "Russia", 
               start_tweets = "2022-02-24T02:00:00Z", 
               end_tweets = "2022-02-24T04:00:00Z", 
               n = 5000)

# Check out which date those tweets were posted
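
One way to check the posting dates is to parse the timestamp column and summarize it. The sketch below runs on a toy data frame rather than on the real download: `toy_tweets` and its contents are invented, and the column name `created_at` (an ISO 8601 string) is an assumption about what get_all_tweets() returns.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the output of get_all_tweets(); the contents are invented
toy_tweets <- tibble(
  id = c("1", "2", "3"),
  created_at = c("2022-02-24T03:59:58.000Z",
                 "2022-02-24T03:59:57.000Z",
                 "2022-02-24T02:00:01.000Z"),
  text = c("toy tweet a", "toy tweet b", "toy tweet c")
) %>%
  mutate(created_at = ymd_hms(created_at)) # parse as UTC date-times

# Earliest and latest posting time, and the number of tweets
date_range <- toy_tweets %>%
  summarise(earliest = min(created_at),
            latest = max(created_at),
            n_tweets = n())

# Number of tweets per hour
per_hour <- toy_tweets %>%
  count(hour = floor_date(created_at, "hour"))

date_range
per_hour
```

If the real data frame has the same column, replacing toy_tweets with data should give the corresponding overview for the collected tweets.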

Often we may be interested in particular Twitter users, e.g., elites such as politicians or journalists.

Below we collect tweets from February 2022 by the German foreign minister Annalena Baerbock and have a look at her profile (you can do the same for several IDs).

# Get ID number of twitter account
# The ID stays the same even if account name changes

# Get timeline for that ID
data_baerbock <- get_user_timeline("1469264387512979461",
                  start_tweets = "2022-02-01T00:00:00Z", 
                  end_tweets = "2022-02-28T00:00:00Z",
                  n = 200)

# Get the profile for that ID
data_profile <- get_user_profile("1469264387512979461")