6.5 Lab: Twitter API
Below we will…
- …download data from the Twitter Academic API.
- …create a simple graph visualizing the number of tweets across time.
Please refer to the previous chapter for authentication. We start by loading the relevant packages:
# install.packages('pacman')
library(pacman)
p_load('academictwitteR', 'tidyverse', 'lubridate')
Given current events, we are interested in tweets about Russia that were posted in February 2022. First, we count how many tweets contain a reference to Russia.
# Count tweets
data_tweet_count <- count_all_tweets(query = "Russia", # lang:DE
                                     start_tweets = "2022-02-01T00:00:00Z",
                                     end_tweets = "2022-02-28T00:00:00Z",
                                     granularity = "day",
                                     n = 28) # same as number of days
head(data_tweet_count) # the data is in reverse chronological order
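Because the counts come back in reverse chronological order, it can help to sort them before inspecting them. The following is a minimal sketch on a small made-up data frame: `mock_counts` and its numbers are invented for illustration and only mimic the `start` and `tweet_count` columns used in this lab; they are not real API output.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the API result: invented counts, newest day first
mock_counts <- tibble(
  start = c("2022-02-03T00:00:00.000Z",
            "2022-02-02T00:00:00.000Z",
            "2022-02-01T00:00:00.000Z"),
  tweet_count = c(300L, 200L, 100L)
)

# Parse the timestamp strings, keep only the date, and sort oldest to newest
counts_sorted <- mock_counts %>%
  mutate(day = as_date(ymd_hms(start))) %>%
  arrange(day)

counts_sorted
```

The same two steps should carry over to the real count data, assuming its `start` column uses the same ISO 8601 timestamp format.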
Let’s visualize the number of tweets across time.
ggplot(data = data_tweet_count,
       aes(x = as_date(start),
           y = tweet_count)) +
  geom_point() +
  geom_line() +
  geom_vline(xintercept = as_date("2022-02-24"),
             color = "red", lty = 2) +
  scale_x_date(date_breaks = "1 day",
               date_labels = "%d %b %Y") +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 1))
We can also count on an hourly basis.
# Count tweets
data_tweet_count <- count_all_tweets(query = "Russia", # lang:DE
                                     start_tweets = "2022-02-23T00:00:00Z",
                                     end_tweets = "2022-02-24T23:50:00Z",
                                     granularity = "hour",
                                     n = 48) # same as number of hours
head(data_tweet_count) # the data is in reverse chronological order
Let’s visualize the number of tweets across time.
ggplot(data = data_tweet_count,
       aes(x = ymd_hms(start),
           y = tweet_count)) +
  geom_point() +
  geom_line() +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 1)) +
  xlab("Time") +
  ylab("Number of tweets")
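Beyond plotting, hourly counts can be summarized directly. The sketch below uses an invented data frame (`mock_hourly`, with made-up numbers) that only mimics the `start` and `tweet_count` columns; it finds the hour with the most tweets and collapses the hourly counts into daily totals.

```r
library(dplyr)
library(lubridate)

# Hypothetical hourly counts (invented numbers), newest hour first
mock_hourly <- tibble(
  start = c("2022-02-24T01:00:00.000Z",
            "2022-02-24T00:00:00.000Z",
            "2022-02-23T23:00:00.000Z",
            "2022-02-23T22:00:00.000Z"),
  tweet_count = c(50L, 40L, 20L, 10L)
)

# Parse the ISO 8601 strings into date-times (UTC)
mock_hourly <- mock_hourly %>%
  mutate(start_time = ymd_hms(start))

# Hour with the highest tweet count
peak_hour <- mock_hourly %>%
  slice_max(tweet_count, n = 1)

# Collapse the hourly counts into one total per calendar day
daily_totals <- mock_hourly %>%
  group_by(day = as_date(start_time)) %>%
  summarise(tweet_count = sum(tweet_count))

peak_hour
daily_totals
```

Note that the days here are UTC days, since the timestamps are parsed as UTC; a different time zone would shift which hours fall on which day.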
Then we can start collecting tweets. We collect up to 5000 tweets that contain the word Russia; when a time period is given, collection starts with the most recent tweet in that period. Subsequently, we could analyze these tweets in some way.
# Search for tweets containing "Russia" within the time period
data <- get_all_tweets(query = "Russia",
                       start_tweets = "2022-02-24T02:00:00Z",
                       end_tweets = "2022-02-24T04:00:00Z",
                       n = 5000) # upper limit of tweets to collect
# Check when those tweets were posted
table(data$created_at)
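Tabulating raw timestamps yields one entry per distinct second, so rounding the timestamps first usually gives a more readable overview. Here is a minimal sketch on an invented data frame: `mock_tweets` is hypothetical and only mimics a `created_at` column with ISO 8601 strings.

```r
library(dplyr)
library(lubridate)

# Hypothetical collected tweets (invented), newest first
mock_tweets <- tibble(
  created_at = c("2022-02-24T03:59:58.000Z",
                 "2022-02-24T03:10:00.000Z",
                 "2022-02-24T02:30:00.000Z"),
  text = c("tweet a", "tweet b", "tweet c")
)

# Round each timestamp down to the full hour and count tweets per hour
tweets_per_hour <- mock_tweets %>%
  mutate(hour = floor_date(ymd_hms(created_at), unit = "hour")) %>%
  count(hour)

tweets_per_hour
```

Changing `unit` to `"day"` or `"10 minutes"` gives coarser or finer bins in the same way.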
Often we may be interested in particular Twitter users, e.g., elites such as politicians or journalists.
Below we collect recent tweets of the German foreign minister Baerbock and have a look at her profile (you can do the same for several IDs).
# Get ID number of twitter account
# The ID stays the same even if account name changes
get_user_id("ABaerbock")
# Get timeline for that ID
data_baerbock <- get_user_timeline("1469264387512979461",
                                   start_tweets = "2022-02-01T00:00:00Z",
                                   end_tweets = "2022-02-28T00:00:00Z",
                                   n = 200)
# Get the profile
data_profile <- get_user_profile("1469264387512979461")