Chapter 20 Spotify API

Johanna Mehltretter, Marie-Lou Sohnius

You will need to install the following packages for this chapter (run the code):

# install.packages('pacman')
library(pacman)
pacman::p_load('spotifyr',  # To access the API
               'tidyverse', # Data wrangling and plots
               'plotly',    # Interactive plots
               'ggimage',   # Adding album covers to ggplot
               'kableExtra',# Format tables
               'httpuv',    # To be able to access the Spotify URL
               'httr')      # In case you want to access the API w/o
                            # the package

20.1 Provided services/data

  • What data/service is provided by the API?

The Spotify Web API allows you to pull data from the platform on listed artists, albums, tracks, and playlists. Possible requests include getting information on track audio features (e.g., danceability, energy, or tempo) as well as popularity metrics of single tracks and albums. Beyond these general query options, you can also collect data on individual users’ (including your own) listening behavior. Accessing personal information, however, depends on users’ consent.

20.2 Prerequisites

  • What are the prerequisites to access the API (authentication)?

20.2.1 Authentication

To access the Spotify API, you need to have a Spotify account. Don’t have one yet? Then sign up for free here! It does not matter whether you have a Premium account or not. Once you’re ready to use your Spotify account, you can set up a developer account to access the Spotify Web API. With the developer access, you can create new integrations and manage your Spotify credentials.

Once you have a developer account, you will need to create an app on the dashboard page.

Figure 1: Create the app on the dashboard (Screenshot from the Spotify for Developers page)

20.2.2 Authorization code flow

If you want to access information on your own account (e.g. your favorite artists, your playlists), you need to complete one additional step: Go to the app you created in the Spotify API dashboard, go to edit settings and add a Redirect URI. A recommended option for your Redirect URI is http://localhost:1410/. See Spotify’s developer guide for more information.

Figure 2: Open settings of the app (Screenshot from the Spotify for Developers page)
Figure 3: Add the Redirect URI (http://localhost:1410/) to the settings (Screenshot from the Spotify for Developers page)

20.2.3 Safely storing your credentials in the R environment

Via the app you created, you receive your client ID and client secret. You can find them on the app page in your Spotify developer account. Save both credentials in the R environment. For accessing your personal data, also add the redirect URI (the same one you added in your app’s settings!):

# Here you can store the credentials as follows:

# Sys.setenv(SPOTIFY_CLIENT_ID="xxx") #
# Sys.setenv(SPOTIFY_CLIENT_SECRET="xxx") #
# Sys.setenv(SPOTIFY_REDIRECT_URI="http://localhost:1410/") #

# Beware: The names of these environment variables are case-sensitive and
# must be written exactly as above to work correctly with the spotifyr
# package.

access_token <- get_spotify_access_token() # Requests an access token with
# your client ID and client secret and stores it in a local object
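
Before sending any request, it can help to confirm that the credentials are actually visible to your R session. The following is a minimal sketch: the helper name check_spotify_env() is our own invention and not part of spotifyr. It only checks whether the three environment variables described above are non-empty and never prints their values.

```r
# Hypothetical helper (not part of spotifyr): report which of the
# environment variables used in this chapter are set in the session.
check_spotify_env <- function(vars = c("SPOTIFY_CLIENT_ID",
                                       "SPOTIFY_CLIENT_SECRET",
                                       "SPOTIFY_REDIRECT_URI")) {
  values <- Sys.getenv(vars, unset = "")
  is_set <- nzchar(values)          # TRUE where a non-empty value exists
  names(is_set) <- vars
  not_set <- vars[!is_set]
  if (length(not_set) > 0) {
    message("Not set: ", paste(not_set, collapse = ", "))
  }
  is_set
}

check_spotify_env()
```

If you prefer not to call Sys.setenv() in every session, one option is to place the same three name=value pairs in your .Renviron file, which R reads at startup.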

20.3 Simple API call

  • What does a simple API call look like?

In this chapter, we solely focus on how to make API calls via the corresponding spotifyr package. There also exists the option to access the Spotify API using the more common httr package. The latter option is more complicated than using the customized spotifyr package, though, as the manual authentication process is more difficult to implement. If you are interested in making API calls without the package, have a look at this detailed guide by Ray Heberer.
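
To give a rough idea of what the httr route involves, here is a minimal sketch using the client credentials flow, which covers public catalogue data but not personal user data. The function names get_token_httr() and search_tracks_httr() are hypothetical, and the search term is only an illustration. The sketch assumes the client ID and client secret are stored in the environment variables described in Section 20.2.3, and it only contacts the API if both are set.

```r
# Sketch of a raw API call with httr (client credentials flow).
# Function names are hypothetical; credentials are read from the
# environment variables set in Section 20.2.3.

# Exchange client ID and client secret for a short-lived access token.
get_token_httr <- function(id = Sys.getenv("SPOTIFY_CLIENT_ID"),
                           secret = Sys.getenv("SPOTIFY_CLIENT_SECRET")) {
  res <- httr::POST(
    "https://accounts.spotify.com/api/token",
    httr::authenticate(id, secret),            # HTTP Basic authentication
    body = list(grant_type = "client_credentials"),
    encode = "form"
  )
  httr::stop_for_status(res)
  httr::content(res)$access_token
}

# Search the catalogue for tracks and return the parsed JSON as a list.
search_tracks_httr <- function(query, token, limit = 5) {
  res <- httr::GET(
    "https://api.spotify.com/v1/search",
    query = list(q = query, type = "track", limit = limit),
    httr::add_headers(Authorization = paste("Bearer", token))
  )
  httr::stop_for_status(res)
  httr::content(res)
}

# Only contact the API if credentials are available.
if (nzchar(Sys.getenv("SPOTIFY_CLIENT_ID")) &&
    nzchar(Sys.getenv("SPOTIFY_CLIENT_SECRET"))) {
  token <- get_token_httr()
  hits <- search_tracks_httr("Another Love", token)
  # The matching tracks sit in hits$tracks$items
  print(vapply(hits$tracks$items, function(x) x$name, character(1)))
}
```

Compared with this, spotifyr hides the token exchange and the JSON parsing behind single function calls, which is why the rest of the chapter relies on the package.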

20.4 API access in R

  • How can we access the API from R (httr + other packages)?

Instead of typing the API request into our browser or using the httr package, we can use the spotifyr package to easily access the API from R.

Note: For information on all possible queries available through the spotifyr wrapper, see the online documentation or this detailed introduction to the package which also informed this chapter. You can also check the R built-in help function on the package:

library(spotifyr)
?spotifyr

20.4.1 Playlist features

So let’s submit our first query to the API! In this example, we are interested in the features of the current global top 50 tracks. To get this information, we first retrieve the Spotify playlist ID by opening the playlist we want to analyze in the browser and then copying the ID part from the link. In our example of the Spotify global top 50 playlist:

Playlist link: https://open.spotify.com/playlist/37i9dQZEVXbMDoHDwVN2tF

Playlist ID: 37i9dQZEVXbMDoHDwVN2tF
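
If you collect several playlist links, copying the ID by hand gets tedious. As a small sketch, the ID can also be extracted with a regular expression. The helper extract_playlist_id() is hypothetical (it is not part of spotifyr) and assumes links of the form shown above, optionally followed by a query string such as the one added by Spotify's share button.

```r
# Hypothetical helper: pull the playlist ID out of a Spotify playlist link.
extract_playlist_id <- function(link) {
  # Keep the run of letters and digits that follows "/playlist/";
  # anything after it (for example "?si=...") is dropped.
  sub("^.*/playlist/([A-Za-z0-9]+).*$", "\\1", link)
}

extract_playlist_id("https://open.spotify.com/playlist/37i9dQZEVXbMDoHDwVN2tF")
# returns "37i9dQZEVXbMDoHDwVN2tF"
```

Note that sub() returns its input unchanged when the pattern does not match, so a link without "/playlist/" comes back as is.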

Now that we have the ID, we can retrieve all information on tracks in the playlist by calling the function get_playlist_audio_features.

# Store the data in a dataframe
top50 <- get_playlist_audio_features(playlist_uris = '37i9dQZEVXbMDoHDwVN2tF') 
# Global Top 50


# Add the tracks' rank to the dataset:
# the data comes sorted as listed in the playlist but does not contain a
# specific variable indicating the rank. Therefore, we create a new 
# variable that contains the rank in ascending order, ranging from 1 to
# 50.

top50$rank <- seq.int(nrow(top50)) 


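# So far, so good. Looking at the data, artist names are currently stored
# in a list column (one data frame of artists per track).
# The next snippet copies the first-listed artist of each track into a new
# variable for easier access. We also store the link to the second album
# cover image in a new variable, image, to plot the covers later.
# Columns are addressed by name rather than by position so that the code
# keeps working if the column order changes.

top50$artist <- vapply(top50$track.artists,
                       function(a) a$name[1], character(1))
top50$image  <- vapply(top50$track.album.images,
                       function(img) img$url[2], character(1))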
# Now that we have the data set ready to go, let's take a look at what
# variables are in there.
names(top50) %>% 
  kbl() %>% 
  kable_styling(bootstrap_options = c("hover")) %>% 
  scroll_box(width = "100%", height = "300px")
x
playlist_id
playlist_name
playlist_img
playlist_owner_name
playlist_owner_id
danceability
energy
key
loudness
mode
speechiness
acousticness
instrumentalness
liveness
valence
tempo
track.id
analysis_url
time_signature
added_at
is_local
primary_color
added_by.href
added_by.id
added_by.type
added_by.uri
added_by.external_urls.spotify
track.artists
track.available_markets
track.disc_number
track.duration_ms
track.episode
track.explicit
track.href
track.is_local
track.name
track.popularity
track.preview_url
track.track
track.track_number
track.type
track.uri
track.album.album_group
track.album.album_type
track.album.artists
track.album.available_markets
track.album.href
track.album.id
track.album.images
track.album.is_playable
track.album.name
track.album.release_date
track.album.release_date_precision
track.album.total_tracks
track.album.type
track.album.uri
track.album.external_urls.spotify
track.external_ids.isrc
track.external_urls.spotify
video_thumbnail.url
key_name
mode_name
key_mode
rank
artist
image
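
The nested columns track.artists and track.album.images are the least intuitive part of this data set, so here is a self-contained sketch that reproduces the extraction step on toy data. All values (track names, artist names, URLs) are made up for illustration. The only assumption about the real data is its shape: each cell holds a small data frame, with a name column for artists and a url column for images.

```r
# Toy data mimicking the nested structure (all values are made up):
# each row holds a data frame of artists and a data frame of cover images.
make_images <- function(prefix) {
  data.frame(height = c(640, 300, 64),
             url = paste0("https://example.com/", prefix, "_",
                          c(640, 300, 64), ".jpg"),
             stringsAsFactors = FALSE)
}

toy <- data.frame(track.name = c("Song A", "Song B"),
                  stringsAsFactors = FALSE)
toy$track.artists <- list(
  data.frame(name = c("Artist 1", "Artist 2"), stringsAsFactors = FALSE),
  data.frame(name = "Artist 3", stringsAsFactors = FALSE)
)
toy$track.album.images <- list(make_images("a"), make_images("b"))

# First-listed artist of each track
toy$artist <- vapply(toy$track.artists,
                     function(a) a$name[1], character(1))

# All artists of a track collapsed into one string
toy$all_artists <- vapply(toy$track.artists,
                          function(a) paste(a$name, collapse = ", "),
                          character(1))

# URL of the second image in each track's image table
toy$image <- vapply(toy$track.album.images,
                    function(img) img$url[2], character(1))

toy[, c("track.name", "artist", "all_artists", "image")]
```

Using vapply() with character(1) makes the code fail loudly if a cell does not yield exactly one string, which is easier to debug than a silently truncated value.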

In the next step, we want to take a closer look at track popularity. That is, how does a track’s rank on the top 50 playlist correlate with Spotify’s popularity measure? Note that Spotify does not calculate this index solely from a track’s recent stream count but also takes other metrics into account. Beyond that, we’ll have a look at more fun features such as a track’s danceability and valence (happiness).

top50 %>% select(rank, track.name, artist, track.popularity,
                 danceability, valence) %>% 
  kbl() %>%
  kable_styling(bootstrap_options = c("hover")) %>% 
  scroll_box(width = "100%", height = "300px")
rank | track.name | artist | track.popularity | danceability | valence
1 | Die For You - Remix | The Weeknd | 92 | 0.531 | 0.5020
2 | TQG | KAROL G | 92 | 0.720 | 0.6070
3 | Flowers | Miley Cyrus | 100 | 0.707 | 0.6460
4 | Kill Bill | SZA | 93 | 0.644 | 0.4180
5 | Boy’s a liar Pt. 2 | PinkPantheress | 94 | 0.696 | 0.8570
6 | As It Was | Harry Styles | 89 | 0.520 | 0.6620
7 | Shakira: Bzrp Music Sessions, Vol. 53 | Bizarrap | 96 | 0.778 | 0.4980
8 | Creepin’ (with The Weeknd & 21 Savage) | Metro Boomin | 94 | 0.715 | 0.1720
9 | Yandel 150 | Yandel | 85 | 0.783 | 0.5800
10 | Calm Down (with Selena Gomez) | Rema | 92 | 0.801 | 0.8020
11 | Unholy (feat. Kim Petras) | Sam Smith | 84 | 0.712 | 0.2060
12 | Tormenta (feat. Bad Bunny) | Gorillaz | 85 | 0.637 | 0.2970
13 | Shorty Party | Cartel De Santa | 79 | 0.928 | 0.4700
14 | La Bachata | Manuel Turizo | 93 | 0.835 | 0.8500
15 | I’m Good (Blue) | David Guetta | 93 | 0.561 | 0.3040
16 | Die For You | The Weeknd | 87 | 0.586 | 0.5080
17 | Last Night | Morgan Wallen | 77 | 0.492 | 0.4780
18 | Anti-Hero | Taylor Swift | 91 | 0.637 | 0.5330
19 | Escapism. | RAYE | 78 | 0.538 | 0.2500
20 | Here With Me | d4vd | 91 | 0.574 | 0.2880
21 | OMG | NewJeans | 91 | 0.804 | 0.7390
22 | Hey Mor | Ozuna | 89 | 0.901 | 0.3990
23 | Until I Found You (with Em Beihold) - Em Beihold Version | Stephen Sanchez | 90 | 0.551 | 0.3420
24 | Starboy | The Weeknd | 89 | 0.679 | 0.4860
25 | Another Love | Tom Odell | 60 | 0.442 | 0.1390
26 | X SI VOLVEMOS | KAROL G | 81 | 0.794 | 0.5750
27 | golden hour | JVKE | 88 | 0.515 | 0.1530
28 | La Jumpa | Arcángel | 85 | 0.887 | 0.1970
29 | Players | Coi Leray | 89 | 0.954 | 0.6240
30 | Quevedo: Bzrp Music Sessions, Vol. 52 | Bizarrap | 91 | 0.621 | 0.5500
31 | I Wanna Be Yours | Arctic Monkeys | 90 | 0.464 | 0.4790
32 | I Ain’t Worried | OneRepublic | 45 | 0.697 | 0.8220
33 | AMG | Natanael Cano | 89 | 0.772 | 0.7860
34 | ceilings | Lizzy McAlpine | 86 | 0.516 | 0.2610
35 | Snooze | SZA | 89 | 0.559 | 0.3920
36 | Feliz Cumpleaños Ferxxo | Feid | 86 | 0.865 | 0.5630
37 | Sure Thing | Miguel | 90 | 0.684 | 0.4980
38 | PRC | Peso Pluma | 88 | 0.784 | 0.8930
39 | Bones | Imagine Dragons | 83 | 0.772 | 0.5870
40 | Just Wanna Rock | Lil Uzi Vert | 89 | 0.486 | 0.0385
41 | on the street (with J. Cole) | j-hope | 83 | 0.681 | 0.8130
42 | Ditto | NewJeans | 89 | 0.814 | 0.1830
43 | Mockingbird | Eminem | 89 | 0.637 | 0.2540
44 | Under The Influence | Chris Brown | 89 | 0.733 | 0.3100
45 | Me Porto Bonito | Bad Bunny | 89 | 0.911 | 0.4250
46 | Superhero (Heroes & Villains) [with Future & Chris Brown] | Metro Boomin | 88 | 0.526 | 0.4920
47 | Bebe Dame | Fuerza Regida | 80 | 0.529 | 0.7560
48 | Blinding Lights | The Weeknd | 88 | 0.514 | 0.3340
49 | Rich Flex | Drake | 89 | 0.561 | 0.4240
50 | Watch This - ARIZONATEARS Pluggnb Remix | Lil Uzi Vert | 85 | 0.686 | 0.3550

Let’s plot this data!

f1 <-
  ggplot(data = top50, aes(x = track.popularity, y = rank, text = (
    paste(
      "Track:",
      track.name,
      "<br>",
      "Artist:",
      artist,
      "<br>",
      "Release date:",
      track.album.release_date
    )
  ))) +
  geom_point() +
  theme_minimal() +
  ylab("Playlist rank") +
  xlab("Popularity") 

# This code snippet creates an interactive version of our plot that allows you to
# hover over each data point to receive more information.
ggplotly(f1, tooltip = c("text"))

We can see in the graph above that tracks in the top 50 playlist generally score high on popularity, although some tracks have a comparatively low popularity score. When you look at the interactive plotly graph, you’ll be able to identify outliers that score below 70 on Spotify’s popularity scale despite being in the charts, for example Another Love by Tom Odell (2013).

While the exact calculation of the score is confidential, there is evidence that the age of a track factors into its popularity score. That way, two tracks with 100,000 streams can have different popularity scores depending on when they were released. In the algorithm’s logic, the more recent track gained the same number of streams in a shorter time and is therefore evaluated as more popular.

Applying this to our outliers, the age of an older track such as Another Love likely contributes to its low popularity score.
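
To complement the visual impression with a number, the following sketch computes a rank correlation between playlist position and popularity and lists the positions of tracks scoring below 70. To keep it runnable without API access, the popularity scores are typed in from the table above, in playlist order. With live data you would use top50$rank and top50$track.popularity instead. The variable names are our own.

```r
# Popularity scores copied from the table above, in playlist order.
popularity <- c(92, 92, 100, 93, 94, 89, 96, 94, 85, 92,
                84, 85, 79, 93, 93, 87, 77, 91, 78, 91,
                91, 89, 90, 89, 60, 81, 88, 85, 89, 91,
                90, 45, 89, 86, 89, 86, 90, 88, 83, 89,
                83, 89, 89, 89, 89, 88, 80, 88, 89, 85)
playlist_rank <- seq_along(popularity)

# Spearman rank correlation between playlist position and popularity.
# A negative value means that tracks higher up the playlist (smaller
# rank number) tend to have higher popularity scores.
cor(playlist_rank, popularity, method = "spearman")

# Playlist positions of tracks with a popularity score below 70
outlier_ranks <- playlist_rank[popularity < 70]
outlier_ranks
# returns 25 32
```

Positions 25 and 32 correspond to Another Love and I Ain’t Worried in the table. Spearman's coefficient is used here because playlist rank is an ordinal variable, but this is a design choice of the sketch rather than the only reasonable option.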

Spotify also provides you with album cover links in varying sizes, so why not use the covers instead of black scatter points in a plot? Where we previously used geom_point for the scatterplot, we now simply replace that command with geom_image and specify the variable containing the image link. In the following plot, we explore the correlation between track happiness and danceability.

ggplot(data = top50, aes(x = valence, y = danceability, text = (
    paste(
      "Track:",
      track.name,
      "<br>",
      "Artist:",
      artist
    )
  ))) +
  geom_image(aes(image=image), asp = 1.7) +
  theme_minimal() +
  ylab("Danceability") +
  xlab("Happiness") 

20.4.2 Your Spotify data

You can also analyze your personal listening behavior with the Spotify API. For example, this snippet using the get_my_top_artists_or_tracks function allows you to explore your favorite artists of all time.

## Finding all time favorite artists
topartists <- get_my_top_artists_or_tracks(type = 'artists',
  time_range = 'long_term', limit = 50) %>%
  select(name, genres) %>%
  rowwise %>%
  mutate(genres = paste(genres, collapse = ', ')) %>%
  ungroup
topartists$rank <- seq.int(nrow(topartists)) # add rank variable

topartists %>% 
  select(rank, name, genres) %>% 
  kbl() %>%
  kable_styling(bootstrap_options = c("hover")) %>% 
  scroll_box(width = "100%", height = "300px")

You can retrieve your all-time favorite tracks with the same get_my_top_artists_or_tracks function by changing the type argument to type = 'tracks'.
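
As a sketch of that variant, the wrapper below requests your top tracks and returns a compact table. The function name get_my_top_tracks_table() is hypothetical, and the column names name and popularity are assumed to be present in the data frame that spotifyr returns for tracks, so check names() on your own result if the call fails. The request itself only runs in an interactive session with credentials set, because it opens the browser-based authorization described in Section 20.2.2.

```r
# Sketch: all-time favorite tracks as a compact table.
# Requires the credentials and redirect URI from Section 20.2.
get_my_top_tracks_table <- function(limit = 50) {
  tracks <- spotifyr::get_my_top_artists_or_tracks(
    type = "tracks",
    time_range = "long_term",   # alternatives: "medium_term", "short_term"
    limit = limit
  )
  # Column names below are assumptions about the returned data frame.
  data.frame(
    rank = seq_len(nrow(tracks)),
    name = tracks$name,
    popularity = tracks$popularity,
    stringsAsFactors = FALSE
  )
}

# Only contact the API when run interactively with credentials available.
if (interactive() && nzchar(Sys.getenv("SPOTIFY_CLIENT_ID"))) {
  print(head(get_my_top_tracks_table()))
}
```

Changing time_range to "short_term" or "medium_term" shifts the focus from all-time favorites to more recent listening.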

As you have seen, the Spotify API opens up many opportunities for data analysis. The functionality of the spotifyr wrapper goes beyond the simple examples demonstrated here. Now it’s your turn to explore the data!

20.5 Social science examples

  • Are there social science research examples using the API?

So far, not much has been done with the Spotify API and music data in the social sciences (this is where you could step in!). Some notable exceptions are listed below:

  • MacTaggart (2018) and Lacognata and Poole (2021) both looked at the link between music and politics. At an aggregate level, MacTaggart (2018) examined the association of chart trends within pop music and trends within politics in the US from 1959 to 2016. The author finds that musical trends reflect trends in society and politics.

  • Lacognata and Poole (2021) investigated a potential correlation of political orientation, personality traits, and music taste. The authors used individual-level survey data and linked these data to respondents’ Spotify accounts. However, they did not find any association between music taste and either political orientation or personality traits.

  • In a recent poster presentation, Song, Chun, and Elkins (2021) examined how the Covid-19 pandemic affected emotion-driven listening behavior with the help of the Spotify API. The authors used monthly Spotify chart data from December 2019 to December 2021 and extracted measures of the valence, energy, and danceability of the tracks (we already know these measures from our own example on the global top 50 tracks). The authors did not find any correlation of tracks’ energy and danceability with Covid-19-related events. At the same time, Song et al. did find that the tracks’ valence reflects the course of the pandemic: when the Covid-19 vaccine was distributed, people listened to happier music, whereas when the news of the Omicron variant spread, less happy tracks became more popular.

  • For more inspiration, you can also check out Spotify’s Developer Showcase website!

What will you do with the Spotify API?

20.6 References

Lacognata, Anthony, and Jennifer Poole. 2021. “The Melodies of Politics: Assessing a Correlation Between Music Taste and Political Views with Spotify.” Journal of Student Research 10 (November). https://doi.org/10.47611/jsrhs.v10i3.1898.
MacTaggart, Andrew. 2018. “Pop Music and Politics: Tracking Political Trends Through the Hot 100 Chart, 1959-2016.” Harvard University. https://www.proquest.com/docview/2487180443?fromopenview=true&pq-origsite=gscholar&parentSessionId=PiWLVy66xd%2FXPi84svQkAEg0ze49z52W1qhA3%2BBQlVs%3D.
Song, Hemmi, Jon Chun, and Katherine Elkins. 2021. “Tracking the Emotion of Music Across the Covid-19 Pandemic.” In. https://doi.org/10.3109/10673229.2011.549769.