4.13 Exercise: Scraping data from APIs

  1. Meetup is a website on which users can organize meetings. Find out whether Meetup has an API and whether there are any R packages that may help you analyze data from this API. Try to collect some Meetup data using the API.

  2. In the previous lab “Lab: Scraping data from APIs” we discussed the New York Times API and the role of functions/loops in collecting data from such APIs.

    1. Please create an account in order to get an API key: https://developer.nytimes.com/
    2. In the lab we defined two functions, nyt_count and nyt_years_count (which requires nyt_count). First, closely inspect the two functions, i.e., what steps they contain. Second, modify the function nyt_years_count so that you can feed it a vector of search terms (instead of a single search term). The new function should return a data frame that contains the search terms as variables (columns) and the counts over the years in the rows. If you store the results in a list, you can use as.data.frame(LIST) to convert the list to a data frame. Call this new function nyt_years_count_qlist.
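
One possible shape for such a function is sketched below. It is only a starting point, not the lab's solution: the argument names (q, yearinit, yearend) and the assumption that the per-term function returns one count per year are guesses about nyt_years_count and may need adjusting. The counting function is passed in as count_fun, and fake_count is a made-up stand-in so the sketch runs without an API key.

```r
# Sketch: apply a per-term counting function to a vector of search terms.
# `count_fun` stands in for the lab's nyt_years_count; its argument names
# (q, yearinit, yearend) are an assumption, adjust them to your definition.
nyt_years_count_qlist <- function(q, yearinit, yearend, count_fun) {
  # One vector of yearly counts per search term, collected in a list
  results <- lapply(q, function(term) {
    count_fun(q = term, yearinit = yearinit, yearend = yearend)
  })
  names(results) <- q

  # Search terms become columns; multi-word terms get syntactic names
  # (e.g. "climate change" becomes climate.change)
  out <- as.data.frame(results)
  rownames(out) <- seq(yearinit, yearend)
  out
}

# Offline stand-in that returns made-up counts without calling the API
fake_count <- function(q, yearinit, yearend) {
  nchar(q) * seq_along(yearinit:yearend)
}

counts <- nyt_years_count_qlist(c("brexit", "trump"), 2015, 2017,
                                count_fun = fake_count)
print(counts)
```

Passing count_fun = nyt_years_count instead of fake_count would run the same loop against the real API, provided the lab function accepts these arguments.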

4.13.1 Homework: APIs for social scientists

Objective: Collaborative writing of API reviews

For now the idea is to keep the corresponding reviews as intel for our course. Later on we can vote on whether it makes sense to publish them in some form.

Pick an API that interests you and find out whether and how it can be accessed. You may have to switch away from your first choice if there is no easy way to access it. Often you will have to register for API access in some way, which can take a few days.

The review should answer the following questions:

  • Who provides the API?
  • What data/service is provided by the API?
  • What are the prerequisites to access the API (authentication)?
  • What does a simple API call look like?
  • How can we access the API from R (httr GET/POST + other packages)?
  • Are there any social science research examples that use the API?
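
To illustrate the questions about a simple API call and access from R, a minimal sketch follows. The endpoint https://api.example.org/v1/search, the parameter names q and api-key, and the environment variable MY_API_KEY are all hypothetical placeholders; a real API will document its own. The helper names build_api_url and call_api are likewise made up for this example.

```r
# Sketch of a simple API call; endpoint and parameter names are hypothetical.

# Build the request URL from a base URL and a named list of query parameters
build_api_url <- function(base_url, query) {
  values <- vapply(
    query,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste0(names(query), "=", values, collapse = "&"))
}

# Send a GET request and parse the JSON body (needs httr and jsonlite)
call_api <- function(base_url, query) {
  resp <- httr::GET(build_api_url(base_url, query))
  httr::stop_for_status(resp)  # stop with an error on HTTP 4xx/5xx
  jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Read the key from an environment variable instead of hard-coding it;
# MY_API_KEY is a placeholder name
api_key <- Sys.getenv("MY_API_KEY")

# Inspect the URL that would be requested (no network access needed)
url <- build_api_url("https://api.example.org/v1/search",
                     list(q = "climate change", `api-key` = "demo"))
print(url)

# A real call would then look like this (not run):
# result <- call_api("https://api.example.org/v1/search",
#                    list(q = "climate change", `api-key` = api_key))
```

Printing the URL before sending it is a cheap way to check that the query string is what the API documentation expects.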

“Nyhuis 2021: 4.a more complex example: collecting data from twitter” is one example of such a review.