9.2 Lab: Using Google ML APIs

In this lab we’ll use Twitter data to illustrate the use of different Google ML APIs (translation, sentiment coding, syntax analysis, image analysis). The lab therefore provides a quick overview rather than a deep dive into the individual APIs.

9.2.1 Software

  • googleLanguageR: Interact with the Google Natural Language API
    • See the vignette for an overview
    • Google Natural Language API
      • Entity analysis (i.e., finds named entities, types, salience, mentions + properties, metadata)
      • Syntax (i.e., syntax analysis, e.g., identify nouns)
      • Sentiment (i.e., provides sentiment scores)
      • Content Classification (i.e., content classification into categories)
    • Google Cloud Speech-to-Text API
    • Google Cloud Text-to-Speech API
    • Google Cloud Translation API
  • googleCloudVisionR: Interact with the Google Vision API
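Before starting, it can help to check that both packages are available. The following is a minimal sketch that only inspects the local library and does not install or attach anything:

```r
# Packages used in this lab.
pkgs <- c("googleLanguageR", "googleCloudVisionR")

# system.file() returns "" for packages that are not installed,
# so this check does not load any package code.
is_installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("Both packages are installed.")
}
```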

9.2.3 Twitter: Authenticate & load data

To get access to the Twitter API you need to create an app and generate the corresponding API keys on the Twitter developer platform. See the slides and lab on Twitter, starting with X-Twitter’s APIs (we didn’t go through them). Here we’ll merely download a few tweets to explore the Google ML APIs.

  • In case you don’t have a Twitter developer account you can also download the data further below!

We’ll work with tweets by Alice Weidel (AfD) and tweets by Martin Schulz (SPD). The tweets themselves are text data that we can analyze using the Google Natural Language APIs.

If you can’t authenticate with Twitter, download the file data_tweets.RData from the material folder, store it in your working directory and load it into R with the command below:
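The simplest form is `load("data_tweets.RData")`. The sketch below wraps that call in a small helper (`load_tweets` is a hypothetical name, not part of the lab material) that loads the file into a separate environment, so you can see which objects it contains before using them:

```r
# Load an .RData file into its own environment and return its objects
# as a named list, without overwriting anything in the global workspace.
load_tweets <- function(path = "data_tweets.RData") {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory: ", getwd(), ")")
  }
  env <- new.env()
  loaded <- load(path, envir = env)  # load() returns the names of the restored objects
  mget(loaded, envir = env)
}

# Usage, once the file is in the working directory:
# tweets <- load_tweets("data_tweets.RData")
# names(tweets)
```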

9.2.4 Google: Authenticate

Remember the instructions for setting up your Google research credits and Google API access.

Fill in the quotation marks with the path to the directory where the created JSON file is located and read in the JSON file (gl_auth).
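A minimal sketch of this step, assuming a service-account key file; the path `path/to/your-key.json` is a placeholder, not a real location. The call is guarded so that the code does nothing if the package or the file is missing:

```r
# Placeholder path: replace with the location of your own JSON key file.
key_file <- "path/to/your-key.json"

if (requireNamespace("googleLanguageR", quietly = TRUE) && file.exists(key_file)) {
  # gl_auth() reads the service-account JSON file and authenticates the session.
  googleLanguageR::gl_auth(key_file)
} else {
  message("googleLanguageR or the key file was not found; skipping authentication.")
}
```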

9.2.5 Translation API

We can use the Cloud Translation API to translate the tweets from German to English. Other languages can of course be chosen as well: check the language codes under the following link and replace the language code in the target argument: https://developers.google.com/admin-sdk/directory/v1/languages.
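As a sketch of how the translation call could look: `gl_translate()` from googleLanguageR takes a character vector and a `target` language code. The two German strings and the wrapper name `translate_tweets` are illustrative assumptions. The API call itself is left commented out because it needs an authenticated session.

```r
# Toy German texts standing in for the text column of the tweet data.
tweets_de <- c("Guten Morgen!", "Wir brauchen mehr Investitionen in Bildung.")

# Count the characters first to get a feel for how much text is sent to the API.
n_chars <- sum(nchar(tweets_de))

# Thin wrapper: change `target` to translate into a different language.
translate_tweets <- function(text, target = "en") {
  googleLanguageR::gl_translate(text, target = target)
}

# Requires authentication (see the previous section):
# translated <- translate_tweets(tweets_de)
# translated$translatedText
```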

9.2.7 NLP API: Syntax

The Google NLP API also allows for analyzing syntax. This is extremely helpful as sometimes we may want to isolate certain parts of a sentence. Below we extract the nouns and subsequently plot them using a wordcloud.
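The noun extraction can be sketched on a toy token table. With the real API, a table of this kind would come from `googleLanguageR::gl_nlp(text, nlp_type = "analyzeSyntax", language = "de")`, whose `tokens` element holds one data frame per input text. The columns `content` and `tag` used here are assumed to match that output, and the tokens themselves are made up for illustration.

```r
# Toy token table mimicking one element of gl_nlp(...)$tokens.
tokens <- data.frame(
  content = c("Die", "Regierung", "plant", "neue", "Steuern", "und", "Steuern"),
  tag     = c("DET", "NOUN", "VERB", "ADJ", "NOUN", "CONJ", "NOUN"),
  stringsAsFactors = FALSE
)

# Keep only the tokens tagged as nouns.
extract_nouns <- function(tokens) tokens$content[tokens$tag == "NOUN"]

nouns <- extract_nouns(tokens)

# Frequency table, most frequent noun first; this is the input a word cloud needs.
noun_freq <- sort(table(nouns), decreasing = TRUE)

# If the wordcloud package is installed, the plot could be drawn with:
# wordcloud::wordcloud(names(noun_freq), as.numeric(noun_freq), min.freq = 1)
```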

9.2.8 Analyzing images

In addition, tweets may contain images that we can analyze using the Google Vision API. A tweet may or may not come with an image (and the corresponding link).

How many of the tweets in the data (from the two politicians) contain images?

Who uses more images?
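Both questions can be answered by flagging tweets that have an image link and counting per account. The sketch below uses a toy data frame; the column names `screen_name` and `media_url` and the account labels are assumptions, so adapt them to the actual tweet data.

```r
# Toy tweet data: media_url is NA when a tweet has no image.
tweets <- data.frame(
  screen_name = c("account_a", "account_a", "account_b", "account_b", "account_b"),
  media_url   = c("https://example.com/a.jpg", NA, NA,
                  "https://example.com/b.jpg", "https://example.com/c.jpg"),
  stringsAsFactors = FALSE
)

# Flag tweets that come with an image link.
tweets$has_image <- !is.na(tweets$media_url)

# How many tweets contain images overall?
n_with_image <- sum(tweets$has_image)

# Who uses more images? Count and share of tweets with images per account.
images_by_account <- tapply(tweets$has_image, tweets$screen_name, sum)
share_by_account  <- tapply(tweets$has_image, tweets$screen_name, mean)
```

Comparing shares rather than raw counts is useful when the two accounts have a different number of tweets in the data.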

Below we load those images and store them in a directory (we should always store the data we analyze locally).
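A minimal sketch of the download step using only base R. The helper names `image_paths` and `download_images` and the directory name `images` are illustrative assumptions. Files that already exist are skipped, and failed downloads are reported rather than stopping the loop.

```r
# Build local file paths from image URLs (file name = last part of the URL).
image_paths <- function(urls, dir = "images") file.path(dir, basename(urls))

# Download every image that is not yet stored locally.
download_images <- function(urls, dir = "images") {
  urls <- unique(urls[!is.na(urls)])  # drop tweets without an image
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  dest <- image_paths(urls, dir)
  for (i in seq_along(urls)) {
    if (file.exists(dest[i])) next
    tryCatch(
      download.file(urls[i], dest[i], mode = "wb", quiet = TRUE),
      error   = function(e) message("Failed: ", urls[i]),
      warning = function(w) message("Problem with: ", urls[i])
    )
  }
  dest
}

# Usage (needs an internet connection):
# local_files <- download_images(tweets$media_url)
```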

Next we try to recognize text in the images (we do so directly via the image URLs, without using the downloaded files):
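One way to sketch this step uses `gcv_get_image_annotations()` from googleCloudVisionR with `feature = "TEXT_DETECTION"`. The helper names below are hypothetical, and the API call is left commented out because it needs valid Google credentials.

```r
# Keep each image URL once and drop tweets without an image.
clean_urls <- function(urls) unique(urls[!is.na(urls)])

# Send the image URLs to the Vision API and ask for text detection.
detect_text <- function(image_urls) {
  googleCloudVisionR::gcv_get_image_annotations(
    imagePaths = clean_urls(image_urls),
    feature    = "TEXT_DETECTION"
  )
}

# Usage (requires authentication with the Vision API):
# ocr_words <- detect_text(tweets$media_url)
# head(ocr_words)
```

Removing duplicates before the call avoids sending the same image twice.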

Then we merge the list of words for every image into sentences.
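Merging the words can be sketched with `aggregate()`, which pastes all words that belong to the same image into one string. The toy table below assumes one row per detected word with the columns `image_path` and `description`; check these names against the actual API output.

```r
# Toy OCR result: one row per detected word.
ocr_words <- data.frame(
  image_path  = c("a.jpg", "a.jpg", "a.jpg", "b.jpg", "b.jpg"),
  description = c("Mehr", "Sicherheit", "jetzt", "Unser", "Europa"),
  stringsAsFactors = FALSE
)

# Paste the words of each image together, separated by spaces.
image_text <- aggregate(description ~ image_path, data = ocr_words,
                        FUN = paste, collapse = " ")
names(image_text)[2] <- "image_text"
```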

And we add it to the original tweet dataset:
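Adding the image text to the tweets amounts to a left join on the image link. A minimal sketch with base R's `merge()`; the key columns `media_url` and `image_path` and the toy values are assumptions:

```r
# Toy tweet data and toy per-image text.
tweets <- data.frame(
  status_id = c("1", "2", "3"),
  media_url = c("a.jpg", NA, "b.jpg"),
  stringsAsFactors = FALSE
)
image_text <- data.frame(
  image_path = c("a.jpg", "b.jpg"),
  image_text = c("Mehr Sicherheit jetzt", "Unser Europa"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps tweets without an image; their image_text is NA.
tweets_ocr <- merge(tweets, image_text,
                    by.x = "media_url", by.y = "image_path", all.x = TRUE)
```

Note that `merge()` sorts the result by the key column, so the row order differs from the original tweet data.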

We also try to recognize objects in those images:
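Object recognition can be sketched with the same function and `feature = "LABEL_DETECTION"`, followed by a filter on the confidence score. The toy table and its columns (`image_path`, `description`, `score`) are assumed to resemble the API output, and the threshold of 0.8 is an arbitrary illustration.

```r
# With credentials, the labels could be requested like this:
# labels <- googleCloudVisionR::gcv_get_image_annotations(
#   imagePaths = image_urls, feature = "LABEL_DETECTION")

# Toy label table standing in for the API result.
labels <- data.frame(
  image_path  = c("a.jpg", "a.jpg", "b.jpg"),
  description = c("Crowd", "Flag", "Podium"),
  score       = c(0.95, 0.62, 0.88),
  stringsAsFactors = FALSE
)

# Keep only labels the model is reasonably confident about.
keep_confident <- function(labels, min_score = 0.8) {
  labels[labels$score >= min_score, , drop = FALSE]
}

confident <- keep_confident(labels)
```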

Then we turn the list of objects into a string variable:
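A sketch of this step: `split()` groups the labels by image and `toString()` turns each group into one comma-separated string. The toy labels and column names are illustrative assumptions.

```r
# Toy label table: one row per recognized object.
labels <- data.frame(
  image_path  = c("a.jpg", "a.jpg", "b.jpg"),
  description = c("Crowd", "Flag", "Podium"),
  stringsAsFactors = FALSE
)

# List of objects per image.
labels_by_image <- split(labels$description, labels$image_path)

# One row per image with all objects in a single string variable.
image_objects <- data.frame(
  image_path = names(labels_by_image),
  objects    = vapply(labels_by_image, toString, character(1)),
  row.names  = NULL,
  stringsAsFactors = FALSE
)
```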

Then we join the scraped data with the original tweet data:

9.2.9 References