Chapter 9 Getting data from APIs
9.1 Overview
This session is all about getting data! In this practical session you will develop a crime analysis using data from the UK Police website / API, and then use the Nomis census API to link the crime data to socio-economic variables.
You should note a few things for the crime data:
- the crime data come from https://data.police.uk/data/ in monthly chunks;
- functions are used to extract the location and crime type from the data, which are retrieved using the httr and jsonlite packages;
- you will extract data on a particular kind of crime.
You should note a few other things for the population Census API:
- Nomis is the official population census site: https://www.nomisweb.co.uk. You should have a look if you are not familiar.
- Census data are available for different years, and over different geographies
- importantly census data are summarised in different ways
- here, the nomisr package is used to access the Nomis web API
- variables are extracted to support the creation of a Townsend measure of deprivation
The final section brings the crime and deprivation index together, and in so doing the practical suggests how data from different sources can be combined, to develop geocomputational models of socio-economic processes for example.
9.2 Packages and Data
You will need to load the following packages for this practical. Some may need installing but you are experienced at this now.
packages <- c("httr", "jsonlite","sf", "tmap", "tidyverse")
# check which packages are not installed
not_installed <- packages[!packages %in% installed.packages()[, "Package"]]
# install missing packages
if (length(not_installed) > 0) {
  install.packages(not_installed, repos = "https://cran.rstudio.com/", dependencies = TRUE)
}
# load packages
library(httr)
library(jsonlite)
library(sf)
library(tmap)
library(tidyverse)
select = dplyr::select # to force the dplyr `select` function
The nomisr package is slightly tricky but can be installed with the remotes package. It is probably best if you do not update the packages if prompted:
You will need the following data in your working directory from the VLE: leeds_lsoa.gpkg.
9.3 The Police API
9.3.1 Getting and mapping crime data
The first thing is to download some data from the police API. The code below downloads data for August 2025 for an area around the location with latitude 53.7997 and longitude -1.5492, that is 53.7997 North and 1.5492 West (do you know where this is?).
# specify the url - the web address
url = paste0("http://data.police.uk/api/crimes-street/all-crime",
"?lat=53.7997",
"&lng=-1.5492",
"&date=2025-08")
# use the GET function to "get" the url response object
# (see the help for GET)
x = GET(url)
# finally extract and assign to a data table
crimes <- as_tibble(
fromJSON(httr::content(x, as = "text", encoding = "utf8"),
flatten = T
)
)
Now, before we investigate what has been downloaded, have a look at the web address in url:
To understand what is going on here, have a look at how the call to the Police API is formed for street crime at https://data.police.uk/docs/method/crime-street/. In the url object above, notice the use of the ‘?’ and the ‘&’ to construct the query with lat, lng and date.
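The way the query is put together can be made explicit with a small helper. This is only a minimal sketch in base R: the function name build_crime_url and its arguments are assumptions made for illustration, and are not part of the Police API or of any package.

```r
# Hypothetical helper (not from the practical): builds a street-crime query URL
build_crime_url <- function(lat, lng, date,
                            base = "https://data.police.uk/api/crimes-street/all-crime") {
  # '?' starts the query string; '&' separates the name=value pairs
  paste0(base, "?lat=", lat, "&lng=", lng, "&date=", date)
}

# the same location and month as in the url object above
url_example <- build_crime_url(lat = 53.7997, lng = -1.5492, date = "2025-08")
url_example
```

A helper like this makes it easy to change one part of the query, such as the month, without retyping the rest.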
We can examine the crimes object:
## # A tibble: 1,446 × 13
## category location_type context persistent_id id location_subtype month
## <chr> <chr> <chr> <chr> <int> <chr> <chr>
## 1 anti-socia… Force "" "" 1.31e8 "" 2025…
## 2 anti-socia… Force "" "" 1.31e8 "" 2025…
## 3 anti-socia… Force "" "" 1.31e8 "" 2025…
## 4 anti-socia… Force "" "" 1.31e8 "" 2025…
## 5 anti-socia… Force "" "" 1.31e8 "" 2025…
## 6 anti-socia… Force "" "" 1.31e8 "" 2025…
## 7 anti-socia… Force "" "" 1.31e8 "" 2025…
## 8 anti-socia… Force "" "" 1.31e8 "" 2025…
## 9 anti-socia… Force "" "" 1.31e8 "" 2025…
## 10 anti-socia… Force "" "" 1.31e8 "" 2025…
## # ℹ 1,436 more rows
## # ℹ 6 more variables: location.latitude <chr>, location.longitude <chr>,
## # location.street.id <int>, location.street.name <chr>,
## # outcome_status.category <chr>, outcome_status.date <chr>
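The dotted column names such as location.latitude appear because the JSON returned by the API is nested, and the flatten argument of fromJSON un-nests it into ordinary columns. The sketch below illustrates this with a made-up JSON string (the records are not real crimes, and only a few of the API fields are mimicked).

```r
library(jsonlite)

# Made-up JSON with the same kind of nesting as the API response
json_txt <- '[
  {"category": "burglary", "month": "2025-08",
   "location": {"latitude": "53.80", "longitude": "-1.55",
                "street": {"id": 1, "name": "On or near Street A"}}},
  {"category": "drugs", "month": "2025-08",
   "location": {"latitude": "53.81", "longitude": "-1.54",
                "street": {"id": 2, "name": "On or near Street B"}}}
]'

# flatten = TRUE turns the nested location fields into separate columns
toy <- fromJSON(json_txt, flatten = TRUE)
names(toy)
```

Note that the coordinates arrive as character strings, which is why they need converting to numbers before they can be mapped.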
The data is in tibble format with a number of attributes:
The two coordinate attributes, location.latitude and location.longitude, are of particular interest here. The functions below extract the coordinates and the attributes and render them into a flat data table format:
# 1. Get location
getLonLat <- function(x) {
df = data.frame(lon = as.numeric(x$location.longitude),
lat = as.numeric(x$location.latitude))
# return the dataframe
return(df)
}
# 2. Get attributes
getAttr <- function(x) {
df = data.frame(
category = x$category,
street_name = x$location.street.name,
location_type = x$location_type,
month = substr(x$month, 6, 7),
year = substr(x$month, 1,4))
# return the data.frame
return(df)
}
The workings of these functions can be investigated, for example by assigning crimes to x and running lines of the code:
Then they can be applied to the crimes object:
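As a rough illustration, the two functions can be tried on a small made-up data frame that has the same column names as the API result. The records below are invented, and the functions are repeated here only so that the example runs on its own.

```r
# Copies of the two functions defined above, so that the block is self-contained
getLonLat <- function(x) {
  data.frame(lon = as.numeric(x$location.longitude),
             lat = as.numeric(x$location.latitude))
}
getAttr <- function(x) {
  data.frame(category = x$category,
             street_name = x$location.street.name,
             location_type = x$location_type,
             month = substr(x$month, 6, 7),
             year = substr(x$month, 1, 4))
}

# Made-up records (not real crimes) with the column names returned by the API
toy <- data.frame(
  category = c("burglary", "bicycle-theft"),
  location_type = c("Force", "Force"),
  month = c("2025-08", "2025-08"),
  location.latitude = c("53.80", "53.81"),
  location.longitude = c("-1.55", "-1.54"),
  location.street.name = c("On or near Street A", "On or near Street B")
)

toy.loc <- getLonLat(toy)   # numeric lon / lat
toy.attr <- getAttr(toy)    # category, street, type, month, year
toy.loc
toy.attr
```

With the downloaded data the same calls would be made on crimes, for example crimes.loc <- getLonLat(crimes) and crimes.attr <- getAttr(crimes), which are the object names used by the mapping code in this section.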
Finally a function is defined that uses the location information and the attributes to create a spatial object in sf format:
# join together and make a spatial (sf) object
makeSpatial = function(crimes.loc, crimes.attr){
# create a data frame
df = data.frame(longitude = crimes.loc[,1],
latitude = crimes.loc[,2],
crimes.attr)
# convert to sf
df_sf = st_as_sf(df, coords = c("longitude", "latitude"),
crs = 4326, agr = "constant")
# return the sf
return(df_sf)
}
# and apply
crimes_sf = makeSpatial(crimes.loc, crimes.attr)
It is possible to see the counts of different crime types using the table function:
##
## anti-social-behaviour bicycle-theft burglary
## 111 34 47
## criminal-damage-arson drugs other-crime
## 60 72 17
## other-theft possession-of-weapons public-order
## 116 10 123
## robbery shoplifting theft-from-the-person
## 39 301 43
## vehicle-crime violent-crime
## 63 410
And using the code below you can extract a crime type and plot its locations on a map. Note the use of the alpha parameter so that crime densities are shown:
## ℹ tmap mode set to "view".
## ℹ tmap mode set to "plot".
9.3.2 Getting and mapping more crime data
In the above example crime data has just been obtained for a single month. It is very easy to get data for a longer period, a year for example.
This is done by putting the operations above into a loop. The code below extends the single month to a year (months 1 to 12 in the loop, for 2024) simply by passing a different date in the URL given to the GET function, and appends each result to the results tables. The results are later subset to crimes in the bicycle-theft category.
Note the while loop in the middle, which tests for a server-side error (status code 500). Such an error would otherwise break the for loop. If one occurs, the while loop repeatedly queries the server until the data are returned.
# create empty vectors for the results
# these will convert to a data.table in the first iteration
# of the loop and then are subsequently added to
crimes.loc.tab = vector()
crimes.attr.tab = vector()
for (i in 1:12) {
# create the date in YYYY-MM format (zero-padded month)
date.i <- sprintf("2024-%02d", i)
# pass to the API
url.i <- paste0("http://data.police.uk/api/crimes-street/all-crime",
"?lat=53.7997",
"&lng=-1.5492",
"&date=", date.i)
x.i = GET(url.i)
while (x.i$status_code == 500) {
x.i = GET(url.i)
}
crimes.i <- as_tibble(
fromJSON(httr::content(x.i, as = "text", encoding = "utf8"),
flatten = T
)
)
# add the result to the results
crimes.loc.tab <- rbind(crimes.loc.tab, getLonLat(crimes.i))
crimes.attr.tab <- rbind(crimes.attr.tab, getAttr(crimes.i))
# print out a little indicator of progress
cat("downloaded month", i, "\n")
}
## downloaded month 1
## downloaded month 2
## downloaded month 3
## downloaded month 4
## downloaded month 5
## downloaded month 6
## downloaded month 7
## downloaded month 8
## downloaded month 9
## downloaded month 10
## downloaded month 11
## downloaded month 12
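The while loop above keeps asking until the server stops returning status 500, so it would never finish if the server stayed down. A bounded variant is sketched below. The helper name get_with_retry and its arguments are assumptions made for illustration, and a stand-in function replaces GET so that the example runs without a network connection.

```r
# Hypothetical helper: call `fetch` until the status is not 500,
# giving up after `max_tries` attempts
get_with_retry <- function(fetch, max_tries = 5, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    resp <- fetch()
    if (resp$status_code != 500) return(resp)
    Sys.sleep(wait)  # pause before trying again
  }
  stop("server still returning status 500 after ", max_tries, " attempts")
}

# Stand-in for GET(url.i): fails twice, then succeeds (no network needed)
n_calls <- 0
fake_fetch <- function() {
  n_calls <<- n_calls + 1
  list(status_code = if (n_calls < 3) 500 else 200)
}

resp <- get_with_retry(fake_fetch, wait = 0)
resp$status_code
```

In the loop above this might be used as x.i <- get_with_retry(function() GET(url.i)). The httr package also provides a RETRY function that could be explored for the same purpose.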
Then you can have a look at the data, convert to an sf spatial object and map the results:
## lon lat
## 1 -1.568211 53.80497
## 2 -1.549269 53.80708
## 3 -1.539266 53.80669
## 4 -1.536202 53.79755
## 5 -1.538529 53.80467
## 6 -1.549222 53.79872
crimes_sf_2024 = makeSpatial(crimes.loc.tab, crimes.attr.tab)
bike_nickers.pts <- crimes_sf_2024[crimes_sf_2024$category == "bicycle-theft", ]
tmap_mode("view")
## ℹ tmap mode set to "view".
## ℹ tmap mode set to "plot".
9.3.3 Getting and mapping lots of crime data
The above extension got data from the API for a longer time period and grouped the results to show patterns for the year. However this was for an area approximately a mile around a single location. It is possible to further extend this spatially by defining a bounding box or a polygon to get the data. The code below reads in some LSOA data for Leeds and then extracts some contiguous LSOAs as an area to investigate:
# read in a Leeds LSOA
leeds = st_read("leeds_lsoa.gpkg", quiet = T)
# transform to lat / lon - WGS84
leeds = st_transform(leeds, 4326)
# set up list of LSOAs
codes = c("E01011351", "E01011352", "E01011353", "E01011356", "E01011359")
# extract from leeds
poly.temp <- leeds %>% filter(code %in% codes)
# have a look
tmap_mode("view")
## ℹ tmap mode set to "view".
## ℹ tmap mode set to "plot".
We can now use these polygons as the area for which to extract crimes, by slightly changing the arguments passed to the API call. First, the coordinates of the bounding box for this area are extracted:
## xmin ymin xmax ymax
## -1.542430 53.816416 -1.515997 53.835344
These are used to create the sequence of coordinates describing the box, to be passed to the API. Notice how the first coordinate pair is repeated at the end to close the box:
X = round(c(bb[1], bb[1], bb[3], bb[3], bb[1]), 3)
Y = round(c(bb[2], bb[4], bb[4], bb[2], bb[2]), 3)
poly_paste <- paste(paste(Y, X, sep = ","), collapse = ":")
poly_paste
## [1] "53.816,-1.542:53.835,-1.542:53.835,-1.516:53.816,-1.516:53.816,-1.542"
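The construction of this string can be followed step by step without any spatial data. The sketch below copies the bounding box values from the printed output above. In the practical, bb would presumably come from the bounding box of poly.temp (for example via st_bbox), but that call is not shown, so it is an assumption.

```r
# Bounding box values copied from the output above
bb <- c(xmin = -1.542430, ymin = 53.816416, xmax = -1.515997, ymax = 53.835344)

# five corners, the first repeated at the end to close the ring
X <- round(unname(bb[c("xmin", "xmin", "xmax", "xmax", "xmin")]), 3)
Y <- round(unname(bb[c("ymin", "ymax", "ymax", "ymin", "ymin")]), 3)

# the API expects lat,lng pairs separated by colons
poly_string <- paste(paste(Y, X, sep = ","), collapse = ":")
poly_string
```

Indexing the bounding box by name rather than by position makes it easier to see that each pair is written as latitude first, then longitude.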
Finally these can be passed to the API using the poly= parameter:
url = paste0("https://data.police.uk/api/crimes-street/all-crime?poly=",
poly_paste,
"&date=2025-08")
x = GET(url)
crimes <- as_tibble(
fromJSON(httr::content(x, as = "text", encoding = "utf8"),
flatten = T
)
)
crimes
## # A tibble: 220 × 13
## category location_type context persistent_id id location_subtype month
## <chr> <chr> <chr> <chr> <int> <chr> <chr>
## 1 anti-socia… Force "" "" 1.31e8 "" 2025…
## 2 anti-socia… Force "" "" 1.31e8 "" 2025…
## 3 anti-socia… Force "" "" 1.31e8 "" 2025…
## 4 anti-socia… Force "" "" 1.31e8 "" 2025…
## 5 anti-socia… Force "" "" 1.31e8 "" 2025…
## 6 anti-socia… Force "" "" 1.31e8 "" 2025…
## 7 anti-socia… Force "" "" 1.31e8 "" 2025…
## 8 anti-socia… Force "" "" 1.31e8 "" 2025…
## 9 anti-socia… Force "" "" 1.31e8 "" 2025…
## 10 anti-socia… Force "" "" 1.31e8 "" 2025…
## # ℹ 210 more rows
## # ℹ 6 more variables: location.latitude <chr>, location.longitude <chr>,
## # location.street.id <int>, location.street.name <chr>,
## # outcome_status.category <chr>, outcome_status.date <chr>
Again you should have a look at the url and even use the BROWSE function to explore it as before. And you can create a spatial object and map it:
# create the spatial object
crimes.loc <- getLonLat(crimes)
crimes.attr <- getAttr(crimes)
crimes_sf = makeSpatial(crimes.loc, crimes.attr)
# and map
tmap_mode("view")
## ℹ tmap mode set to "view".
## ℹ tmap mode set to "plot".
Notice how the bounding box pulls in data from outside the LSOA polygons. The data can be subset to the polygons:
## ℹ tmap mode set to "view".
## ℹ tmap mode set to "plot".
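The subsetting code is not shown here. One common way to do this with sf is to index the points by the polygon layer using square brackets, which keeps only the features that intersect it. The sketch below uses a made-up square and two made-up points, standing in for poly.temp and crimes_sf.

```r
library(sf)

# Toy polygon standing in for poly.temp (a small square, not a real LSOA)
ring <- rbind(c(-1.54, 53.82), c(-1.53, 53.82), c(-1.53, 53.83),
              c(-1.54, 53.83), c(-1.54, 53.82))
toy_poly <- st_sf(id = 1, geometry = st_sfc(st_polygon(list(ring)), crs = 4326))

# Toy points standing in for crimes_sf: the first is inside, the second outside
toy_pts <- st_as_sf(data.frame(lon = c(-1.535, -1.50),
                               lat = c(53.825, 53.80),
                               category = c("burglary", "drugs")),
                    coords = c("lon", "lat"), crs = 4326)

# keep only the points that intersect the polygon
toy_inside <- toy_pts[toy_poly, ]
nrow(toy_inside)
```

With the real data the equivalent would be along the lines of crimes_sf[poly.temp, ], provided both layers are in the same coordinate reference system.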
9.3.4 Getting and mapping more crime data
Now there are limits to what can be passed to the API in terms of the complexity of the polygon definition. So the code below creates a 5km grid over the area of Leeds and passes each grid cell in turn to the API, using yet another for loop.
# transform back to OSGB
leeds = st_transform(leeds, 27700)
# create a grid: 1. the geometry
gr_geom = st_make_grid(leeds, cellsize = 5000)
# 2. a data frame of IDs
gr = data.frame(ID = 1:length(gr_geom))
# 3. apply the geometry to the data.frame to make an sf object
st_geometry(gr) = gr_geom
You could examine this:
## ℹ tmap mode set to "view".
## ℹ tmap mode set to "plot".
The function below extracts the coordinates from each grid cell and formats them so that they can be passed to the API with a poly call.
get_poly_coords = function(x){
# transform to lat lon
x = st_transform(x, 4326)
# extract coordinates
coords = data.frame(st_coordinates(x)[, c("X", "Y")])
poly_paste <- paste(paste(coords$Y, coords$X, sep = ","), collapse = ":")
return(poly_paste)
}
To test this and to show what the above function is doing, examine this for a single grid cell:
## [1] "53.699629746901,-1.80124066773008:53.6994803352637,-1.72550965108028:53.7444201684409,-1.72521713914431:53.7445698244954,-1.80102893467678:53.699629746901,-1.80124066773008"
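The structure of this string can be checked on a made-up cell. The sketch below repeats the function so that it runs on its own, applies it to an invented 5km square in OSGB coordinates (standing in for gr[1,]), and splits the result back into its coordinate pairs.

```r
library(sf)

# Copy of the function defined above, so that the block is self-contained
get_poly_coords = function(x){
  x = st_transform(x, 4326)
  coords = data.frame(st_coordinates(x)[, c("X", "Y")])
  paste(paste(coords$Y, coords$X, sep = ","), collapse = ":")
}

# Toy 5km cell in OSGB coordinates (made up, standing in for gr[1,])
ring <- rbind(c(430000, 430000), c(435000, 430000), c(435000, 435000),
              c(430000, 435000), c(430000, 430000))
toy_cell <- st_sf(ID = 1, geometry = st_sfc(st_polygon(list(ring)), crs = 27700))

toy_coords <- get_poly_coords(toy_cell)

# split the string back into its lat,lng pairs
pairs <- strsplit(toy_coords, ":")[[1]]
length(pairs)
```

A square cell gives five pairs, because the first corner is repeated at the end to close the polygon.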
# URL for leeds area
url=paste0("https://data.police.uk/api/crimes-street/all-crime?poly=",
coords,
"&date=2025-08")
x = GET(url)
crimes <- as_tibble(
fromJSON(httr::content(x, as = "text", encoding = "utf8"),
flatten = T
)
)
# extract data
crimes.loc <- getLonLat(crimes)
crimes.attr <- getAttr(crimes)
crimes_sf = makeSpatial(crimes.loc, crimes.attr)
# and map
tmap_mode("view")
tm_shape(gr[1,])+ tm_borders() +
tm_shape(crimes_sf) + tm_dots()+
tm_basemap('OpenStreetMap')
Finally this can be put into a loop for all the grid cells. Again note the while loop in the middle, which retries the request after a server-side error.
# define some results tables as before
crimes.loc.tab = vector()
crimes.attr.tab = vector()
for(i in 1:nrow(gr)){
coords = get_poly_coords(gr[i,])
url.i=paste0("https://data.police.uk/api/crimes-street/all-crime?",
"poly=", coords,
"&date=2025-08")
x.i = GET(url.i)
while (x.i$status_code == 500) {
x.i = GET(url.i)
}
crimes.i <- as_tibble(
fromJSON(httr::content(x.i, as = "text", encoding = "utf8"),
flatten = T)
)
crimes.loc.tab <- rbind(crimes.loc.tab, getLonLat(crimes.i))
crimes.attr.tab <- rbind(crimes.attr.tab, getAttr(crimes.i))
# print out a little indicator of progress
cat("grid cell", i, "done \n")
}
## grid cell 1 done
## grid cell 2 done
## grid cell 3 done
## grid cell 4 done
## grid cell 5 done
## grid cell 6 done
## grid cell 7 done
## grid cell 8 done
## grid cell 9 done
## grid cell 10 done
## grid cell 11 done
## grid cell 12 done
## grid cell 13 done
## grid cell 14 done
## grid cell 15 done
## grid cell 16 done
## grid cell 17 done
## grid cell 18 done
## grid cell 19 done
## grid cell 20 done
## grid cell 21 done
## grid cell 22 done
## grid cell 23 done
## grid cell 24 done
## grid cell 25 done
## grid cell 26 done
## grid cell 27 done
## grid cell 28 done
## grid cell 29 done
## grid cell 30 done
## grid cell 31 done
## grid cell 32 done
## grid cell 33 done
## grid cell 34 done
## grid cell 35 done
## grid cell 36 done
## grid cell 37 done
## grid cell 38 done
## grid cell 39 done
## grid cell 40 done
## grid cell 41 done
## grid cell 42 done
And as ever the results can be converted to an sf spatial object and mapped:
## ℹ tmap mode set to "view".
## Registered S3 method overwritten by 'jsonify':
## method from
## print.json jsonlite