2.2 APIs: Clean and Curated

An application programming interface (API) is a set of functions and procedures that allows one computer program to interact with another. To simplify the concept considerably, we will consider only web-APIs, where there is a server called a host (a computer waiting to provide data) and a client (a computer making a request for data).

The benefit of APIs is the result: clean and curated data from the host. The pre-processing needed to get the data in a workable form is usually already done on the server side. We, however, are responsible for making the request. Web-APIs often utilize JavaScript Object Notation (JSON), another example of non-rectangular data. We will utilize the httr and the jsonlite packages to retrieve the latest sports lines from Bovada, an online sportsbook.
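To make "non-rectangular" concrete, here is a minimal sketch that parses a small JSON string with jsonlite. The JSON text is made up for illustration and is not real Bovada output; the field names (events, id, markets) are assumptions.

```r
library(jsonlite)

# A small, invented JSON document: each event lists a different number
# of markets, so the data does not fit a plain rows-by-columns table.
json_txt <- '{
  "events": [
    {"id": 1, "markets": ["moneyline", "spread"]},
    {"id": 2, "markets": ["moneyline"]}
  ]
}'

parsed <- jsonlite::fromJSON(json_txt)

# The array of events becomes a data frame whose "markets" column is a list,
# because the entries have different lengths.
str(parsed)
n_markets <- lengths(parsed$events$markets)  # number of markets per event
n_markets
```

The ragged markets field is what makes the structure non-rectangular: it cannot be stored as an ordinary atomic column, so it is kept as a list.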

Before we start, we’ll need to install the httr and jsonlite packages (for example with install.packages()) and load them into our current environment. We will also need the address of the server to which we will send the request.

library(httr, quietly = TRUE)  # quietly means don't print any output when loading the library
library(jsonlite, quietly = TRUE)
# URL for api requests:
bov_nfl_api <- "https://www.bovada.lv/services/sports/event/v2/events/A/description/football/nfl"

To ask for data through a web-API, we will make a GET request with the httr package’s GET() function. After making the request, we can print the returned object to inspect the server’s response.

bov_req <- httr::GET(url = bov_nfl_api)
bov_req
Response [https://www.bovada.lv/services/sports/event/v2/events/A/description/football/nfl]
  Date: 2021-11-12 22:15
  Status: 200
  Content-Type: application/json;charset=utf-8
  Size: 2.21 MB

If the request was successful, then the status of the request will read 200. Status codes in the 400s and 500s indicate a client-side or server-side error, respectively; for the full list of HTTP status codes and their definitions, consult an HTTP status code reference. Since the Content-Type of the response is application/json, the body is JSON text, so we will utilize the jsonlite package to read it. A handy function we will use is fromJSON(), which converts a character vector containing JSON into native R structures such as lists and data frames. So, in order, we will

  1. Extract the content from the server’s response
  2. Convert the content to a character vector, maintaining the JSON structure
  3. Restructure the data into native R structures, using fromJSON().
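content <- bov_req$content                    # 1. raw bytes of the response body
content_char <- rawToChar(content)            # 2. bytes to a single character string
bov_res <- jsonlite::fromJSON(content_char)   # 3. JSON text to native R structures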
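The three steps can also be tried without contacting any server. The sketch below uses a hand-made raw vector as a stand-in for bov_req$content; the JSON text and the names fake_content, fake_char and fake_res are invented for illustration.

```r
library(jsonlite)

# Stand-in for step 1: httr stores a response body as a raw (byte) vector.
# charToRaw() builds such a vector from an invented JSON string.
fake_content <- charToRaw('{"sport": "football", "lines": [1.5, -3, 7]}')
class(fake_content)   # raw bytes, not yet text

# Step 2: convert the bytes to a single character string.
fake_char <- rawToChar(fake_content)

# Step 3: parse the JSON text into native R structures.
fake_res <- jsonlite::fromJSON(fake_char)
fake_res$sport   # a character string
fake_res$lines   # a numeric vector
```

As a possible shortcut, httr::content(bov_req, as = "text", encoding = "UTF-8") should return the body as a character string directly, combining steps 1 and 2.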

The website icanhazdadjoke.com provides an API so you can fetch a random “dad joke”. Following the Bovada example above, use httr::GET() to retrieve a joke, where the url is "https://icanhazdadjoke.com". Unlike the example above, you will need to add another argument to the GET() function: add_headers(Accept = "application/json"), which tells the website to send its response in JSON format. Next, extract the content from the server’s response, convert it to a character string, and use jsonlite::fromJSON() to parse the JSON structure.

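Of course, we could also create a function which takes the server’s response and converts the content to native R structures. We will want the function to stop if the server’s response indicates an HTTP error, which httr::stop_for_status() does for us. We will also want to check that the httr and jsonlite packages are installed. Note that require() only tries to load a package and returns FALSE if it is missing; it does not install anything. Because the function calls everything through httr:: and jsonlite::, it is enough to check with requireNamespace() and stop with a helpful message if a package is missing.

convert_JSON <- function(resp){
  # check that the needed packages are installed
  for (pkg in c("httr", "jsonlite")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("Please install the ", pkg, " package.")
    }
  }
  # stop if the server returned an error
  httr::stop_for_status(resp)
  # return JSON content in native R structures
  return(jsonlite::fromJSON(rawToChar(resp$content)))
}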

Finally, we can get the same output by simply calling the function.

identical(convert_JSON(bov_req), bov_res)
[1] TRUE

Write code to retrieve 100 random dad jokes from icanhazdadjoke.com, and store the result in a list called jokes. Recall from Module 1 that you can do this with a for-loop, or with the lapply function. Next, count how many of the random dad jokes you retrieved contain the word “Why” (with a capital “W”). Here’s a hint to get you started: use grepl("Why", jokes) to produce a logical vector whose elements are TRUE if “Why” occurs in the joke, and FALSE otherwise. Then count the TRUE elements using sum().

Some web-APIs require additional information from us, as outlined in the documentation for the API. In this case, the user needs to provide additional query parameters in the GET request. Thankfully, this functionality is built into the httr package’s GET() function through its query argument. For more information on how to include query parameters, type ?httr::GET into your R console.
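As a rough sketch of how query parameters are attached, the example below builds a request URL with httr::modify_url() without contacting any server. The endpoint and the parameter names (term, limit) are hypothetical; a real API’s documentation determines which parameters it accepts.

```r
library(httr)

# Hypothetical endpoint and parameter names, for illustration only.
base_url <- "https://api.example.com/v1/jokes/search"
params   <- list(term = "cat", limit = 5)

# modify_url() shows how a named list is encoded into the query string
# (the part of the URL after "?"), without sending a request.
full_url <- httr::modify_url(base_url, query = params)
full_url

# An actual request would pass the same list through the query argument.
# It is left commented out because the endpoint above does not exist:
# resp <- httr::GET(url = base_url, query = params)
```

Passing the parameters as a named list, rather than pasting them onto the URL by hand, lets httr handle the encoding of special characters.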