2.2 APIs: Clean and Curated
An application programming interface (API) is a set of functions and procedures that allows one computer program to interact with another. To simplify the concept considerably, we will consider only web-APIs, where a server, called the host (a computer waiting to provide data), responds to a client (a computer making a request for data).
The benefit of APIs is the result: clean and curated data from the host. The pre-processing needed to get the data into a workable form is usually already done on the server side. We, however, are responsible for making the request. Web-APIs often use JavaScript Object Notation (JSON), another example of non-rectangular data. We will use the httr and jsonlite packages to retrieve the latest sports lines from Bovada, an online sportsbook.
Before we start, we’ll need to install the httr and jsonlite packages (for example with install.packages()) and load them into our current environment. Furthermore, we will need the address of the server to which we will send the request.
library(httr, quietly = TRUE) # quietly means don't print any output when loading the library
library(jsonlite, quietly = TRUE)
# URL for api requests:
bov_nfl_api <- "https://www.bovada.lv/services/sports/event/v2/events/A/description/football/nfl"
To ask for data through a web-API, we will need to make a GET request with the httr package’s GET() function. After making the request, we can read about the server’s response.
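The call that produces the response is not shown in the text. A minimal sketch follows, assuming the response is stored under the name bov_req (the name used later in this section). The timeout() and tryCatch() wrappers are additions for this sketch only, so that an unreachable server yields NULL rather than an error.

```r
library(httr)

# Address of the server, as given above
bov_nfl_api <- "https://www.bovada.lv/services/sports/event/v2/events/A/description/football/nfl"

# Send the GET request. timeout() and tryCatch() are additions for this sketch:
# if the server cannot be reached, bov_req is NULL instead of an error.
bov_req <- tryCatch(
  GET(bov_nfl_api, timeout(10)),
  error = function(e) NULL
)

# Printing the response shows its URL, date, status, content type, and size
print(bov_req)

# The status code can also be read on its own
if (!is.null(bov_req)) status_code(bov_req)
```

In an interactive session with a working connection, the plain call GET(bov_nfl_api) is enough.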
Response [https://www.bovada.lv/services/sports/event/v2/events/A/description/football/nfl]
Date: 2021-11-12 22:15
Status: 200
Content-Type: application/json;charset=utf-8
Size: 2.21 MB
If the request was successful, the status will read 200. Codes in the 400s and 500s indicate an error on the client side or the server side, respectively. For a list of HTTP status codes and their respective definitions, follow this link. Since the response reports a content type of application/json, we will use the jsonlite package to read the JSON-structured data. A handy function is fromJSON(), which converts a character vector containing JSON into native R structures such as lists. So, in order, we will
- Extract the content from the server’s response
- Convert the content to a character vector, maintaining the JSON structure
- Restructure the data into native R structures, using fromJSON().
content <- bov_req$content
content_char <- rawToChar(content)
bov_res <- jsonlite::fromJSON(content_char)
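To see what each of these steps does without contacting a server, the same conversion can be tried on a small hand-made JSON string. The JSON below is invented for illustration and is not Bovada's actual format; charToRaw() is used only to mimic the raw bytes that a response stores in its content element.

```r
library(jsonlite)

# A small hand-made JSON string standing in for a server's response body
json_text <- '{"sport": "football", "teams": ["Home", "Away"], "odds": {"home": -110, "away": 100}}'

# Responses store their body as raw bytes; charToRaw() mimics that here
raw_body <- charToRaw(json_text)

# Convert the raw bytes to a character string, keeping the JSON structure
body_char <- rawToChar(raw_body)

# Restructure the JSON text into native R structures
parsed <- fromJSON(body_char)

# Inspect the result: a named list holding a string, a character vector,
# and a nested list
str(parsed)
```

Nested JSON objects become nested lists, which is why the result is not rectangular.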
The website icanhazdadjoke.com provides an API so you can fetch a random “dad joke”. Following the Bovada example above, use httr::GET to retrieve a joke, where the url is "https://icanhazdadjoke.com". Unlike the example above, you will need to add another argument to the GET function: add_headers(Accept = "application/json"), which tells the website to send its response in JSON format. Next, extract the content from the server’s response, convert it to a character string, and use jsonlite::fromJSON to parse the JSON structure.
Of course, we could also create a function which takes the server’s response and converts the content to native R structures. We will want to code in a force stop if the server reports an error, which the httr package’s stop_for_status() function does for us. We will also call require() on the httr and jsonlite packages, which loads them when the function is called. Note that require() does not install a missing package; it returns FALSE with a warning instead.
convert_JSON <- function(resp){
  # call needed packages
  require(httr)
  require(jsonlite)
  # stop if the server returned an error
  httr::stop_for_status(resp)
  # return JSON content in native R structures
  return(jsonlite::fromJSON(rawToChar(resp$content)))
}
Finally, we can get the same output by simply calling the function on the server’s response. Checking its result against bov_res confirms this:
[1] TRUE
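To see both outcomes of convert_JSON() without a network call, the sketch below builds stand-in response objects by hand. It assumes that httr reads the status from the status_code element of an object of class "response", which is an implementation detail rather than a documented interface. fake_response() is a hypothetical helper written for this example, and the JSON bodies are invented.

```r
library(httr)
library(jsonlite)

# Same logic as the function above, repeated so this block runs on its own
convert_JSON <- function(resp) {
  httr::stop_for_status(resp)
  jsonlite::fromJSON(rawToChar(resp$content))
}

# Hypothetical helper: a minimal stand-in for an httr response object
fake_response <- function(status, body) {
  structure(
    list(status_code = as.integer(status), content = charToRaw(body)),
    class = "response"
  )
}

ok  <- fake_response(200, '{"joke": "placeholder text"}')
bad <- fake_response(404, '{"message": "not found"}')

# A successful response is parsed into a list
convert_JSON(ok)$joke

# An error status stops the function; tryCatch() captures the error message
tryCatch(convert_JSON(bad), error = function(e) conditionMessage(e))
```

With a real response from GET(), the call is simply convert_JSON(resp).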
Write code to retrieve 100 random dad jokes from icanhazdadjoke.com, and store the result in a list called jokes. Recall from Module 1 that you can do this with a for-loop, or with the lapply function. Next, count how many of the random dad jokes you retrieved contain the word “Why” (with a capital “W”). Here’s a hint to get you started: use grepl("Why", jokes) to produce a logical vector whose elements are TRUE if “Why” occurs in the joke, and FALSE otherwise. Then count the TRUE elements using sum.
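The counting step from the hint can be tried on its own first. The sketch below uses three made-up strings in place of retrieved jokes; example_jokes is a hypothetical name, and the retrieval itself is left to the exercise.

```r
# Made-up strings standing in for retrieved jokes
example_jokes <- list(
  "Why did the function stop?",
  "I asked why, and got no answer.",
  "Why not?"
)

# grepl() is case-sensitive by default, so lowercase "why" does not match
has_why <- grepl("Why", example_jokes)
has_why
# [1]  TRUE FALSE  TRUE

# TRUE counts as 1 and FALSE as 0, so sum() gives the number of matches
sum(has_why)
# [1] 2
```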
Some web-APIs require additional information from us, as outlined in the documentation for the API. In this case, the user needs to provide additional query parameters in the GET request. Thankfully, this functionality is built into the httr package’s GET() function: pass the parameters as a named list called query. For more information on how to include query parameters, type ?GET into your R console after loading httr.
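As a rough illustration of what query parameters do, the sketch below uses httr's modify_url() to show the address that results from attaching a named list of parameters. The endpoint and the parameter names term and limit are assumptions made for this example; check the API's documentation for the names it actually accepts.

```r
library(httr)

# Assumed endpoint and parameter names, for illustration only
base_url <- "https://icanhazdadjoke.com/search"
params <- list(term = "cat", limit = "5")

# modify_url() builds the URL that results from adding the query parameters
full_url <- modify_url(base_url, query = params)
full_url
```

Passing the same list to GET() as query = params sends the request with these parameters appended to the address.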