When developing R packages that interact with web APIs, care is required due to CRAN’s policy on internet connectivity. If you’re not careful, your package could fail CRAN checks on submission or later down the road. In either case, you have to re-submit your package with the issues fixed or risk it not being accepted on CRAN or, if it’s already there, being archived (removed) from CRAN. Having been bitten in the past on the {ucimlrepo} package by this policy, I’ve learned some valuable lessons about implementing robust internet-dependent functionality while satisfying CRAN’s policies.
The CRAN Policy
CRAN’s policy states that:
Packages which use Internet resources should fail gracefully with an informative message if the resource is not available or has changed (and not give a check warning nor error).
This specifically refers to behavior during R CMD check, not necessarily during regular package usage.
Surviving CRAN Checks
For the remainder of the post, we’ll focus on how to handle internet connectivity issues in R packages that interact with web APIs so that we meet both CRAN’s requirements and our own requirement of retaining informative error messages during regular usage. To that end, we’ve created a flow chart showing the desired implementation for a Web API function that handles successful and error scenarios gracefully. The blue-colored decision nodes (internet check and API request) represent critical decision points in the flow, while the gray nodes indicate process steps. We’ve also included two subgraphs to show the balance between regular usage and CRAN check environments.
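As a rough illustration of the graceful-failure branch of that flow, here is a minimal sketch. The function name `fetch_or_message()` and its `online` and `request` arguments are assumptions made for this example and do not come from any package discussed here; `make_api_request()` stands in for whatever function actually performs the request.

```r
# Hypothetical helper: returns the response on success, or NULL (invisibly)
# with an informative message when the resource cannot be reached.
fetch_or_message <- function(endpoint,
                             online = curl::has_internet,
                             request = make_api_request) {
  # Decision 1: is there an internet connection at all?
  if (!isTRUE(online())) {
    message("No internet connection; cannot reach '", endpoint, "'.")
    return(invisible(NULL))
  }

  # Decision 2: did the API request itself succeed?
  tryCatch(
    request(endpoint),
    error = function(e) {
      message("Request to '", endpoint, "' failed: ", conditionMessage(e))
      invisible(NULL)
    }
  )
}
```

Passing the two checks in as arguments keeps the sketch easy to exercise offline: `fetch_or_message("users", online = function() FALSE)` emits a message and returns `NULL` instead of raising an error.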
Documentation Examples That Won’t Fail CRAN Checks
When writing documentation examples for functions that require internet connectivity, we need to be cognizant of CRAN’s check environment. In particular, CRAN runs package checks in a non-interactive environment where internet access may be limited or unavailable. For a Web API package, the limitation of internet access is hugely problematic. So, we need to ensure that our examples do not fail during package checking by designing conditions that allow the examples to run only in interactive sessions.
The overview of the process is shown in the flowchart below:
Modern Approach
When writing example code that requires internet connectivity, you have a few options to ensure that the examples run smoothly during package checking. The preferred method is to use conditional execution through the @examplesIf tag from the {roxygen2} package, e.g.
#' @examplesIf some_condition()
#' my_function()
#' another_function()
This will only run the examples if some_condition() is TRUE.
For web API functions, we need to check for internet connectivity, interactivity, and, if necessary, an API key. Since CRAN checks are non-interactive, we can rely on interactive() to keep the examples from running during those checks. For greater peace of mind, we can also check for internet connectivity using the curl::has_internet() function.
Option 1: Only run examples in interactive sessions to avoid the examples failing during package checking.
#' @examplesIf interactive()
#' fetch_api_data("some_endpoint")
Option 2: Check if the package is running in an interactive session and for internet connectivity before running the examples.
#' @examplesIf interactive() && curl::has_internet()
#' fetch_api_data("some_endpoint")
Option 3: Check interactivity, internet connectivity, and an API key being set in the environment if the API requires authorization.
#' @examplesIf interactive() && curl::has_internet() && Sys.getenv("API_KEY") != ""
#' fetch_secure_data("premium/endpoint")
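Order matters in the @examplesIf tag. The condition is an ordinary R expression, so && stops at the first condition that is FALSE and never evaluates the ones after it. Thus, if you have multiple conditions, place the most restrictive condition first, ahead of any check that touches the network.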
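The effect of ordering can be seen with a small base R sketch. The `check()` helper and the `evaluated` vector are hypothetical and exist only to record which conditions are evaluated; the hard-coded `TRUE`/`FALSE` values stand in for `interactive()`, `curl::has_internet()`, and the API key test.

```r
# Record which checks are actually evaluated
evaluated <- character()
check <- function(name, value) {
  evaluated <<- c(evaluated, name)
  value
}

# The first check is FALSE, so && never evaluates the remaining two
run_examples <- check("interactive", FALSE) &&
  check("internet", TRUE) &&
  check("api_key", TRUE)

run_examples
evaluated
```

Here `run_examples` is `FALSE` and `evaluated` contains only `"interactive"`, so the later checks never ran.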
For more details, refer to Chapter 16 of the R Packages (2e) book or the {roxygen2} Documenting functions vignette.
The Legacy Approach
While less elegant, this approach still works at the expense of maintainability since it requires a selection statement to be directly placed around the example code, e.g.
#' @examples
#' \donttest{
#' if (interactive()) {
#' my_api_function("some_endpoint")
#' }
#' }
Web API Testing Strategy
When testing R packages that interact with web APIs, you need to consider how to handle internet connectivity issues, the availability of the API responses, and the need to test with real data. There are several strategies you can use to ensure your package functions correctly under various conditions.
For a high-level overview, we can use a state diagram to illustrate the process of testing web API functionality in R packages. The diagram shows the steps involved in checking for internet connectivity, running tests with real or mocked data, and handling the test results.
Skip Tests When Offline
The quickest way to handle internet connectivity issues is to skip tests that require internet connectivity. So, if the user is offline, the tests will not be run. We can use skip_if_offline() from the {testthat} package in the form of:
test_that("API connection works", {
  skip_if_offline()
  # Your test code here
})
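The same idea extends to tests that need credentials. Below is a minimal sketch of a custom skip helper; `has_api_key()` and `skip_if_no_api_key()` are hypothetical names introduced for this example, and `API_KEY` is the environment variable used in the documentation examples earlier.

```r
# Hypothetical predicate: TRUE when the environment variable is non-empty
has_api_key <- function(var = "API_KEY") {
  nzchar(Sys.getenv(var))
}

# Hypothetical skip helper built on testthat::skip_if_not()
skip_if_no_api_key <- function(var = "API_KEY") {
  testthat::skip_if_not(has_api_key(var), paste0(var, " is not set"))
}

# Usage inside a test file (requires {testthat}):
# test_that("authorized endpoint works", {
#   skip_if_offline()
#   skip_if_no_api_key()
#   # Your test code here
# })
```

Keeping the predicate separate from the skip call means the same check could be reused elsewhere, though that split is only one possible design.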
Mock API Responses
For APIs that might change or are rate-limited, you can test your package by mocking API responses using local_mocked_bindings() from the {testthat} package. When we mock the API responses, we can control the response data and status code. This allows you to test your package without requiring internet connectivity or real API responses, at the cost of keeping the mock in sync with the real data.
test_that("API checked with mocked data", {
  # Using testthat's mocking
  local_mocked_bindings(
    fetch_api_data = function(...) {
      list(
        status_code = 200,
        content = '{"users": [{"id": 1, "name": "Test"}]}'
      )
    }
  )
  result <- fetch_api_data("users")
  # The mock returns the raw response list, so check its fields directly
  expect_equal(result$status_code, 200)
  expect_match(result$content, '"name": "Test"', fixed = TRUE)
})
Record Real API Responses
vcr
Use the {vcr} package to record real API responses and replay them during testing. This allows you to test your package with real data without requiring internet connectivity.
test_that("API integration works with real data", {
  vcr::use_cassette("user_api_response", {
    result <- fetch_api_data("users")
  })
  expect_gt(nrow(result), 0)
})
For more details, see the {vcr} vignette: Introduction to vcr.
httptest2
We could also use the {httptest2} package to record real API responses and replay them during testing. As before, this allows you to test your package with real data without requiring internet connectivity. The syntax differs from {vcr}, but the concept is the same.
with_mock_dir("person", {
  test_that("We can get people", {
    result <- fetch_api_data("users")
    expect_gt(nrow(result), 0)
  })
})
For more details, see httptest2: A Test Environment for HTTP Requests in R.
Regular Usage
While CRAN requires graceful failures during package checking, your actual package functions should still provide meaningful errors when things go wrong. For example, we can use a tryCatch() block to handle API request failures and provide informative error messages to the user.
fetch_api_data <- function(endpoint) {
  tryCatch(
    make_api_request(endpoint),
    error = function(e) {
      cli::cli_abort(
        c(
          "x" = "API request failed: {endpoint}",
          "i" = "Error message: {conditionMessage(e)}",
          ">" = "Check the API documentation or try again later."
        )
      )
      # Or stop("API request failed: ", endpoint, "\n", conditionMessage(e))
    }
  )
}
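To see what the user would actually get, here is a self-contained sketch of the base R alternative mentioned in the comment above. The failing `make_api_request()` stub and the name `fetch_api_data_base()` are assumptions made for illustration; a real request function would call the API.

```r
# Hypothetical stand-in for the real request function; it always fails here
make_api_request <- function(endpoint) {
  stop("HTTP 503 Service Unavailable")
}

# Base R variant of the wrapper, using stop() instead of cli::cli_abort()
fetch_api_data_base <- function(endpoint) {
  tryCatch(
    make_api_request(endpoint),
    error = function(e) {
      stop(
        "API request failed: ", endpoint, "\n", conditionMessage(e),
        call. = FALSE
      )
    }
  )
}

# Capture the error message rather than letting it halt the session
msg <- tryCatch(fetch_api_data_base("users"), error = conditionMessage)
cat(msg, "\n")
```

The re-thrown error keeps the original condition message, so the user sees both which endpoint failed and why.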
Fin
Developing R packages that interact with web APIs requires striking a delicate balance between providing a good user experience and meeting CRAN’s requirements. By writing documentation examples that won’t fail CRAN checks using @examplesIf, implementing solid unit tests, and providing informative error messages, you can create R packages that handle internet connectivity issues gracefully and don’t trigger the ire of CRAN. There are always more hiccups that can occur, but these strategies will help you navigate the waters of web API R package development.
Again, CRAN’s requirements specifically target package checking behavior, not your actual package functions! So, please make sure to provide meaningful errors and feedback during regular usage.