A containerised Azure Function App consisting of an R Plumber API for the DfE's data screener.
See Request format for details on how to construct API requests.
- `pak::lockfile_install()` - install dependencies
- `source("run.R")` - set the API running
- The API healthcheck endpoint will then be live at http://localhost:8000/api/healthcheck
- Download the R binary (https://www.stats.bris.ac.uk/R/)
- Download the R extension for VS Code; you may be prompted to install the `languageserver` package to use R code locally. Alternatively, you can use an R-specific IDE such as RStudio
- Open `run.R` and click the Run button at the top of the file. Alternatively, open an R terminal and use the command `source("run.R")`
- Open up Postman/PowerShell/curl etc. to hit the endpoints:
```
GET localhost:8000/api/healthcheck
POST localhost:8000/api/screen
POST localhost:8000/function_start_screening
GET localhost:8000/api/progress?dataSetId=<data set id>
```
- Ensure that Rscript is executable (check with `Rscript --version`).
- Run: `Rscript run.R`
- Call an endpoint at http://localhost:8000/api/healthcheck.
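As a quick smoke test once the API is up, the healthcheck endpoint can be hit from a terminal. This sketch assumes the API is already running locally on port 8000 as described above:

```shell
# Hypothetical smoke test against a locally running instance.
BASE_URL="http://localhost:8000"

# Prints the healthcheck response, or a note if the API is not up.
curl -s "$BASE_URL/api/healthcheck" || echo "API not running"
```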
The API can also be run in a Docker container that is running the Azure Functions runtime.
Open up a terminal in the root of the project, and create an image using
```shell
docker build -t data-screener .
```
then run it using
```shell
docker run --rm \
    --name data-screener \
    --network explore-education-statistics_default \
    -p 7078:80 \
    -e "STORAGE_URL=http://data-storage:10000/devstoreaccount1" \
    -e "STORAGE_CONTAINER_NAME=releases-temp" \
    -e "AzureWebJobs_StartScreening=DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://data-storage:10000/devstoreaccount1;QueueEndpoint=http://data-storage:10001/devstoreaccount1;" \
    -e "FUNCTIONS_WORKER_RUNTIME=custom" \
    data-screener
```
and call the Azure Function healthcheck endpoint at http://localhost:7078/api/healthcheck.
The environment variables are necessary because when run using the mcr.microsoft.com/azure-functions base Docker image,
local.settings.json is not used.
ℹ️ The `--network` parameter used here assumes you are using the storage container configured by the main EES project (see Dependencies > Azurite for further details).
The API can also be run directly from a local development environment, assuming that the required dependencies have been installed. This includes:
- Azure Functions Core Tools.
- `Rscript`, the eesyscreener R package and the various dependencies that eesyscreener needs to run. For a full list of steps to install the required dependencies, refer to the commands executed in the Dockerfile.
After installing the above, the Azure Functions runtime can be started with:
```shell
func start
```
and the API can be called via the Azure Functions runtime by calling:
http://localhost:7071/api/healthcheck
To run the API locally in R, you will need to install the R packages; update the command below and rerun as needed. Make sure to update the Dockerfile and GitHub Action as appropriate too, as they do not yet work from a lockfile. eesyscreener needs installing separately as it is currently only available from GitHub.
```r
pak::pak("dfe-analytical-services/eesyscreener@v0.2.4")
pak::pak(
  c(
    "plumber",
    # below for testing only
    "testthat",
    "mirai",
    "withr",
    "httr2"
  )
)
```

Note on the pkg.lock file: this was added as part of development, but is not currently used in workflows.
To update it with the latest versions, you can use the following (updating the eesyscreener version number as needed):
```r
pak::lockfile_create(pkg = c("dfe-analytical-services/eesyscreener@v0.2.4", "deps::."))
```
Restoring packages based on this lockfile should then be possible using:

```r
pak::lockfile_install()
```
The screener's POST endpoint retrieves files from a local blob storage container based on the paths supplied in the request body. The connection details hard-coded into screen_csvs.R relate to the same storage container used by the main EES solution. This container can be started up by opening a terminal in the main project directory and running the start script, e.g.:
```shell
cd source/repos/dfe-analytical-services/explore-education-statistics
pnpm start dataStorage
```
If using a different storage container, the connection details can be changed by replacing the destination URL, key and container name in the controller. The custom storage container should also be assigned a network, so that the API can be started within the same network to allow cross-container communication.
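As a sketch of that setup, a user-defined Docker network can be created and shared between a custom storage container and the API container so they can reach each other by container name. The names `my-network` and `my-storage` below are illustrative, not taken from this project, and the env vars from the earlier `docker run` command are omitted for brevity:

```shell
# Illustrative only: put a custom Azurite storage container and the
# screener API on the same user-defined network.
docker network create my-network
docker run -d --name my-storage --network my-network \
    mcr.microsoft.com/azure-storage/azurite
docker run --rm --name data-screener --network my-network -p 7078:80 \
    data-screener
```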
The GET endpoint is just a health check to confirm the API is running, and expects no parameters: GET <url>/api/healthcheck.
The POST endpoint at POST <url>/api/screen expects a JSON request body in the following format:
```json
{
  "dataFileName": "data.csv",
  "dataFilePath": "00ffd291-2ff2-4b65-46c5-08dd9ec03382/data/0d5a5bc6-b12c-4ed4-986e-517679b49f88",
  "metaFileName": "meta.data.csv",
  "metaFilePath": "00ffd291-2ff2-4b65-46c5-08dd9ec03382/data/f9c951bc-85a0-48ab-a0be-8eab3fc8dcee"
}
```

ℹ️ Path format is `<releaseVersionId>/data/<fileId>`.
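For example, such a request can be sent with curl. The file names and IDs below are the illustrative values from the example body above, and the API is assumed to be running locally on port 8000:

```shell
# Hypothetical example: POST a screening request to a local instance.
BODY='{
  "dataFileName": "data.csv",
  "dataFilePath": "00ffd291-2ff2-4b65-46c5-08dd9ec03382/data/0d5a5bc6-b12c-4ed4-986e-517679b49f88",
  "metaFileName": "meta.data.csv",
  "metaFilePath": "00ffd291-2ff2-4b65-46c5-08dd9ec03382/data/f9c951bc-85a0-48ab-a0be-8eab3fc8dcee"
}'

# Prints the screening result, or a note if the API is not up.
curl -s -X POST "http://localhost:8000/api/screen" \
  -H "Content-Type: application/json" \
  -d "$BODY" || echo "API not running"
```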
ℹ️ Example files can be found in the "example-data" folder. When running locally (e.g. using Postman Desktop), these can be provided in the JSON body to `dataFilePath` and `metaFilePath` as relative paths within the local repo, e.g. `"dataFilePath": "example-data/pass.csv"`.
Unit tests have been set up using testthat and mirai; you can run them locally in R using:

```r
testthat::test_dir("tests/testthat")
```
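If you prefer a terminal to an interactive R session, the same suite can be run non-interactively via Rscript. This assumes Rscript is on your PATH and that you are in the project root:

```shell
# Run the testthat suite non-interactively; falls back to a note if
# Rscript is not available.
TEST_CMD='testthat::test_dir("tests/testthat")'
command -v Rscript >/dev/null 2>&1 && Rscript -e "$TEST_CMD" || echo "Rscript not available"
```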
If any of the environment variables "STORAGE_URL", "STORAGE_KEY" or "STORAGE_CONTAINER_NAME" is not set, the API will fall back to looking for a local file. For example, you can then supply paths to the example-data in this repo:
```json
{
  "dataFileName": "pass.csv",
  "dataFilePath": "example-data/pass.csv",
  "metaFileName": "pass.data.csv",
  "metaFilePath": "example-data/pass.meta.csv"
}
```

Those files should pass reliably; if not, regenerate them using the following lines in R:
```r
write.csv(eesyscreener::example_data, "example-data/pass.csv", row.names = FALSE)
write.csv(eesyscreener::example_meta, "example-data/pass.meta.csv", row.names = FALSE)
```

For other test files that are available, review the eesyscreener docs and adapt the code above accordingly. For an example failure from the API locally, use the fail.csv files:
```r
write.csv(
  eesyscreener::example_data |>
    dplyr::mutate(time_identifier = "parsec"),
  "example-data/fail.csv",
  row.names = FALSE
)
write.csv(eesyscreener::example_meta, "example-data/fail.meta.csv", row.names = FALSE)
```

Request body:
```json
{
  "dataFileName": "fail.csv",
  "dataFilePath": "example-data/fail.csv",
  "metaFileName": "fail.meta.csv",
  "metaFilePath": "example-data/fail.meta.csv"
}
```

If the data and meta files supplied to the POST endpoint generate an error from eesyscreener, and you only want to generate a successful response for testing, replace the function call in screen_controller.R:
```r
result <- eesyscreener::screen_csv(data_file, meta_file, data_file_name, meta_file_name)
```

with:
```r
write.csv(eesyscreener::example_data, "example_data.csv", row.names = FALSE)
write.csv(eesyscreener::example_meta, "example_data.meta.csv", row.names = FALSE)
result <- eesyscreener::screen_csv("example_data.csv", "example_data.meta.csv")
```

This will generate some new test data files that should always pass the screening.