Add tracking for files screened #135

@cjrace

We should set up a database table that we populate with the following columns:

  • environment - e.g. 'local', 'development', 'production', 'shinyapps.io' (we can set environment variables to control this)
  • data_filename - pulled from the csv data file
  • file_size - we should just track the raw bytes so it's easier to analyse, rather than the dfeR::pretty_filesize() output
  • rows_count - same thing we put in the UI
  • cols_count - same thing we put in the UI
  • stage - what stage of the checks the file got to
  • pass - boolean, did it pass
  • warnings - string of the checks that had a warning, taking the names from the test col of the screening output table, e.g. "ethnicity_values, total, ob_unit_meta" (we can then split this by commas if we want to analyse which warnings are happening a lot)
  • time_started - time recorded at start of screening
  • time_ended - time recorded at end of screening
  • screening_time - time taken to screen in raw seconds, calculated from the above two cols. I think we'd want a raw value like seconds that we can ees-ily average / aggregate
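
As a rough illustration of the columns above, here is a minimal base R sketch that gathers them into a one-row data frame. The function name `build_screening_row()`, its arguments, the `result` column and its "WARNING" value, and the "passed" stage label are all assumptions for illustration, not names taken from the app.

```r
# Hypothetical helper: gathers the proposed columns into a one-row data frame.
# Argument names and the shape of `screening_results` are assumptions.
build_screening_row <- function(environment, data_path, data, screening_results,
                                stage, pass, time_started, time_ended) {
  # Names of the checks that produced a warning
  warned <- screening_results$test[screening_results$result == "WARNING"]

  data.frame(
    environment = environment,
    data_filename = basename(data_path),
    file_size = file.size(data_path), # raw bytes, not a formatted string
    rows_count = nrow(data),
    cols_count = ncol(data),
    stage = stage,
    pass = pass,
    warnings = paste(warned, collapse = ", "),
    time_started = format(time_started, "%Y-%m-%d %H:%M:%S"),
    time_ended = format(time_ended, "%Y-%m-%d %H:%M:%S"),
    screening_time = as.numeric(
      difftime(time_ended, time_started, units = "secs")
    ),
    stringsAsFactors = FALSE
  )
}

# Example with made-up inputs
tmp <- tempfile(fileext = ".csv")
example_data <- data.frame(time_period = c(2023, 2024), value = c(1, 2))
write.csv(example_data, tmp, row.names = FALSE)

example_results <- data.frame(
  test = c("ethnicity_values", "total", "ob_unit_meta", "col_names"),
  result = c("WARNING", "WARNING", "WARNING", "PASS"),
  stringsAsFactors = FALSE
)

started <- Sys.time()
log_row <- build_screening_row(
  environment = "local", data_path = tmp, data = example_data,
  screening_results = example_results, stage = "passed", pass = TRUE,
  time_started = started, time_ended = started + 12.5
)

# Splitting the warnings string back into individual check names
warning_names <- strsplit(log_row$warnings, ", ", fixed = TRUE)[[1]]
```

Keeping `file_size` and `screening_time` as plain numbers means they can be averaged directly, with any pretty formatting left to the UI.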

We then add code into the server side of the app that will write a new row into the database table after each screening.

Example UI output now
image

Example screening output table, where we can use the test column to populate which warnings are present (this will be interesting when thinking about API standards). It also shows the 'stages' we have
image

Probable tasks

Easy-ish first tasks

  • Add time tracking into the screener as is (log time at start of screening, log time at end, present in UI with dfeR::pretty_time_taken())
  • Check environment variables exist
  • Check all data above exists in app
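
The first two tasks could be sketched in base R roughly as follows. The environment variable name `SCREENER_ENV`, the helper names, and the dummy screening function are assumptions for illustration. The recorded start and end times are what would then be handed to dfeR::pretty_time_taken() for the UI.

```r
# Hypothetical environment variable name; falls back to "local" when unset
get_app_environment <- function() {
  Sys.getenv("SCREENER_ENV", unset = "local")
}

# Runs a screening function and records start time, end time and seconds
# taken. `screen_fn` stands in for whatever the app calls to screen a file.
time_screening <- function(screen_fn, ...) {
  time_started <- Sys.time()
  result <- screen_fn(...)
  time_ended <- Sys.time()

  list(
    result = result,
    time_started = time_started,
    time_ended = time_ended,
    screening_time = as.numeric(
      difftime(time_ended, time_started, units = "secs")
    )
  )
}

# Example with a dummy screening step that just counts rows
dummy_screen <- function(x) {
  Sys.sleep(0.1)
  nrow(x)
}
timed <- time_screening(dummy_screen, mtcars)
```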

Main tasks

  • Create new database table
  • Add a connection into the app to point to our existing SQL databases (can migrate to databricks at a later point)
  • Add code into server file that gathers all of this and writes a new row into the database
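
A minimal sketch of the main tasks using DBI, with an in-memory SQLite database as a stand-in for our existing SQL databases. The table name `screening_log`, the column types, and the row values are made up for illustration. The real connection would need its own driver and credentials, and the column types would need adjusting for that database.

```r
library(DBI)

# In-memory SQLite as a stand-in for the real database connection
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table name; types are generic SQLite types
dbCreateTable(con, "screening_log", fields = c(
  environment = "TEXT",
  data_filename = "TEXT",
  file_size = "REAL",
  rows_count = "INTEGER",
  cols_count = "INTEGER",
  stage = "TEXT",
  pass = "INTEGER", # 1 = passed, 0 = failed
  warnings = "TEXT",
  time_started = "TEXT",
  time_ended = "TEXT",
  screening_time = "REAL"
))

# Made-up values for one screening
new_row <- data.frame(
  environment = "local",
  data_filename = "example.csv",
  file_size = 2048,
  rows_count = 100L,
  cols_count = 12L,
  stage = "passed",
  pass = 1L,
  warnings = "ethnicity_values, total",
  time_started = "2025-01-01 09:00:00",
  time_ended = "2025-01-01 09:00:12",
  screening_time = 12,
  stringsAsFactors = FALSE
)

# Write one new row after the screening finishes
dbAppendTable(con, "screening_log", new_row)

logged <- dbReadTable(con, "screening_log")
dbDisconnect(con)
```

In the server code the append step would sit at the end of the screening logic, so one row is written per file screened.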

    Labels: new feature (New feature or request)
