
Implement S3 bucket cleanup as github action #280

@mradamcox

Description


Whenever you run `flask build-explorer --upload-map-files`, a new set of CSVs (named like `_5A3A14.csv`) is uploaded to S3, and URLs to these files are written into the `sources.json` file, which is updated and committed to the explorer directory. After that, and after the Netlify deployment of the explorer, the public web map will use these new files.

However, that command has no cleanup step to remove whatever CSV files were previously in use. (There are probably >100 unused CSV files in S3 at this point.)

There is a separate command for cleanup, `flask clean-explorer-bucket`. However, I haven't used it in a while, so it would be good to re-read the code before running it.

I could see this command being placed in a GitHub Action that only has a `workflow_dispatch` trigger, so a repo admin could just go in and run it periodically. These files are small enough that even after a very long time without cleaning up the bucket, it won't really incur significant costs.
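A minimal workflow along those lines might look like the sketch below. The file name, setup steps, and secret names are all assumptions (the real repo's dependency setup and AWS credential handling should be checked first); the important part is that `workflow_dispatch` is the only trigger, so the job only runs when an admin starts it manually from the Actions tab.

```yaml
# .github/workflows/clean-explorer-bucket.yml (hypothetical file name)
name: Clean explorer S3 bucket

on:
  workflow_dispatch:  # manual trigger only

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: flask clean-explorer-bucket
        env:
          # assumed secret names for the bucket credentials
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```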


Note there are a couple of files in the bucket that should not be deleted, `states.csv` and `counties.csv` (I think those are their names), which is why the generated CSVs have `_` prefixes. The script will read all valid file names from the current `sources.json` file, and then remove all `_*` files that don't match any of the valid file names.
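The selection logic described above could be sketched roughly like this (a pure function over a bucket listing and the valid names pulled from `sources.json` — the function name and structure are illustrative, not the actual `clean-explorer-bucket` code, which should still be re-read before any real deletion):

```python
# Files in the bucket that must never be deleted, regardless of sources.json.
PROTECTED = {"states.csv", "counties.csv"}

def keys_to_delete(bucket_keys, valid_names):
    """Return the generated CSV keys (those with a "_" prefix) that are
    no longer referenced by the current sources.json."""
    valid = set(valid_names)
    return [
        key for key in bucket_keys
        if key.startswith("_")      # only generated files carry the _ prefix
        and key not in PROTECTED
        and key not in valid        # still referenced -> keep
    ]
```

For example, `keys_to_delete(["_5A3A14.csv", "_OLD123.csv", "states.csv"], ["_5A3A14.csv"])` returns `["_OLD123.csv"]`: the stale generated file is flagged, while the currently referenced CSV and the protected `states.csv` are left alone.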
