Automate non-UTF-8 character conversion

We receive non-UTF-8 characters in raw data files. R cannot interpret these characters, so they must be fixed manually before we contact HIP registrants. For example, to edit the city value `"CA\xd1ON CITY"` in a file from Colorado, we replace `\xd1` with `N` to get the human-readable value `"CANON CITY"`. This is done by opening the raw file, making the change, saving the file, and re-running `read_hip()`.
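As a minimal sketch of how such values could be flagged in R without opening the raw file, base R's `validUTF8()` can be applied to a character vector. The `city` vector here is a hypothetical stand-in for a column returned by `read_hip()`.

```r
# Hypothetical example values; the second contains the raw byte 0xD1
city <- c("DENVER", "CA\xd1ON CITY")

# validUTF8() returns FALSE for strings that are not valid UTF-8
bad <- !validUTF8(city)

# Positions of the values that would need fixing
which(bad)
```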

Sometimes it is not obvious what a non-UTF-8 character should be changed to. First names, last names, and street names are particularly variable. To ensure we make the correct change, each hexadecimal escape sequence must be checked and replaced manually, which is time-consuming.
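To make that check easier, the offending bytes can be listed as hexadecimal values for review. This is only an illustrative sketch using base R; the variable names are assumptions.

```r
# Hypothetical value containing one non-ASCII byte
x <- "CA\xd1ON CITY"

# Split the string into its raw bytes
bytes <- charToRaw(x)

# Keep bytes outside the ASCII range (greater than 0x7f)
bad_bytes <- bytes[as.integer(bytes) > 127L]

# Hexadecimal representation of each offending byte, e.g. for a review table
bad_hex <- as.character(bad_bytes)
```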

Create a function that can automatically replace non-UTF-8 characters after reading in raw data, so that:

- Raw data files do not need to be manually edited
- Non-UTF-8 character conversion is automated and fully reproducible
- A full list of non-UTF-8 character replacements can be reviewed
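One possible shape for such a function is sketched below. It is not an existing function in the package: the name `replace_non_utf8()`, the lookup table layout, and the replacement values are all assumptions made for illustration. It replaces known bytes using a lookup table and returns a log of every change so the replacements can be reviewed.

```r
# Hypothetical helper: replace known non-UTF-8 bytes and log each change.
# `lookup` is assumed to be a data frame with columns `hex` (byte as a
# two-digit hex string) and `replacement` (the text to substitute).
replace_non_utf8 <- function(x, lookup) {
  # Indices of non-missing values that are not valid UTF-8
  bad <- which(!is.na(x) & !validUTF8(x))
  fixed <- x
  for (i in seq_len(nrow(lookup))) {
    # Build the single-byte string for this hex code
    byte <- rawToChar(as.raw(strtoi(lookup$hex[i], base = 16L)))
    # Byte-wise literal replacement, applied only to the flagged values
    fixed[bad] <- gsub(byte, lookup$replacement[i], fixed[bad],
                       fixed = TRUE, useBytes = TRUE)
  }
  # Record what was changed so the full list can be reviewed
  log <- data.frame(index = bad, original = x[bad], replaced = fixed[bad],
                    stringsAsFactors = FALSE)
  list(values = fixed, log = log)
}

# Hypothetical lookup table; real entries would need to be curated
lookup <- data.frame(hex = c("d1", "e9"), replacement = c("N", "e"),
                     stringsAsFactors = FALSE)

city <- c("DENVER", "CA\xd1ON CITY", NA)
result <- replace_non_utf8(city, lookup)
result$values
result$log
```

Keeping the lookup table as data rather than hard-coding replacements means the raw files stay untouched and the same table can be re-applied on every run. Bytes that are missing from the table are left as-is, so a follow-up `validUTF8()` check on `result$values` would show what still needs a decision.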


Automate non-UTF-8 character conversion #90
