The idea here is to create a dictionary of first names, last names, and any other names. A CLI, designed to be run by https://pre-commit.com/, would then check the files that are about to be committed and flag any that contain a name anywhere in them.
A name is allowed on a line if that line ends in `names allowed`, so you can exempt a line by ending it with the comment `# names allowed`. There could also be a configuration file with an option to ignore the first line of each file when that line contains the text `Copyright`.
The search will match names even when there are no spaces around them, so that names in paths etc. are still flagged. As such, only the shortest version of a name needs to be searched for initially.
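As a rough sketch of the CLI side: pre-commit passes the staged file paths to a hook as positional arguments and treats a non-zero exit code as a failure. The `--names-file` option and its one-name-per-line format are hypothetical, made up here for illustration, and the check is a bare substring test with none of the exemptions discussed in this issue.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Return 1 if any given file contains a listed name, otherwise 0."""
    parser = argparse.ArgumentParser(description="Flag files that contain a known name.")
    # Hypothetical option: a text file with one name per line.
    parser.add_argument("--names-file", required=True)
    # pre-commit appends the staged file paths as positional arguments.
    parser.add_argument("filenames", nargs="*")
    args = parser.parse_args(argv)

    names = [
        line.strip().lower()
        for line in Path(args.names_file).read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]

    failed = False
    for filename in args.filenames:
        text = Path(filename).read_text(encoding="utf-8", errors="ignore").lower()
        found = sorted({name for name in names if name in text})
        if found:
            failed = True
            # Report only a count so the hook output does not repeat the names.
            print(f"{filename}: contains {len(found)} listed name(s)")
    return 1 if failed else 0
```

To use it as a hook, `main` would be wired to a console entry point whose return value becomes the process exit code.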
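Here is a minimal sketch of how the matching rules above could fit together. All function names are hypothetical, and it assumes the "Copyright" option applies to line 1 only. It uses case-insensitive substring matching with no word boundaries, which is why any name that contains a shorter listed name can be dropped from the search list.

```python
import re
from typing import Iterable, List, Tuple

ALLOW_MARKER = "names allowed"  # a line ending in this text is exempt


def prune_to_shortest(names: Iterable[str]) -> List[str]:
    """Drop any name that already contains a shorter listed name."""
    cleaned = {n.strip().lower() for n in names if n.strip()}
    kept: List[str] = []
    # Shortest first, so each longer name is checked against the shorter ones.
    for name in sorted(cleaned, key=lambda n: (len(n), n)):
        if not any(short in name for short in kept):
            kept.append(name)
    return kept


def build_pattern(names: Iterable[str]) -> "re.Pattern[str]":
    """Compile one case-insensitive alternation with no word boundaries."""
    escaped = [re.escape(n) for n in names]
    if not escaped:
        raise ValueError("the name list is empty")
    return re.compile("|".join(escaped), re.IGNORECASE)


def find_names(
    lines: Iterable[str],
    pattern: "re.Pattern[str]",
    skip_copyright_first_line: bool = False,
) -> List[Tuple[int, str]]:
    """Return (line number, matched text) for every name hit that is not exempt."""
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(lines, start=1):
        if line.rstrip().endswith(ALLOW_MARKER):
            continue  # explicitly flagged as allowed
        if skip_copyright_first_line and lineno == 1 and "Copyright" in line:
            continue  # assumed meaning of the config option
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

For example, with the placeholder list `["Ann", "Anna", "Annabel", "Bob"]`, pruning keeps only `ann` and `bob`, and the line `path = '/home/annabel/data'` is still flagged through the `ann` substring.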
Here is an example public database of names that I have found:
https://www.openacademic.ai/oag/
There also needs to be a way to verify that the downloaded database catches every name in use within your clinic. It would help to highlight names that are in Mosaiq but not in the public database, since that shows whether more public databases need to be found. These Mosaiq names themselves are NOT to be used. Instead, more public databases of names can be found and added until the coverage is complete.
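The coverage check could look something like the sketch below. It assumes the clinic names have already been pulled out of Mosaiq into an in-memory list (the query itself is not shown), and that a clinic name counts as covered when any public name is a case-insensitive substring of it, in line with the substring search described earlier. The function name is hypothetical. The result is meant to stay on the local machine and only indicate whether another public database is needed.

```python
from typing import Iterable, List


def uncovered_names(clinic_names: Iterable[str], public_names: Iterable[str]) -> List[str]:
    """Return the clinic names that no public name would catch."""
    public = {p.strip().lower() for p in public_names if p.strip()}
    missing = set()
    for name in clinic_names:
        key = name.strip().lower()
        # Covered if any public name appears anywhere inside the clinic name.
        if key and not any(p in key for p in public):
            missing.add(key)
    return sorted(missing)
```

An empty result would mean the public database already catches every clinic name under this matching rule.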
@pchlap, keen for your thoughts. Suppose something like this was running, it was verified that the public database contained all the names currently in use in your clinic, and every commit was checked locally beforehand to confirm that it contained none of these names. Might that be a step in the right direction to allowing public code submissions?