Assuming this doesn't exist already, look for and warn when the same value is repeated in some proportion or more of a file's records. This check should be implemented for fields including: `title`, `firstname`, `middle`, `lastname`, `suffix`, `address`, `zip`, and `birth_date`. See issue #74 for `email`. A good threshold for suspicious repetition may be 25%, or for something more specific like `address`, maybe >10?
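
A rough sketch of how such a check might look, in Python. Everything here is an assumption for discussion rather than existing code: the function name `find_repeated_values`, the row format (a list of dicts keyed by field name), skipping blank values, and reading the `address` threshold as 10% of records (the ">10" above could equally mean a raw count).

```python
from collections import Counter

# Fields named in this issue; email is tracked separately in issue #74.
FIELDS = ["title", "firstname", "middle", "lastname",
          "suffix", "address", "zip", "birth_date"]

# Hypothetical thresholds: 25% by default, and 10% for address
# (an assumed reading of ">10"; adjust once the intent is settled).
DEFAULT_THRESHOLD = 0.25
FIELD_THRESHOLDS = {"address": 0.10}


def find_repeated_values(rows, fields=FIELDS,
                         default_threshold=DEFAULT_THRESHOLD,
                         field_thresholds=FIELD_THRESHOLDS):
    """Return (field, value, count, share) for each suspiciously repeated value.

    `rows` is a list of dicts, one per record. A value is flagged when it
    occurs more than once and its share of all records meets the threshold.
    """
    warnings = []
    total = len(rows)
    if total == 0:
        return warnings

    for field in fields:
        counts = Counter()
        for row in rows:
            # Assumption: blank or missing values are not counted as repeats,
            # since fields like title, middle, and suffix are often empty.
            value = (row.get(field) or "").strip()
            if value:
                counts[value] += 1

        threshold = field_thresholds.get(field, default_threshold)
        for value, count in counts.items():
            share = count / total
            if count > 1 and share >= threshold:
                warnings.append((field, value, count, share))

    return warnings
```

Each returned tuple could then be turned into a warning message, for example "lastname 'Smith' appears in 50% of records". Values are compared exactly after trimming whitespace; whether to also normalize case is an open question.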