Source: https://github.com/onc-healthit/patient-matching
The test dataset used in the ONC Patient Matching Algorithm Challenge is available for download by students, researchers, or others interested in additional analysis and patient matching algorithm development. The dataset contains 1 million test patients.
For convenience, this repository contains the downloaded dataset merged into a single comma-separated values (CSV) file. The patients are sorted alphabetically by last name.
Towards the end of the dataset there are patients with no last name recorded. The missing name is represented inconsistently: the last-name field may contain TRUE, contain Null, or be blank.
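Because a missing last name shows up in several forms, it can help to normalize them before matching. The following is a minimal sketch using only the Python standard library. The column name `last_name` and the sample rows are assumptions made for illustration, so check the header row of the actual file before relying on it.

```python
import csv
import io

# Hypothetical column name; check the header row of the actual file.
LAST_NAME_COLUMN = "last_name"

# Placeholder values described above for a missing last name,
# lowercased so the comparison is case-insensitive.
MISSING_MARKERS = {"", "true", "null"}


def is_missing_last_name(value):
    """Return True when the field is blank or holds a TRUE/Null placeholder."""
    if value is None:
        return True
    return value.strip().lower() in MISSING_MARKERS


def split_by_last_name(lines, column=LAST_NAME_COLUMN):
    """Split CSV rows into (named, unnamed) lists based on the last-name field."""
    named, unnamed = [], []
    for row in csv.DictReader(lines):
        if is_missing_last_name(row.get(column)):
            unnamed.append(row)
        else:
            named.append(row)
    return named, unnamed


# Tiny made-up sample, not real rows from the dataset.
sample = io.StringIO(
    "first_name,last_name\n"
    "Ann,Abbott\n"
    "Bob,TRUE\n"
    "Cy,Null\n"
    "Di,\n"
)
named, unnamed = split_by_last_name(sample)
print(len(named), len(unnamed))  # prints "1 3"
```

To run this against the real file, pass a file object opened with `newline=""` in place of `sample`. Note that this treats any last name spelled "True" or "Null" as missing, which may misclassify a patient who genuinely has such a name.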