Context
Raised by Niklas Rodemund in the context of the SICdb 2 → OMOP ETL:
For many values (e.g., lab results), there are at least four time points:
- the sample collection time (the lab value corresponds to this time),
- the measurement time (which may include correction factors),
- the charting time (when the result becomes available in the EHR),
- and the order time (interesting for process-related questions).
Ultimately, a model might need several of these times, at least the charting time, since using a value before it was charted would leak information that was not yet available in the EHR.
OMOP doesn't seem to take this into account - do we have a plan?
Short answer
OMOP CDM v5.4 does not natively expose multiple timestamps per measurement, but a clean ETL pattern exists by combining the SPECIMEN table and additional OBSERVATION rows linked via the v5.4 *_event_id mechanism.
What the OMOP spec actually says
MEASUREMENT.measurement_datetime semantics are explicitly defined (CDM v5.4 MEASUREMENT):
"If there are multiple dates in the source data associated with a record such as order_date, draw_date, and result_date, choose the one that is closest to the date the sample was drawn from the patient."
→ measurement_datetime should therefore hold the sample collection time.
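As a rough illustration of that rule, the sketch below picks the value for measurement_datetime from whatever timestamps a source record carries. The fallback ranking used when the collection time is missing is an assumption of this example, not something the OMOP text defines, and the dictionary keys are hypothetical source field names.

```python
from datetime import datetime
from typing import Dict, Optional

# Assumed fallback order when the collection time is missing. The OMOP text only
# says "closest to the date the sample was drawn"; this ranking is a local guess.
FALLBACK_ORDER = ("collection", "analysis", "charting", "order")


def pick_measurement_datetime(times: Dict[str, Optional[datetime]]) -> Optional[datetime]:
    """Return the first available timestamp in FALLBACK_ORDER, or None."""
    for key in FALLBACK_ORDER:
        value = times.get(key)
        if value is not None:
            return value
    return None


example = {
    "order": datetime(2024, 1, 5, 7, 50),
    "collection": datetime(2024, 1, 5, 8, 32),
    "analysis": datetime(2024, 1, 5, 9, 10),
    "charting": datetime(2024, 1, 5, 9, 45),
}
chosen = pick_measurement_datetime(example)  # the collection time wins when present
```

An ETL would probably want to record which timestamp was actually used, since a fallback value is only an approximation of the draw time.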
Why use the SPECIMEN table?
In OMOP, MEASUREMENT represents the measured value (e.g. glucose = 1.2 g/L), while SPECIMEN represents the physical sample itself (this blood tube, drawn at 08:32, from this anatomic site, of this volume). One specimen typically produces multiple measurements — a single EDTA tube yields ~20 CBC values. SPECIMEN avoids duplicating the collection metadata across all derived measurement rows and lets you store additional collection properties (specimen_concept_id, quantity, anatomic_site_concept_id, disease_status_concept_id).
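The one-specimen-to-many-measurements relationship can be sketched as follows. This is a minimal in-memory example, assuming hypothetical source fields (sample_id, collected_at, test, value); the concept ID 1147822 is the one quoted in the linkage section of this post, and required OMOP columns not relevant to the point are left out.

```python
from datetime import datetime
from typing import Dict, List, Tuple

SPECIMEN_FIELD_CONCEPT_ID = 1147822  # as given in the linkage section of this post


def split_specimens(source_rows: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Build one SPECIMEN row per sample and one MEASUREMENT row per result."""
    specimens: Dict[str, dict] = {}
    measurements: List[dict] = []
    for row in source_rows:
        specimen = specimens.get(row["sample_id"])
        if specimen is None:
            # First result seen for this tube: store the collection metadata once.
            specimen = {
                "specimen_id": len(specimens) + 1,
                "person_id": row["person_id"],
                "specimen_datetime": row["collected_at"],
                "specimen_source_id": row["sample_id"],
            }
            specimens[row["sample_id"]] = specimen
        measurements.append({
            "measurement_id": len(measurements) + 1,
            "person_id": row["person_id"],
            "measurement_datetime": row["collected_at"],
            "measurement_source_value": row["test"],
            "value_as_number": row["value"],
            "measurement_event_id": specimen["specimen_id"],
            "meas_event_field_concept_id": SPECIMEN_FIELD_CONCEPT_ID,
        })
    return list(specimens.values()), measurements


drawn = datetime(2024, 1, 5, 8, 32)
cbc_rows = [
    {"person_id": 1, "sample_id": "tube-A", "collected_at": drawn, "test": "hemoglobin", "value": 13.9},
    {"person_id": 1, "sample_id": "tube-A", "collected_at": drawn, "test": "leukocytes", "value": 6.1},
    {"person_id": 1, "sample_id": "tube-A", "collected_at": drawn, "test": "platelets", "value": 250.0},
]
specimens, measurements = split_specimens(cbc_rows)
```

Three results from the same tube end up as three MEASUREMENT rows pointing at a single SPECIMEN row, so the collection metadata is written once.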
Proposed ETL convention for INDICATE
| # | Source datetime | Target in OMOP |
|---|---|---|
| 1 | Sample collection time | MEASUREMENT.measurement_datetime and SPECIMEN.specimen_datetime |
| 2 | Analysis / measurement time (analyser) | OBSERVATION row, observation_concept_id = 3043556 (Date of analysis of unspecified specimen, LOINC 45353-0) |
| 3 | Charting / result-reported time | OBSERVATION row, observation_concept_id = 1175684 (Date and time lab result reported, LOINC 90056-3) |
| 4 | Order time | OBSERVATION row, observation_concept_id = 42529317 (Lab order date, LOINC 82785-7) |
All target concepts are standard LOINC concepts with domain_id = Observation.
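Rows 2 to 4 of the table could be generated along these lines. This is a sketch under assumptions: the timestamp is stored in observation_datetime (the convention above does not spell out the column), the dictionary keys are hypothetical source field names, and the concept ID for the MEASUREMENT table is passed in by the caller because this post does not state it. Required columns such as observation_type_concept_id are omitted for brevity.

```python
from datetime import datetime
from typing import Dict, List, Optional

# Concept IDs as listed in the convention table above.
TIMESTAMP_CONCEPTS = {
    "analysis": 3043556,   # Date of analysis of unspecified specimen
    "charting": 1175684,   # Date and time lab result reported
    "order": 42529317,     # Lab order date
}


def build_timestamp_observations(
    measurement_id: int,
    person_id: int,
    times: Dict[str, Optional[datetime]],
    measurement_field_concept_id: int,
    first_observation_id: int = 1,
) -> List[dict]:
    """Create one OBSERVATION row per extra timestamp that is present."""
    rows: List[dict] = []
    for kind, concept_id in TIMESTAMP_CONCEPTS.items():
        ts = times.get(kind)
        if ts is None:
            continue  # the source did not record this timestamp
        rows.append({
            "observation_id": first_observation_id + len(rows),
            "person_id": person_id,
            "observation_concept_id": concept_id,
            "observation_date": ts.date(),
            "observation_datetime": ts,  # assumption: the timestamp lives here
            "observation_event_id": measurement_id,
            "obs_event_field_concept_id": measurement_field_concept_id,
        })
    return rows


PLACEHOLDER_FIELD_CONCEPT_ID = -1  # placeholder only; look up the real concept
observation_rows = build_timestamp_observations(
    measurement_id=10,
    person_id=1,
    times={
        "analysis": datetime(2024, 1, 5, 9, 10),
        "charting": None,
        "order": datetime(2024, 1, 5, 7, 50),
    },
    measurement_field_concept_id=PLACEHOLDER_FIELD_CONCEPT_ID,
)
```

Missing timestamps simply produce no row, which keeps "not recorded" distinguishable from a recorded time.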
Linkage (v5.4 native mechanism)
- MEASUREMENT → SPECIMEN: set MEASUREMENT.measurement_event_id = SPECIMEN.specimen_id and MEASUREMENT.meas_event_field_concept_id = 1147822 (concept representing the SPECIMEN table).
- OBSERVATION → MEASUREMENT: set OBSERVATION.observation_event_id = MEASUREMENT.measurement_id and OBSERVATION.obs_event_field_concept_id to the concept representing the MEASUREMENT table.
This is preferred over FACT_RELATIONSHIP, the pre-v5.4 workaround (forum thread), which the v5.4 *_event_id mechanism supersedes.
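With these links in place, a consumer can resolve the charting time of each measurement and avoid the leakage raised in the original question. The sketch below works on plain dictionaries rather than a database and makes two assumptions: the charting time sits in observation_datetime, and the concept ID for the MEASUREMENT table is supplied by the caller (the value used here is a placeholder, not a real concept). Measurements with no linked charting time are excluded, which is a conservative choice of this example.

```python
from datetime import datetime
from typing import Dict, List

CHARTING_CONCEPT_ID = 1175684  # "Date and time lab result reported", from the table above


def measurements_known_at(
    measurements: List[dict],
    observations: List[dict],
    prediction_time: datetime,
    measurement_field_concept_id: int,
) -> List[dict]:
    """Keep only measurements that were already charted at prediction_time."""
    charted_at: Dict[int, datetime] = {}
    for obs in observations:
        is_charting = obs["observation_concept_id"] == CHARTING_CONCEPT_ID
        links_measurement = obs["obs_event_field_concept_id"] == measurement_field_concept_id
        if is_charting and links_measurement:
            charted_at[obs["observation_event_id"]] = obs["observation_datetime"]
    return [
        m for m in measurements
        if m["measurement_id"] in charted_at
        and charted_at[m["measurement_id"]] <= prediction_time
    ]


FIELD_ID = -1  # placeholder only; look up the real concept for the MEASUREMENT table
lab_measurements = [
    {"measurement_id": 1, "measurement_datetime": datetime(2024, 1, 5, 8, 32)},
    {"measurement_id": 2, "measurement_datetime": datetime(2024, 1, 5, 8, 32)},
]
lab_observations = [
    {"observation_concept_id": CHARTING_CONCEPT_ID, "observation_event_id": 1,
     "obs_event_field_concept_id": FIELD_ID, "observation_datetime": datetime(2024, 1, 5, 9, 45)},
    {"observation_concept_id": CHARTING_CONCEPT_ID, "observation_event_id": 2,
     "obs_event_field_concept_id": FIELD_ID, "observation_datetime": datetime(2024, 1, 5, 11, 0)},
]
visible = measurements_known_at(
    lab_measurements, lab_observations, datetime(2024, 1, 5, 10, 0), FIELD_ID
)
```

Both samples were drawn at 08:32, but only the first result had been charted by 10:00, so only that one is returned for a prediction made at that time.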
References
@MaximMoinat — could you validate this proposed convention as INDICATE's OMOP reference?