feat: Add validation for duplicate SAMPLE_IDs in timeline files #71

Open
VarshiniGunti wants to merge 1 commit into cBioPortal:master from VarshiniGunti:feature/10917-duplicate-sample-id-validation

VarshiniGunti commented Feb 20, 2026

Fixes: #10917

Summary
Adds validation to prevent duplicate SAMPLE_IDs within Sample Acquisition and Sequencing timeline tracks during data import.

Background
Timeline files currently allow multiple entries with the same SAMPLE_ID in a single track. This leads to ambiguous and incorrect rendering in the patient timeline view, where events may appear duplicated or overlapping for the same sample.

Changes

  • Added duplicate detection logic to TimelineValidator
  • SAMPLE_IDs are now tracked per timeline track during validation
  • A clear validation error is emitted when a duplicate is detected, including:
      • The duplicated SAMPLE_ID
      • The affected timeline track
  • Validation continues after logging the error, consistent with existing validator behavior
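
For reviewers who want the gist without opening the diff, the per-track check described above could look roughly like the sketch below. It is a standalone illustration, not the code in this PR: the function name `find_duplicate_sample_ids`, the `EVENT_TYPE` column used to identify the track, and the message wording are all assumptions, and the real change lives inside TimelineValidator.

```python
import csv
import io
from collections import defaultdict

# Assumption: only these two tracks are checked, and the track name is read
# from an EVENT_TYPE column. Neither detail is confirmed by the PR text.
CHECKED_TRACKS = {"Sample Acquisition", "Sequencing"}


def find_duplicate_sample_ids(tsv_text):
    """Return one error message per repeated SAMPLE_ID within a track."""
    seen = defaultdict(set)  # track name -> SAMPLE_IDs already encountered
    errors = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Data rows start at line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        track = (row.get("EVENT_TYPE") or "").strip()
        sample_id = (row.get("SAMPLE_ID") or "").strip()
        if track not in CHECKED_TRACKS or not sample_id:
            continue
        if sample_id in seen[track]:
            # Record the problem and keep going, so that every duplicate
            # in the file is reported in a single validation run.
            errors.append(
                f"line {line_number}: duplicate SAMPLE_ID '{sample_id}' "
                f"in timeline track '{track}'"
            )
            continue
        seen[track].add(sample_id)
    return errors


# Hypothetical timeline content: S1 repeats in Sample Acquisition and
# S2 repeats in Sequencing; S1 appearing once in each track is allowed.
EXAMPLE = (
    "PATIENT_ID\tSTART_DATE\tSTOP_DATE\tEVENT_TYPE\tSAMPLE_ID\n"
    "P1\t0\t\tSample Acquisition\tS1\n"
    "P1\t5\t\tSample Acquisition\tS1\n"
    "P1\t7\t\tSequencing\tS1\n"
    "P1\t9\t\tSequencing\tS2\n"
    "P1\t11\t\tSequencing\tS2\n"
)

for message in find_duplicate_sample_ids(EXAMPLE):
    print(message)
```

Keeping one set of seen IDs per track is what lets the same SAMPLE_ID appear once in Sample Acquisition and once in Sequencing without being flagged.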

Tests

  • Added unit tests covering:
      • Detection of duplicate SAMPLE_IDs in timeline files
      • Successful validation when all SAMPLE_IDs are unique
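
The two cases listed above could be expressed along these lines. This is only a sketch of the test shape: `duplicate_sample_ids` is a hypothetical stand-in that takes `(track, SAMPLE_ID)` pairs, not the validator's real interface, and the third test is an extra case that follows from the per-track scoping described under Changes.

```python
import unittest


def duplicate_sample_ids(rows):
    """Hypothetical stand-in: return (track, SAMPLE_ID) pairs seen more than once."""
    seen, duplicates = set(), []
    for track, sample_id in rows:
        key = (track, sample_id)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


class DuplicateSampleIdTest(unittest.TestCase):
    def test_duplicate_is_reported(self):
        rows = [("Sequencing", "S1"), ("Sequencing", "S1")]
        self.assertEqual(duplicate_sample_ids(rows), [("Sequencing", "S1")])

    def test_unique_ids_pass(self):
        rows = [("Sequencing", "S1"), ("Sequencing", "S2")]
        self.assertEqual(duplicate_sample_ids(rows), [])

    def test_same_id_in_different_tracks_passes(self):
        # The check is scoped per track, so this is not a duplicate.
        rows = [("Sample Acquisition", "S1"), ("Sequencing", "S1")]
        self.assertEqual(duplicate_sample_ids(rows), [])
```

Saved to a file, these can be run with `python -m unittest <file>`.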

Notes

  • Python validator only
  • No changes to import flow or downstream processing
  • No new dependencies introduced
