Add import_phoenix_ast() for BD Phoenix instrument AST data by efosternyarko · Pull Request #68 · AMRverse/AMRgen

efosternyarko · 2026-03-03T17:24:59Z

Summary

This PR adds import_phoenix_ast(), a new import function for antimicrobial susceptibility testing (AST) data exported from BD Phoenix instruments, and registers it in the import_ast() dispatcher as format = "phoenix".

Three BD Phoenix export formats are supported, with automatic detection from file extension and content:

- long_german: XLS export (no header row, 7 fixed columns, German decimal locale). Handles site-specific drug name suffixes ("(f)", "(u)"), testing additives ("mit G6P"), synergy screens ("-Syn"), high-concentration tests ("Hohe X Konzentration"), and DD.MM.YYYY date parsing. "X" SIR values (no breakpoint defined) are treated as NA.
- long_clsi: Per-isolate CLSI instrument report (TXT/TSV) with named column headers: Antimicrobial, MIC or Concentration, Interp, Expert (SIR), Final (SIR). Trailing Resistance Markers and Expert Triggered Rules sections are automatically stripped. Sample ID is derived from the filename.
- wide: Wide-format XLSX with alternating [Drug] call / [Drug] MIC column pairs (e.g. as in Mills et al. 2022, Genome Medicine). Embedded whitespace (tabs) in column names is normalised; Unicode ≤ signs are converted to <=.
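The long_german clean-up steps listed above can be pictured with a small base R sketch. The helper names, regular expression, and sample values below are illustrative assumptions, not the code in this PR; only the behaviours (suffix stripping, "X" to NA, DD.MM.YYYY dates) come from the description.

```r
# Hypothetical helpers, written only to illustrate the described behaviour.

# Strip the site-specific "(f)" / "(u)" suffixes from drug names.
strip_site_suffix <- function(x) {
  sub("\\s*\\((f|u)\\)$", "", trimws(x))
}

# Treat "X" (no breakpoint defined) as a missing SIR call.
x_to_na <- function(sir) {
  sir[sir %in% "X"] <- NA_character_
  sir
}

# Parse DD.MM.YYYY strings into Date objects.
parse_german_date <- function(x) {
  as.Date(x, format = "%d.%m.%Y")
}

strip_site_suffix(c("Ampicillin (f)", "Cefuroxim (u)", "Meropenem"))
x_to_na(c("S", "X", "R"))
parse_german_date("03.03.2026")
```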

A shared internal .clean_mic() helper normalises MIC strings across all three formats (Unicode ≤ → <=, German decimal comma → period, combination drug denominator stripping e.g. >32/2 → >32).
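As a rough illustration of the three normalisation steps just described, here is a minimal base R sketch. The function name `clean_mic_sketch` is hypothetical and the real `.clean_mic()` may be implemented differently.

```r
# Hypothetical stand-in for the internal .clean_mic() helper.
clean_mic_sketch <- function(x) {
  x <- trimws(as.character(x))
  # Unicode "less than or equal" sign -> ASCII "<="
  x <- gsub("\u2264", "<=", x, fixed = TRUE)
  # German decimal comma -> period
  x <- gsub(",", ".", x, fixed = TRUE)
  # Drop the denominator of combination drugs, e.g. ">32/2" -> ">32"
  x <- sub("/.*$", "", x)
  x
}

clean_mic_sketch(c("\u22640,25", ">32/2", "4"))
# "<=0.25" ">32" "4"
```

The cleaned strings would then be suitable input for a stricter MIC parser.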

New function parameters: format, sample_col, instrument_guideline

DESCRIPTION: adds readxl (>= 1.4.0) to Imports.

Test plan

- import_phoenix_ast("Phoenix-Antibiogramm-Daten.xls"): auto-detects long_german, parses German MIC locale, strips drug name suffixes, returns correct pheno_provided from expertized column
- import_phoenix_ast("TF-BDP_CLSI2018.txt", species = "Escherichia coli", instrument_guideline = "CLSI 2018"): auto-detects long_clsi, strips trailing sections, uses Final (SIR) as authoritative call
- import_phoenix_ast("AST_MIC_Mills.xlsx", sample_col = "Sample", species = "Escherichia coli"): auto-detects wide, pivots call/MIC pairs, normalises ≤, strips embedded tabs from column names
- import_ast(input, format = "phoenix") dispatches correctly to import_phoenix_ast()
- interpret_eucast = TRUE and interpret_ecoff = TRUE work correctly for all three sub-formats
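For the long_clsi case in the test plan, the two pre-processing steps (dropping the trailing report sections and deriving the sample ID from the filename) might look roughly like this in base R. The function names and the example report lines are made up for illustration; only the section titles come from the PR description.

```r
# Hypothetical helpers; not the code added in this PR.

# Keep only the lines before the first trailing section heading.
strip_trailing_sections <- function(lines) {
  cut <- grep("^(Resistance Markers|Expert Triggered Rules)", lines)
  if (length(cut) > 0) {
    lines <- lines[seq_len(cut[1] - 1)]
  }
  lines
}

# Use the file name without directory or extension as the sample ID.
sample_id_from_path <- function(path) {
  tools::file_path_sans_ext(basename(path))
}

# Made-up report content for demonstration only.
report <- c(
  "Antimicrobial\tMIC or Concentration\tInterp",
  "Ampicillin\t>16\tR",
  "Resistance Markers",
  "ESBL"
)
strip_trailing_sections(report)
sample_id_from_path("reports/isolate_001.txt")
```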

katholt · 2026-03-04T08:33:22Z

Need to generalise this further, please.
Also, I don't think the supplementary table from Mills 2022 is actually in any format exported from BD Phoenix; it was presumably processed to generate a wide format, so it is not representative of instrument export.

Adds a new import function supporting three BD Phoenix export formats, auto-detected from file extension and content:

- long_german: XLS with no header row, 7 fixed columns, German decimal locale (comma separator). Handles site-specific suffixes (e.g. "(f)"), testing additives ("mit G6P"), synergy screens ("-Syn"), high-concentration tests ("Hohe X Konzentration"), and DD.MM.YYYY dates. "X" SIR values (no breakpoint defined) are treated as NA.
- long_clsi: Per-isolate CLSI report TXT/TSV with named column headers (Antimicrobial, MIC or Concentration, Interp, Expert (SIR), Final (SIR)). Trailing Resistance Markers and Expert Triggered Rules sections are automatically stripped. Sample ID is derived from the filename.
- wide: Wide-format XLSX with alternating [Drug] call / [Drug] MIC column pairs (e.g. Mills et al. 2022). Embedded whitespace in column names (tabs) is normalised. Unicode ≤ signs are converted to <=.

A shared .clean_mic() helper normalises MIC strings across all formats: Unicode ≤ → <=, German decimal comma → period, combination denominator stripping (>32/2 → >32).

New parameters: format ("auto"/"long_german"/"long_clsi"/"wide"), sample_col, instrument_guideline.

Also adds format = "phoenix" dispatch to import_ast(), and adds readxl (>= 1.4.0) to DESCRIPTION Imports.

Replace specific private data filenames in @examples and internal comments with generic placeholder names to avoid exposing private data.

Remove format-specific modes (long_german, long_clsi, wide) and the wide format entirely (not a Phoenix export format). Replace with generic column detection: auto-detects drug/MIC/SIR/sample/species columns by common Phoenix header name patterns, with positional fallback for headerless XLS exports (col 1=sample, 2=species, 3=drug, 4=MIC, 5=instrument SIR, 6=expert SIR). Any column can be overridden by name or index via drug_col, mic_col, sir_col, sample_col, species_col. Drug name normalisation is delegated to as.ab() throughout.
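The lookup order described in this commit (explicit override, then header pattern, then position) can be sketched in base R as follows. The function name and the header patterns are assumptions made for illustration; only the positional order comes from the commit message, and the expert SIR column (position 6) is left out to keep the sketch short.

```r
# Hypothetical sketch of generic column detection; not AMRgen's implementation.
detect_phoenix_cols <- function(df, overrides = list()) {
  # Assumed header patterns, matched case-insensitively.
  patterns <- c(
    sample  = "sample|isolate|accession",
    species = "organism|species",
    drug    = "antimicrobial|drug",
    mic     = "^mic|concentration",
    sir     = "interp|sir"
  )
  # Positional fallback for headerless exports, per the commit message.
  positional <- c(sample = 1L, species = 2L, drug = 3L, mic = 4L, sir = 5L)

  headers <- names(df)
  vapply(names(patterns), function(role) {
    # 1. An explicit override wins, given as a column name or an index.
    ov <- overrides[[role]]
    if (!is.null(ov)) {
      if (is.character(ov)) return(match(ov, headers))
      return(as.integer(ov))
    }
    # 2. Otherwise take the first header that matches the pattern.
    hit <- grep(patterns[[role]], headers, ignore.case = TRUE)
    if (length(hit) > 0) return(hit[1])
    # 3. Otherwise fall back to the fixed position.
    positional[[role]]
  }, integer(1))
}

# Made-up data frame with named headers.
named <- data.frame(a = 1, b = 2, c = 3, d = 4, e = 5)
names(named) <- c("Sample ID", "Organism", "Antimicrobial",
                  "MIC or Concentration", "Interp")
detect_phoenix_cols(named)

# Headerless input gets default names (V1..V7), so positions are used.
headerless <- as.data.frame(matrix(NA, nrow = 1, ncol = 7))
detect_phoenix_cols(headerless, overrides = list(sir = 6))
```

Anchoring the MIC pattern with `^mic` avoids an accidental match on "Antimicrobial", which also contains "mic".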

WHONET column values can contain raw measurements (MIC strings or zone sizes) as well as SIR letters. Parse sir_value with as.mic() for broth dilution/Etest columns and with as.disk() for disk diffusion columns, and include mic/disk in the relocate output order. All other import functions already apply the full set of AMR classes (as.ab, as.mic, as.disk, as.sir, as.mo).

efosternyarko added 4 commits March 4, 2026 09:31

Remove private data filenames from import_phoenix_ast examples

c9ebdd0

efosternyarko force-pushed the feature/import-phoenix-ast branch from 20b5796 to d5e9bfb Compare March 4, 2026 09:53

efosternyarko and others added 2 commits March 4, 2026 10:08

Replace non-ASCII characters with ASCII equivalents in import_pheno.R

467478b

created docs

779ba66

katholt merged commit c7249ae into AMRverse:main Mar 6, 2026
10 checks passed

katholt merged 6 commits into AMRverse:main from efosternyarko:feature/import-phoenix-ast
