Add Spatial Audio detection (MPEG-H MHM1/MHA1, Dolby Atmos, DTS:X, 360 Reality Audio, ASAF, APAC, Eclipsa) #131

@Salem874

Summary

Add identification and classification of Spatial Audio formats so files can be sorted/organized by spatial type. Detection must be safe (no false positives on multichannel surround / binaural / mislabeled files) and accurate (use the most authoritative signal available per format, not just metadata strings).

Architectural note — primary work lives upstream. The detection enums and logic belong in meedya-codecs (MeedyaSuite-core), not duplicated in MeedyaManager. meedya-codecs/src/spatial.rs already defines SpatialAudioFormat with 9 variants. This issue tracks (a) MM-side consumption: classify/rule-engine/UI integration, and (b) gaps in the upstream enum that need separate upstream PRs. See parent epic MWBMPartners/MeedyaSuite-core#9.

Formats in scope

10 spatial audio formats are in scope. The "Upstream status" column indicates whether the upstream enum (meedya-codecs::spatial::SpatialAudioFormat) already covers the format.

MPEG-H 3D Audio family (ISO/IEC 23008-3)

| Format | Sample-entry four-CC | Container | Upstream status |
|---|---|---|---|
| MPEG-H 3D Audio MHM1 | mhm1 | MP4 (broadcast/streaming, multi-frame) | Coarse: covered by MpegH3d. Sub-variant needs upstream PR. |
| MPEG-H 3D Audio MHA1 | mha1 | MP4 (file storage, single-frame) | Coarse: covered by MpegH3d. Sub-variant needs upstream PR. |
| Sony 360 Reality Audio | mha1 or mhm1 (sub-profile) | MP4 | Covered by Sony360Ra. MUST NOT double-count with MpegH3d. |

Dolby family

| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| Dolby Atmos via E-AC-3 JOC | ec-3 (with dec3) | MP4/MKV | joc_flag bit in EC3SpecificBox | Covered by DolbyAtmos |
| Dolby Atmos via TrueHD | mlpa | MP4/MKV | TrueHD substream Atmos object metadata | Covered by DolbyAtmos |
| Dolby Atmos via AC-4 | ac-4 | MP4/DASH | AC-4 immersive_audio_presentation tag | Covered by DolbyAtmos |

DTS family

| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| DTS:X | dtsh | MP4/MKV/raw DTS | DTS-X extension substream type byte | Covered by DtsX |
| IMAX Enhanced DTS:X | dtsh (with IMAX signaling) | MP4/MKV | DTS-X substream + IMAX Enhanced metadata block | Covered by DtsXImax |

Apple

| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| APAC (Apple Positional Audio Codec) | apac (TBD, verify) | MP4 (iOS 17+ / AirPods Pro 2) | Codec four-CC match in stsd. New, sparse public docs. | NOT IN UPSTREAM: needs new variant on SpatialAudioFormat |
| ASAF (Apple Spatial Audio Format) | TBD, likely mha1-with-Apple-profile OR distinct atom | MP4 | Apple-specific udta / iTunMOVI / extended descriptor atoms over MPEG-H base | AMBIGUOUS: Apple Music "Spatial Audio" is generally Dolby Atmos via ec-3+JOC. ASAF as a distinct format needs Apple-docs verification. Upstream has AppleSpatial (binaural rendering hint); may NOT be the same thing as ASAF. |

Channel-based immersive

| Format | Common containers | Authoritative signal | Upstream status |
|---|---|---|---|
| Auro-3D | DTS-HD MA with Auro extension OR PCM with 9.1/11.1/13.1 layouts in MKV/MP4 | Auro-3D extension substream chunk in DTS-HD MA. A WAVEFORMATEXTENSIBLE channel mask matching an Auro layout is corroborating evidence only and is not sufficient on its own (see Safety rule 1). | Covered by Auro3d |

Open / Cross-vendor

| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| Eclipsa Audio (Samsung/Google IAMF) | iamf | MP4 (iamd atom) OR standalone .iamf | iamd IAMF descriptor box presence | Covered by Iamf (Eclipsa is Samsung's branding for IAMF) |

Scene-based

| Format | Containers | Upstream status |
|---|---|---|
| Ambisonics (FOA, HOA) | OGG, FLAC, MP4 with periphonic channel layouts | Covered by Ambisonics |

Detection ladder — apply in order, take the FIRST authoritative match

Per-file detection should use the most authoritative available signal and ignore weaker signals once a strong one matches:

  1. Bitstream-level signals (most authoritative):
    • EC3SpecificBox::joc_flag → Dolby Atmos via E-AC-3 JOC
    • TrueHD substream Atmos metadata → Dolby Atmos via TrueHD
    • AC-4 presentation tag → Dolby Atmos via AC-4
    • DTS extension substream tag → DTS:X; with the IMAX Enhanced metadata block also present → IMAX Enhanced DTS:X
    • DTS-HD MA Auro extension substream → Auro-3D
    • MPEG-H mhaC / mhmC config profile/level → MPEG-H + 360RA variant
  2. Container atoms (highly authoritative):
    • iamd → Eclipsa / IAMF
    • mha1 / mhm1 sample entry → MPEG-H family
    • apac sample entry → APAC
  3. Codec ID strings from MediaInfo / ffprobe — useful for confirmation, NOT primary signal.
  4. Channel layout analysis — supporting evidence only. Height-channel presence (Tfl, Tfr, Tfc, Tbl, Tbr, Bfc, etc.) corroborates an immersive format match but is NOT sufficient on its own — a discrete 7.1.4 PCM mix has height channels but is not encoded spatial audio.
  5. Encoder metadata string (©too, Encoder, Encoded_Application): a tertiary signal, used only to help distinguish Sony 360 Reality Audio from generic MPEG-H, IMAX Enhanced DTS:X from generic DTS:X, and Auro-3D from generic DTS-HD MA.
  6. User-defined tags — display-time hint only, NEVER use to override authoritative bitstream signals.
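The ladder above can be sketched as an ordered chain of optional signals where the first populated rung wins. This is a minimal illustration, not the upstream implementation: the `Signals` struct, its field names, and `first_authoritative` are hypothetical, and the enum is a local stand-in listing only the upstream variants named in this issue.

```rust
/// Local stand-in for the upstream `SpatialAudioFormat` (variants as named in this issue).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat {
    MpegH3d, Sony360Ra, DolbyAtmos, DtsX, DtsXImax,
    AppleSpatial, Auro3d, Iamf, Ambisonics,
}

/// Hypothetical bag of signals already extracted from one file.
#[derive(Debug, Default)]
pub struct Signals {
    pub bitstream: Option<SpatialAudioFormat>,      // rung 1
    pub container_atom: Option<SpatialAudioFormat>, // rung 2
    pub codec_id: Option<SpatialAudioFormat>,       // rung 3
    pub has_height_channels: bool,                  // rung 4: corroboration only
    pub user_tag: Option<SpatialAudioFormat>,       // rung 6: hint only
}

/// Returns the format from the strongest populated rung, plus the rung number.
/// Rungs 4 and 5 never produce a match by themselves, so they are not consulted here.
pub fn first_authoritative(s: &Signals) -> Option<(SpatialAudioFormat, u8)> {
    s.bitstream
        .map(|f| (f, 1))
        .or(s.container_atom.map(|f| (f, 2)))
        .or(s.codec_id.map(|f| (f, 3)))
        .or(s.user_tag.map(|f| (f, 6)))
}
```

Because channel layout is not a field the function reads, a file whose only evidence is height channels yields `None`, which is the behaviour the 7.1.4 PCM case requires.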

Safety — false positives we must NOT produce

  1. Multichannel surround ≠ spatial. 5.1 / 7.1 / 9.1 / 11.1 PCM/AAC/AC3 files have many channels but are NOT spatial unless an explicit spatial codec or bitstream flag is present. Auro-3D detection MUST require the Auro extension substream, not just an Auro-shaped channel layout.
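  2. Binaural stereo ≠ spatial. A two-channel binaural render carries no spatial codec or bitstream flag, so it classifies as ordinary stereo.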
  3. Sony 360 RA must NOT also flag as MpegH3d. If 360RA markers match, emit 360RA and suppress the generic MPEG-H label.
  4. IMAX Enhanced DTS:X must NOT also flag as plain DTS:X. Same suppression rule.
  5. Apple Music "Spatial Audio" file ≠ ASAF unless ASAF is confirmed distinct (open question — see Open Questions).
  6. Trust the bitstream. Metadata claims of Atmos/DTS:X/Auro that lack the bitstream marker are mastering-pipeline artefacts — classify as non-spatial.
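Rules 3 and 4 amount to returning exactly one label per stream, with the specific variant replacing the generic one. A minimal sketch, assuming the marker checks have already been reduced to booleans (the function name and both parameters are hypothetical; the enum is a local stand-in for the upstream type):

```rust
/// Local stand-in for the relevant upstream `SpatialAudioFormat` variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat {
    MpegH3d,
    Sony360Ra,
    DolbyAtmos,
    DtsX,
    DtsXImax,
}

/// Upgrades a generic label to its specific variant when the markers match.
/// A single value is returned, so the generic and specific labels can never
/// both be emitted for the same stream.
pub fn resolve_specific(
    base: SpatialAudioFormat,
    has_360ra_markers: bool, // hypothetical: result of the 360RA marker check
    has_imax_markers: bool,  // hypothetical: result of the IMAX Enhanced check
) -> SpatialAudioFormat {
    match (base, has_360ra_markers, has_imax_markers) {
        (SpatialAudioFormat::MpegH3d, true, _) => SpatialAudioFormat::Sony360Ra,
        (SpatialAudioFormat::DtsX, _, true) => SpatialAudioFormat::DtsXImax,
        (other, _, _) => other,
    }
}
```

Markers that do not belong to the base family are ignored, so a stray IMAX flag on an MPEG-H stream leaves the label unchanged.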

Accuracy — confidence levels to expose

For UI / rule-engine use, attach a confidence level per detection:

  • Confident — bitstream-level signal matched (Detection ladder rung 1)
  • Likely — container atom matched but bitstream not parsed (rung 2)
  • Possible — codec ID string from MediaInfo only (rung 3)
  • Hint — user metadata only (rung 6) — display as informational, don't route in rules
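One possible shape for the confidence type, sketched under the assumption that it is a plain enum ordered from weakest to strongest. The `from_rung` and `routable` helpers are hypothetical; the issue defines no level for rungs 4 and 5, so the sketch returns `None` for them instead of guessing.

```rust
/// Declared weakest-first so the derived `Ord` ranks Hint < Possible < Likely < Confident.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SpatialConfidence {
    Hint,
    Possible,
    Likely,
    Confident,
}

impl SpatialConfidence {
    /// Maps a detection-ladder rung to a confidence level.
    pub fn from_rung(rung: u8) -> Option<Self> {
        match rung {
            1 => Some(Self::Confident),
            2 => Some(Self::Likely),
            3 => Some(Self::Possible),
            6 => Some(Self::Hint),
            _ => None, // rungs 4 and 5: supporting evidence, no level defined here
        }
    }

    /// Hints are display-only; everything stronger may be routed on by rules.
    pub fn routable(self) -> bool {
        self > Self::Hint
    }
}
```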

Proposed Changes

1. Use upstream SpatialAudioFormat directly (no MM-local enum)

// in mm-core: import, don't redeclare
// (path follows the crate layout described above: meedya-codecs/src/spatial.rs)
use meedya_codecs::spatial::SpatialAudioFormat;

Per the parent epic MeedyaSuite-core#9, detection logic lives upstream; MM only consumes it.

2. Replace MM-local single Spatial MediaQuality variant with per-format granularity

pub enum MediaQuality {
    // … existing variants …
    Spatial(SpatialAudioFormat),
    SpatialLossless(SpatialAudioFormat),
}

pub struct SpatialDetection {
    pub format: SpatialAudioFormat,
    pub confidence: SpatialConfidence,    // Confident / Likely / Possible / Hint
    pub source_signal: &'static str,      // "joc_flag", "mhaC", "iamd", "auro_substream", "encoder_string" …
}
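How a `SpatialDetection` might select between the two new `MediaQuality` variants is sketched below. The issue does not specify this mapping, so it is an assumption: the `lossless_carrier` flag and the `spatial_quality` function are hypothetical, hint-only detections are treated as non-routing, and the enums are reduced local stand-ins.

```rust
/// Reduced stand-ins for the types described in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat { DolbyAtmos, DtsX }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialConfidence { Confident, Likely, Possible, Hint }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpatialDetection {
    pub format: SpatialAudioFormat,
    pub confidence: SpatialConfidence,
    pub source_signal: &'static str,
}

/// Only the two spatial variants; the existing variants are omitted here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaQuality {
    Spatial(SpatialAudioFormat),
    SpatialLossless(SpatialAudioFormat),
}

/// Returns a spatial quality only for detections stronger than a hint.
/// `lossless_carrier` is a hypothetical flag supplied by the caller.
pub fn spatial_quality(
    det: Option<&SpatialDetection>,
    lossless_carrier: bool,
) -> Option<MediaQuality> {
    let det = det?;
    if det.confidence == SpatialConfidence::Hint {
        return None; // display-only, must not drive classification
    }
    Some(if lossless_carrier {
        MediaQuality::SpatialLossless(det.format)
    } else {
        MediaQuality::Spatial(det.format)
    })
}
```

Returning `Option` leaves the caller free to fall back to whichever existing non-spatial variant applies.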

3. AudioProperties carries the detection

Add to AudioProperties (or a new sibling struct in mm-core::metadata):

  • pub spatial: Option<SpatialDetection>
  • Detection runs as the FINAL step of audio-property extraction.

4. Rule engine + filetype registry

  • spatial_format rule variable (a string compared against SpatialAudioFormat::to_string()).
  • is_spatial boolean shortcut.
  • Filetype registry: spatial_capable: true flag with a per-container format list.
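The two rule variables could be derived from `AudioProperties::spatial` roughly as follows. This is a sketch under stated assumptions: the `Display` output of the upstream enum is not specified in this issue, so the strings below are placeholders; returning an empty string for "no routable detection" and excluding hint-level detections are also assumptions, the latter following the "don't route in rules" note on the Hint level.

```rust
use std::fmt;

/// Reduced stand-in for the upstream enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat { DolbyAtmos, Iamf }

impl fmt::Display for SpatialAudioFormat {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Placeholder strings: the real upstream Display output may differ.
        f.write_str(match self {
            Self::DolbyAtmos => "DolbyAtmos",
            Self::Iamf => "Iamf",
        })
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialConfidence { Confident, Likely, Possible, Hint }

#[derive(Debug, Clone)]
pub struct SpatialDetection {
    pub format: SpatialAudioFormat,
    pub confidence: SpatialConfidence,
}

#[derive(Debug, Default)]
pub struct AudioProperties {
    pub spatial: Option<SpatialDetection>,
}

impl AudioProperties {
    /// Detections that rules may act on (everything except hints).
    fn routable(&self) -> Option<&SpatialDetection> {
        self.spatial
            .as_ref()
            .filter(|d| d.confidence != SpatialConfidence::Hint)
    }

    /// Backs the `is_spatial` rule variable.
    pub fn is_spatial(&self) -> bool {
        self.routable().is_some()
    }

    /// Backs the `spatial_format` rule variable; empty when nothing routable.
    pub fn spatial_format(&self) -> String {
        self.routable()
            .map(|d| d.format.to_string())
            .unwrap_or_default()
    }
}
```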

5. Test fixtures

For each format covered upstream, add an integration test asserting that MM classifies the file with the right SpatialAudioFormat. Include a negative-control 7.1.4 PCM file, which MUST NOT classify as spatial.
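The negative control can be expressed without real fixtures by modelling a file as a codec-level signal plus a channel layout. The sketch below is illustrative only: `classify`, `has_height_channels`, and the label list are hypothetical, and a real integration test would run against sample files instead.

```rust
/// Reduced stand-in for the upstream enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat { DolbyAtmos, Auro3d }

/// A subset of the height-channel labels mentioned in the detection ladder.
const HEIGHT_LABELS: [&str; 5] = ["Tfl", "Tfr", "Tfc", "Tbl", "Tbr"];

pub fn has_height_channels(layout: &[&str]) -> bool {
    layout.iter().any(|ch| HEIGHT_LABELS.contains(ch))
}

/// Returns the format and whether the channel layout corroborates it.
/// The layout alone can never create a match: with no codec-level signal
/// the result is `None`, however many height channels are present.
pub fn classify(
    codec_signal: Option<SpatialAudioFormat>,
    layout: &[&str],
) -> Option<(SpatialAudioFormat, bool)> {
    codec_signal.map(|format| (format, has_height_channels(layout)))
}
```

A 7.1.4 PCM layout passed with `codec_signal = None` therefore comes back as `None`, while the same layout alongside a real Atmos signal comes back as a corroborated match.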

Upstream gaps requiring separate upstream PRs

These items are missing from meedya-codecs::spatial::SpatialAudioFormat and need PRs against MeedyaSuite-core before MM can route on them:

  1. MPEG-H sub-variants: add MpegH3dMhm1 and MpegH3dMha1 (or keep generic MpegH3d with the sub-variant carried in SpatialDetection::source_signal). Filed as upstream issue.
  2. APAC — Apple Positional Audio Codec — add Apac variant. Filed as upstream issue.
  3. ASAF clarification — confirm whether AppleSpatial covers what the user means by ASAF, or add distinct AppleAsaf variant. Filed as upstream issue.

Open questions / blockers

  1. ASAF clarification — is ASAF a distinct codec/container atom or marketing for Apple's Atmos-via-EAC3-JOC delivery? Cross-check with Apple developer docs, AVAudioFormat, ProRes RAW Audio container spec.
  2. APAC four-CC verification — confirm apac against actual sample files from AirPods Pro 2 / tvOS 17+. The brand is Apple Positional Audio Codec; the four-CC may not be apac.
  3. DTS:X variant granularity: DTS:X 1, DTS:X 2 / Pro, and DTS-UHD all currently map to DtsX (or DtsXImax), with the sub-variant carried in source_signal. Confirm with the stakeholder that this is the right level.
  4. IAMF / Eclipsa licensing — IAMF is AOM-licensed (royalty-free). Any decoder dependency must be confirmed compatible with GPL-2.0-or-later.

Affected Files (MM side only — upstream changes tracked in MeedyaSuite-core#9 and the per-gap upstream issues)

  • crates/mm-core/src/classify/mod.rs: MediaQuality enum + per-format classification
  • crates/mm-core/src/classify/spatial.rs (new): detection ladder driving SpatialAudioFormat (consuming upstream)
  • crates/mm-core/src/metadata/mod.rs: AudioProperties::spatial: Option<SpatialDetection>
  • crates/mm-core/src/filetype_registry.rs: spatial capability flag per container
  • crates/mm-core/src/rule_engine/functions.rs: spatial_format / is_spatial variables
  • crates/mm-core/config/filetypes.json5: format registry entries
  • No MM-side codec parsing: upstream's meedya-codecs does the heavy lifting via MediaInfo/ffprobe per MeedyaSuite-core#9

Labels

Enhancement, M5-Metadata-Providers, mm-core, needs-research (ASAF / APAC), cross-repo (upstream-blocked)
