Summary
Add identification and classification of Spatial Audio formats so files can be sorted/organized by spatial type. Detection must be safe (no false positives on multichannel surround / binaural / mislabeled files) and accurate (use the most authoritative signal available per format, not just metadata strings).
Architectural note — primary work lives upstream. The detection enums and logic belong in meedya-codecs (MeedyaSuite-core), not duplicated in MeedyaManager. meedya-codecs/src/spatial.rs already defines SpatialAudioFormat with 9 variants. This issue tracks (a) MM-side consumption: classify/rule-engine/UI integration, and (b) gaps in the upstream enum that need separate upstream PRs. See parent epic MWBMPartners/MeedyaSuite-core#9.
Formats in scope
10 spatial audio formats. Status column indicates whether SpatialAudioFormat already covers the format upstream (meedya-codecs::spatial::SpatialAudioFormat).
MPEG-H 3D Audio family (ISO/IEC 23008-3)
| Format | Sample-entry four-CC | Container | Upstream status |
|---|---|---|---|
| MPEG-H 3D Audio MHM1 | mhm1 | MP4 (broadcast/streaming, multi-frame) | Coarse: covered by MpegH3d. Sub-variant needs upstream PR. |
| MPEG-H 3D Audio MHA1 | mha1 | MP4 (file storage, single-frame) | Coarse: covered by MpegH3d. Sub-variant needs upstream PR. |
| Sony 360 Reality Audio | mha1 or mhm1 (sub-profile) | MP4 | Covered by Sony360Ra. MUST NOT double-count with MpegH3d. |
Dolby family
| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| Dolby Atmos via E-AC-3 JOC | ec-3 (with dec3) | MP4/MKV | joc_flag bit in EC3SpecificBox | Covered by DolbyAtmos |
| Dolby Atmos via TrueHD | mlpa | MP4/MKV | TrueHD substream Atmos object metadata | Covered by DolbyAtmos |
| Dolby Atmos via AC-4 | ac-4 | MP4/DASH | AC-4 immersive_audio_presentation tag | Covered by DolbyAtmos |
DTS family
| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| DTS:X | dtsh | MP4/MKV/raw DTS | DTS-X extension substream type byte | Covered by DtsX |
| IMAX Enhanced DTS:X | dtsh (with IMAX signaling) | MP4/MKV | DTS-X substream + IMAX Enhanced metadata block | Covered by DtsXImax |
Apple
| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| APAC (Apple Positional Audio Codec) | apac (TBD, verify) | MP4 (iOS 17+ / AirPods Pro 2) | Codec four-CC match in stsd. New, sparse public docs. | NOT IN UPSTREAM: needs new variant on SpatialAudioFormat |
| ASAF (Apple Spatial Audio Format) | TBD: likely mha1-with-Apple-profile OR distinct atom | MP4 | Apple-specific udta / iTunMOVI / extended descriptor atoms over MPEG-H base | AMBIGUOUS: Apple Music "Spatial Audio" is generally Dolby Atmos via ec-3+JOC. ASAF as a distinct format needs Apple-docs verification. Upstream has AppleSpatial (binaural rendering hint); may NOT be the same thing as ASAF. |
Channel-based immersive
| Format | Common containers | Authoritative signal | Upstream status |
|---|---|---|---|
| Auro-3D | DTS-HD MA with Auro extension OR PCM with 9.1/11.1/13.1 layouts in MKV/MP4 | Auro-3D extension substream chunk in DTS-HD MA OR explicit channel-mask matching Auro layouts in WAVEFORMATEXTENSIBLE | Covered by Auro3d |
Open / Cross-vendor
| Format | Sample-entry four-CC | Container | Authoritative signal | Upstream status |
|---|---|---|---|---|
| Eclipsa Audio (Samsung/Google IAMF) | iamf | MP4 (iamd atom) OR standalone .iamf | iamd IAMF descriptor box presence | Covered by Iamf (Eclipsa is Samsung's branding for IAMF) |
Scene-based
| Format | Containers | Upstream status |
|---|---|---|
| Ambisonics (FOA, HOA) | OGG, FLAC, MP4 with periphonic channel layouts | Covered by Ambisonics |
Detection ladder — apply in order, take the FIRST authoritative match
Per-file detection should use the most authoritative available signal and ignore weaker signals once a strong one matches:
1. Bitstream-level signals (most authoritative):
   - EC3SpecificBox::joc_flag → Dolby Atmos via E-AC-3
   - TrueHD substream Atmos metadata → Dolby Atmos via TrueHD
   - AC-4 presentation tag → Dolby Atmos via AC-4
   - DTS extension substream tag → DTS:X; if IMAX Enhanced metadata is also present → IMAX Enhanced DTS:X
   - DTS-HD MA Auro extension substream → Auro-3D
   - MPEG-H mhaC / mhmC config profile/level → MPEG-H, or the 360RA variant
2. Container atoms (highly authoritative):
   - iamd → Eclipsa / IAMF
   - mha1 / mhm1 sample entry → MPEG-H family
   - apac sample entry → APAC
3. Codec ID strings from MediaInfo / ffprobe: useful for confirmation, NOT a primary signal.
4. Channel layout analysis: supporting evidence only. Height-channel presence (Tfl, Tfr, Tfc, Tbl, Tbr, Bfc, etc.) corroborates an immersive format match but is NOT sufficient on its own; a discrete 7.1.4 PCM mix has height channels but is not encoded spatial audio.
5. Encoder metadata string (©too, Encoder, Encoded_Application): tertiary signal for distinguishing Sony 360 Reality Audio from generic MPEG-H, IMAX Enhanced DTS:X from generic DTS:X, and Auro-3D from generic DTS-HD MA.
6. User-defined tags: display-time hint only. NEVER use them to override authoritative bitstream signals.
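The first-match rule of the ladder can be sketched as below. This is a minimal illustration only: `Evidence`, `rung`, and `first_match` are hypothetical names, and the local `SpatialAudioFormat` is a stand-in for the upstream enum with a reduced variant list. It assumes the evidence has already been extracted elsewhere, and that channel layout (rung 4) never names a format by itself.

```rust
/// Local stand-in for the upstream enum (assumption: reduced variant list).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat {
    DolbyAtmos,
    DtsX,
    MpegH3d,
    Iamf,
}

/// Hypothetical evidence collected for one file, one variant per ladder rung.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Evidence {
    Bitstream(SpatialAudioFormat),     // rung 1, e.g. joc_flag set
    ContainerAtom(SpatialAudioFormat), // rung 2, e.g. mha1 sample entry
    CodecIdString(SpatialAudioFormat), // rung 3, MediaInfo / ffprobe string
    HeightChannels,                    // rung 4, supporting evidence only
    EncoderString(SpatialAudioFormat), // rung 5, encoder metadata
    UserTag(SpatialAudioFormat),       // rung 6, display-time hint only
}

impl Evidence {
    /// Ladder rung, where 1 is the most authoritative.
    pub fn rung(self) -> u8 {
        match self {
            Evidence::Bitstream(_) => 1,
            Evidence::ContainerAtom(_) => 2,
            Evidence::CodecIdString(_) => 3,
            Evidence::HeightChannels => 4,
            Evidence::EncoderString(_) => 5,
            Evidence::UserTag(_) => 6,
        }
    }

    /// Format asserted by this evidence; a channel layout asserts none.
    pub fn format(self) -> Option<SpatialAudioFormat> {
        match self {
            Evidence::Bitstream(f)
            | Evidence::ContainerAtom(f)
            | Evidence::CodecIdString(f)
            | Evidence::EncoderString(f)
            | Evidence::UserTag(f) => Some(f),
            Evidence::HeightChannels => None,
        }
    }
}

/// Picks the format backed by the lowest-numbered rung and ignores the rest.
pub fn first_match(evidence: &[Evidence]) -> Option<(SpatialAudioFormat, u8)> {
    evidence
        .iter()
        .filter_map(|e| e.format().map(|f| (f, e.rung())))
        .min_by_key(|&(_, rung)| rung)
}
```

With this shape, a file whose user tag claims DTS:X but whose bitstream carries the Atmos flag resolves to Dolby Atmos at rung 1, and a file with only height channels resolves to no detection at all.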
Safety — false positives we must NOT produce
- Multichannel surround ≠ spatial. 5.1 / 7.1 / 9.1 / 11.1 PCM/AAC/AC3 files have many channels but are NOT spatial unless an explicit spatial codec or bitstream flag is present. Auro-3D detection MUST require the Auro extension substream, not just an Auro-shaped channel layout.
- Binaural stereo ≠ spatial.
- Sony 360 RA must NOT also flag as MpegH3d. If 360RA markers match, emit 360RA and suppress the generic MPEG-H label.
- IMAX Enhanced DTS:X must NOT also flag as plain DTS:X. Same suppression rule.
- Apple Music "Spatial Audio" file ≠ ASAF unless ASAF is confirmed distinct (open question — see Open Questions).
- Trust the bitstream. Metadata claims of Atmos/DTS:X/Auro that lack the bitstream marker are mastering-pipeline artefacts — classify as non-spatial.
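The two suppression rules above (360RA over generic MPEG-H, IMAX Enhanced over plain DTS:X) could be expressed as a small post-processing step. The sketch below is illustrative only: `supersedes` and `suppress_generic` are hypothetical helpers, and the enum is a local stand-in for the upstream type.

```rust
/// Local stand-in for the upstream enum (assumption: reduced variant list).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat {
    MpegH3d,
    Sony360Ra,
    DtsX,
    DtsXImax,
    DolbyAtmos,
}

impl SpatialAudioFormat {
    /// Generic label that this more specific variant replaces, if any.
    fn supersedes(self) -> Option<SpatialAudioFormat> {
        match self {
            SpatialAudioFormat::Sony360Ra => Some(SpatialAudioFormat::MpegH3d),
            SpatialAudioFormat::DtsXImax => Some(SpatialAudioFormat::DtsX),
            _ => None,
        }
    }
}

/// Drops every generic label whose specific variant is also present.
pub fn suppress_generic(candidates: &[SpatialAudioFormat]) -> Vec<SpatialAudioFormat> {
    // Collect the generic labels that must not be emitted.
    let suppressed: Vec<SpatialAudioFormat> =
        candidates.iter().filter_map(|f| f.supersedes()).collect();
    // Keep the remaining candidates in their original order.
    candidates
        .iter()
        .copied()
        .filter(|f| !suppressed.contains(f))
        .collect()
}
```

A plain DTS:X file passes through unchanged, since nothing in the candidate list supersedes it.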
Accuracy — confidence levels to expose
For UI / rule-engine use, attach a confidence level per detection:
- Confident: bitstream-level signal matched (detection ladder rung 1)
- Likely: container atom matched but bitstream not parsed (rung 2)
- Possible: codec ID string from MediaInfo only (rung 3)
- Hint: user metadata only (rung 6). Display as informational; don't route in rules.
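One way to tie the confidence levels to ladder rungs is sketched below. The variant names come from this issue; `from_rung` and `routable` are hypothetical helpers. Rungs 4 and 5 have no confidence level assigned above, so the sketch returns `None` for them rather than guessing.

```rust
/// Confidence attached to a detection, as listed in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialConfidence {
    Confident,
    Likely,
    Possible,
    Hint,
}

impl SpatialConfidence {
    /// Maps a detection-ladder rung to a confidence level.
    /// Rungs 4 and 5 (and anything out of range) are left unmapped.
    pub fn from_rung(rung: u8) -> Option<Self> {
        match rung {
            1 => Some(Self::Confident),
            2 => Some(Self::Likely),
            3 => Some(Self::Possible),
            6 => Some(Self::Hint),
            _ => None,
        }
    }

    /// Hint-level detections are shown in the UI but never drive rules.
    pub fn routable(self) -> bool {
        !matches!(self, Self::Hint)
    }
}
```

Keeping `routable` on the enum means the rule engine does not need to know which rung produced a detection, only whether it may act on it.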
Proposed Changes
1. Use upstream SpatialAudioFormat directly (no MM-local enum)
```rust
// in mm-core: import, don't redeclare
use meedya_core::codecs::spatial::SpatialAudioFormat;
```
Per MeedyaSuite-core#9 parent epic, detection logic lives upstream. MM consumes.
2. Replace MM-local single Spatial MediaQuality variant with per-format granularity
```rust
pub enum MediaQuality {
    // … existing variants …
    Spatial(SpatialAudioFormat),
    SpatialLossless(SpatialAudioFormat),
}

pub struct SpatialDetection {
    pub format: SpatialAudioFormat,
    pub confidence: SpatialConfidence, // Confident / Likely / Possible / Hint
    pub source_signal: &'static str,   // "joc_flag", "mhaC", "iamd", "auro_substream", "encoder_string" …
}
```
3. AudioProperties carries the detection
Add to AudioProperties (or a new sibling struct in mm-core::metadata):
pub spatial: Option<SpatialDetection>
- Detection runs as the FINAL step of audio-property extraction.
4. Rule engine + filetype registry
- Rule engine:
  - spatial_format rule variable (string compared against SpatialAudioFormat::to_string()).
  - is_spatial boolean shortcut.
- Filetype registry:
  - spatial_capable: true flag with a per-container format list.
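The two rule variables could be derived from the optional detection roughly as follows. This is a sketch under assumptions: the enum is a local stand-in, the `Display` strings shown here are placeholders for whatever upstream `to_string()` actually yields, and the free functions `spatial_format` and `is_spatial` are hypothetical shapes for the rule-engine hooks.

```rust
use std::fmt;

/// Local stand-in for the upstream enum (assumption: reduced variant list).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpatialAudioFormat {
    DolbyAtmos,
    DtsX,
    Iamf,
}

impl fmt::Display for SpatialAudioFormat {
    // Placeholder strings; the real ones are defined upstream.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Self::DolbyAtmos => "DolbyAtmos",
            Self::DtsX => "DtsX",
            Self::Iamf => "Iamf",
        };
        f.write_str(name)
    }
}

/// Value of the `spatial_format` rule variable; empty when nothing was detected.
pub fn spatial_format(spatial: Option<SpatialAudioFormat>) -> String {
    spatial.map(|f| f.to_string()).unwrap_or_default()
}

/// Value of the `is_spatial` boolean shortcut.
pub fn is_spatial(spatial: Option<SpatialAudioFormat>) -> bool {
    spatial.is_some()
}
```

Whether `is_spatial` should also exclude Hint-level detections is left open here; the sketch treats any detection as spatial.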
5. Test fixtures
For each format covered upstream, an integration test asserting MM classifies the file with the right SpatialAudioFormat. Includes negative-control 7.1.4 PCM file (MUST NOT classify as spatial).
Upstream gaps requiring separate upstream PRs
These items are missing from meedya-codecs::spatial::SpatialAudioFormat and need PRs against MeedyaSuite-core before MM can route on them:
- MPEG-H sub-variants: MpegH3dMhm1, MpegH3dMha1 (or keep generic MpegH3d with a sub-variant carried in SpatialDetection::source_signal). Filed as upstream issue.
- APAC (Apple Positional Audio Codec): add Apac variant. Filed as upstream issue.
- ASAF clarification: confirm whether AppleSpatial covers what the user means by ASAF, or add a distinct AppleAsaf variant. Filed as upstream issue.
Open questions / blockers
- ASAF clarification: is ASAF a distinct codec/container atom, or marketing for Apple's Atmos-via-EAC3-JOC delivery? Cross-check with Apple developer docs, AVAudioFormat, ProRes RAW Audio container spec.
- APAC four-CC verification: confirm apac against actual sample files from AirPods Pro 2 / tvOS 17+. The brand is Apple Positional Audio Codec; the four-CC may not be apac.
- DTS:X variant granularity: DTS:X 1, DTS:X 2 / Pro, and DTS-UHD all map to DtsX (or DtsXImax) currently. Sub-variant is carried in source_signal. Confirm with stakeholder that this is the right level.
- IAMF / Eclipsa licensing: IAMF is AOM-licensed (royalty-free). Any decoder dependency must be confirmed compatible with GPL-2.0-or-later.
Affected Files (MM side only — upstream changes tracked in MeedyaSuite-core#9 and the per-gap upstream issues)
- crates/mm-core/src/classify/mod.rs: MediaQuality enum + per-format classification
- crates/mm-core/src/classify/spatial.rs (new): detection ladder driving SpatialAudioFormat (consuming upstream)
- crates/mm-core/src/metadata/mod.rs: AudioProperties::spatial: Option<SpatialDetection>
- crates/mm-core/src/filetype_registry.rs: spatial capability flag per container
- crates/mm-core/src/rule_engine/functions.rs: spatial_format / is_spatial variables
- crates/mm-core/config/filetypes.json5: format registry entries
- No MM-side codec parsing: upstream's meedya-codecs does the heavy lifting via MediaInfo/ffprobe per MeedyaSuite-core#9
Labels
Enhancement, M5-Metadata-Providers, mm-core, needs-research (ASAF / APAC), cross-repo (upstream-blocked)