[Question] DOCA Flow CT IPv6 on standalone ConnectX-7: format_select_dw_8_6_ext gating mechanism and roadmap? #12

@maxpain

Summary

The DOCA Flow Connection Tracking (CT) pipe with an IPv6 5-tuple fails on every standalone ConnectX-7 SKU we tested. The firmware reports the underlying capability HCA_CAP_GENERAL_2.format_select_dw_8_6_ext (also known as full_dw_jumbo_support in the DPDK mlx5 driver) as 0 in both GET_CUR and GET_MAX modes.

The official documentation states "NVIDIA BlueField-3 and above is required to support IPv6" for DOCA Flow CT. We would like to understand whether this is a permanent product positioning decision or whether there is a path forward for standalone ConnectX-7 deployments.

Environment

We are building a commercial IaaS+PaaS cloud SDN data plane using DOCA Flow on ConnectX-7 (IPIP encap/decap, LPM, CT). We need IPv6 Connection Tracking support.

Tested SKUs:

Card                 PSID            Description
MCX75310AAS-NEA_Ax   MT_0000000838   400G NDR OSFP single-port, Crypto Disabled
MCX713106AS-VEA_Ax   MT_0000000840   200GbE dual-port QSFP112, Crypto Disabled
MCX755206AS-NEA_Ax   MT_0000000892   DGX storage 2×200G/400G IB VPI

All three cards run firmware 28.48.1000 (the latest available).

Output of our verification script, which calls mlx5dv_devx_general_cmd with MLX5_CMD_OP_QUERY_HCA_CAP (opcode 0x100) and op_mod = (MLX5_CAP_GENERAL_2 << 1) | GET_MAX:

Device: mlx5_0
format_select_dw_8_6_ext = 0  (no jumbo STE; HWS falls back to 8-DW + limited DW)
HCA_CAP_GENERAL_2 (op_mod = 0x20 | GET_MAX)
  flow_table_type_2_type           (bit 0x1a0, w8 ) = 0x3
  format_select_dw_8_6_ext         (bit 0x1aa, w1 ) = 0x0
  log_min_mkey_entity_size         (bit 0x1ab, w5 ) = 0x9
  ...
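
For readers who want to reproduce the decoding step without hardware, here is a minimal sketch of how the op_mod value and the field extraction work. It assumes the constants used by the Linux mlx5 driver (capability class 0x20 for GENERAL_2, GET_MAX = 0, GET_CUR = 1) and takes the bit offsets from the dump above, measured from the start of the capability payload. The helper names and the sample payload are hypothetical; the payload is synthetic and only reproduces the three values shown, it is not a capture from a device.

```python
import struct

# Assumed constants, mirroring the Linux mlx5 driver headers.
MLX5_CAP_GENERAL_2 = 0x20      # capability class
HCA_CAP_OPMOD_GET_MAX = 0
HCA_CAP_OPMOD_GET_CUR = 1


def hca_cap_op_mod(cap_class: int, mode: int) -> int:
    """Build op_mod as (capability class << 1) | mode bit."""
    return (cap_class << 1) | (mode & 0x1)


def get_field(cap: bytes, bit_off: int, width: int) -> int:
    """Extract a field from a capability payload.

    Bit offsets count from the most significant bit of big-endian dwords,
    and a field is assumed not to straddle a dword boundary.
    """
    shift = 32 - width - (bit_off % 32)
    if shift < 0:
        raise ValueError("field crosses a dword boundary")
    (dword,) = struct.unpack_from(">I", cap, (bit_off // 32) * 4)
    return (dword >> shift) & ((1 << width) - 1)


# Synthetic payload: dword 13 (bits 0x1a0..0x1bf) set so that the three
# fields decode to the values printed in the dump above.
sample = bytearray(64)
struct.pack_into(">I", sample, 13 * 4, 0x03090000)

fields = {
    "flow_table_type_2_type": get_field(sample, 0x1A0, 8),
    "format_select_dw_8_6_ext": get_field(sample, 0x1AA, 1),
    "log_min_mkey_entity_size": get_field(sample, 0x1AB, 5),
}
for name, value in fields.items():
    print(f"{name:28s} = {value:#x}")
```

With this encoding, class 0x20 queried in GET_MAX mode gives op_mod 0x40 and GET_CUR gives 0x41. In a real script the payload would come from the output buffer of mlx5dv_devx_general_cmd, after skipping the command output header.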

This blocks DOCA Flow CT pipe from accepting IPv6 entries because the IPv6 5-tuple key does not fit into the 8-DW STE format and the 11-DW (jumbo) format is not advertised by the firmware.
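
The size argument can be checked with simple arithmetic. The sketch below counts only the raw 5-tuple fields; any additional fields that the CT pipe places in the key (for example zone or metadata) are not modelled and would only make the key larger.

```python
DW_BITS = 32  # one STE dword


def dwords(bits: int) -> int:
    """Number of 32-bit dwords needed to hold `bits` bits (rounded up)."""
    return -(-bits // DW_BITS)


# Raw 5-tuple field widths in bits.
IPV4_5TUPLE = {"src_ip": 32, "dst_ip": 32, "src_port": 16, "dst_port": 16, "proto": 8}
IPV6_5TUPLE = {"src_ip": 128, "dst_ip": 128, "src_port": 16, "dst_port": 16, "proto": 8}

ipv4_dw = dwords(sum(IPV4_5TUPLE.values()))
ipv6_dw = dwords(sum(IPV6_5TUPLE.values()))

print(f"IPv4 5-tuple: {sum(IPV4_5TUPLE.values())} bits, {ipv4_dw} DW")
print(f"IPv6 5-tuple: {sum(IPV6_5TUPLE.values())} bits, {ipv6_dw} DW")
print("fits 8-DW STE: ", ipv6_dw <= 8)
print("fits 11-DW STE:", ipv6_dw <= 11)
```

The two IPv6 addresses alone already fill 8 DW, so the ports and protocol cannot be added to the same 8-DW key.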

What we have already investigated

We have done substantial due diligence before opening this issue:

  1. All 24 plausible NV-config variables tested with mlxconfig set + mlxfwreset -l 3 reset + re-query. Confirmed applied (Current column matches Next Boot). None affect format_select_dw_8_6_ext. Includes: ICM_CACHE_MODE=LARGE_SCALE_STEERING, STEERING_CACHE_REFRESH, FLEX_PARSER_PROFILE_ENABLE, LAG_RESOURCE_ALLOCATION=PRE_ALLOCATION, MULTI_PORT_VHCA_EN, MKEY_BY_NAME, UCTX_EN, RDMA_SELECTIVE_REPEAT_EN, ROCE_ADAPTIVE_ROUTING_EN, ATS_ENABLED, SRIOV_EN, NUM_OF_VFS, REAL_TIME_CLOCK_ENABLE, and others.
  2. All op_mod values 0x00–0x3f probed for hidden capability classes via QUERY_HCA_CAP. No hidden class with this bit.
  3. SET_HCA_CAP from userspace with cap_class=0x20 returns EINVAL (kernel mlx5 driver filters it).
  4. Direct CR-space access via mstmcra returns 0xbadacce6 (privileged access blocked on standalone CX-7).
  5. mstprivhost reports "only BlueField devices are supported" — confirming the architectural separation.
  6. mstconfig token_supported returns: CS, DBG, CRCS, CRDT all supported. CRCS challenge generated and ready to be signed if NVIDIA can provide a token that exposes the capability.
  7. All 33 public CX-7 PSIDs in the mlnx-fw-updater bundle (HCA type MT4129, firmware 28.48.1000) share byte-identical MAIN_CODE sections (15 MB of common code from offset 0xf10000 to 0xff0000) — meaning the gating is not PSID-dependent in the public firmware tree.
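
The comparison in item 7 can be reproduced along the following lines. This is a minimal sketch rather than the exact script we ran: the function names are hypothetical, and the images are assumed to be already loaded into memory as raw bytes.

```python
import hashlib


def region_digest(image: bytes, start: int, end: int) -> str:
    """SHA-256 of image[start:end]; raises if the region is out of range."""
    if not (0 <= start < end <= len(image)):
        raise ValueError("region lies outside the image")
    return hashlib.sha256(image[start:end]).hexdigest()


def regions_identical(images: dict, start: int, end: int) -> bool:
    """True if every image has byte-identical content in [start, end)."""
    digests = {region_digest(data, start, end) for data in images.values()}
    return len(digests) == 1


# Usage sketch (the file names are placeholders, not real paths):
#   images = {p: open(p, "rb").read() for p in ["fw-psid-a.bin", "fw-psid-b.bin"]}
#   regions_identical(images, 0xF10000, 0xFF0000)
```

Hashing the region instead of comparing images pairwise keeps the check linear in the number of PSIDs.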

Questions

  1. Is there a roadmap to enable IPv6 CT pipe (format_select_dw_8_6_ext = 1) on standalone ConnectX-7 SKUs in a future firmware release?

  2. Is this a hardware-level limitation (eFuse / silicon strap set at fabrication)? Or is it a firmware product-segmentation decision that could be revised?

  3. Is there a documented mechanism (signed CS / CRCS / DBG token) by which NVIDIA Engineering can expose this capability on standalone ConnectX-7 for evaluation purposes? We have generated CRCS challenge tokens and are ready to share them.

  4. What is the official recommended workaround for production commercial SDN data planes that need stateful IPv6 connection tracking on existing ConnectX-7 deployments, short of migrating hardware to BlueField-3 / ConnectX-8?

  5. For our specific use case (IPv4 + IPv6 stateful firewall / NAT / VPC isolation), is there a DOCA Flow API pattern that splits the IPv6 5-tuple match into multiple 8-DW STE entries while preserving CT-pipe-level semantics (aging, origin/reply linking, ASO counters)?
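
To make question 5 concrete, here is a software model of the kind of split we have in mind. It is only an illustration of the idea, not a DOCA Flow API: the class and its methods are hypothetical. Stage 1 maps the source address (128 bits, 4 DW) to a 32-bit identifier, and stage 2 matches that identifier together with the destination address, ports, and protocol (200 bits, 7 DW), so each stage key stays within 8 DW.

```python
import ipaddress

STAGE1_KEY_BITS = 128                      # source IPv6 address
STAGE2_KEY_BITS = 32 + 128 + 16 + 16 + 8   # src_id, dst IPv6, ports, protocol


class SplitCtModel:
    """Two-stage lookup model; aging and origin/reply linking are not modelled."""

    def __init__(self):
        self.src_ids = {}   # stage 1: source address -> 32-bit id
        self.conns = {}     # stage 2: (src_id, dst, sport, dport, proto) -> state
        self.next_id = 1

    def add(self, src, dst, sport, dport, proto, state="established"):
        src_key = int(ipaddress.IPv6Address(src))
        if src_key not in self.src_ids:
            self.src_ids[src_key] = self.next_id
            self.next_id += 1
        key = (self.src_ids[src_key], int(ipaddress.IPv6Address(dst)), sport, dport, proto)
        self.conns[key] = state

    def lookup(self, src, dst, sport, dport, proto):
        src_id = self.src_ids.get(int(ipaddress.IPv6Address(src)))
        if src_id is None:
            return None  # stage 1 miss
        key = (src_id, int(ipaddress.IPv6Address(dst)), sport, dport, proto)
        return self.conns.get(key)  # stage 2 hit or miss


ct = SplitCtModel()
ct.add("2001:db8::1", "2001:db8::2", 40000, 443, 6)
print(ct.lookup("2001:db8::1", "2001:db8::2", 40000, 443, 6))
print(ct.lookup("2001:db8::3", "2001:db8::2", 40000, 443, 6))
```

The open part of the question is whether the CT pipe semantics survive such a split, for example whether stage 1 entries can be aged out together with the last stage 2 entry that references them.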

Why we are opening this here

We do not yet have a paid NVIDIA support subscription (the purchasing process is in progress). The official forums have low response rates. We are hoping that opening a public technical issue on the NVIDIA-DOCA repository will reach the right engineering contact who can either give us an authoritative answer or route us internally.

We are happy to:

  • Share mstconfig query, mstflint query full, lspci -vv, and full HCA_CAP_GENERAL_2 dumps from each SKU
  • Share the CRCS challenge tokens for SKU evaluation
  • Provide more context about our DOCA Flow data plane architecture

Any pointers to documentation, internal product-management contact, or roadmap visibility would be appreciated.

Thank you.
