[linux-nvidia-6.17-next] Add CXL Type-2 device support and CXL RAS error handling by JiandiAnNVIDIA · Pull Request #323 · NVIDIA/NV-Kernels

JiandiAnNVIDIA · 2026-02-17T05:55:40Z

Description

This patch series adds comprehensive CXL (Compute Express Link) support to the nvidia-6.17 kernel, including:

CXL Type-2 device support - Enables accelerator devices (like GPUs and SmartNICs) to use CXL for coherent memory access
CXL RAS (Reliability, Availability, Serviceability) error handling - Implements PCIe Port Protocol error handling and logging for CXL devices
Prerequisite CXL driver updates - Cherry-picked commits from Linux v6.18 that are required dependencies

Key Features Added:

CXL Type-2 accelerator device registration and memory management
CXL region creation by Type-2 drivers
DPA (Device Physical Address) allocation interface for accelerators
HPA (Host Physical Address) free space enumeration
CXL protocol error detection, forwarding, and recovery
RAS register mapping for CXL Endpoints and Switch Ports

Justification

CXL Type-2 device support is critical for next-generation NVIDIA accelerators and data center workloads:

Enables coherent memory sharing between CPUs and accelerators
Supports firmware-provisioned CXL regions for accelerator memory
Provides proper error handling and reporting for CXL fabric errors
Required for upcoming NVIDIA hardware with CXL capabilities

Source

Patch Breakdown (80 commits total):

Category	Count	Source
v6.18 CXL driver prerequisites	27	Upstream (cherry-picked from torvalds/linux v6.18)
Terry Bowman's CXL RAS series	25	Upstream (RESEND v13)
Alejandro Lucero's Type-2 series	25	Upstream (v22)
ACPI IORT workaround	1	OOT (NVIDIA-specific)
Revert old CXL reset	1	OOT (cleanup)
Config update	1	OOT (build config)

Lore Links:

Terry Bowman's CXL RAS series (RESEND v13):
https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/
Alejandro Lucero's CXL Type-2 series (v22):
https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/

Upstream Status:

Terry Bowman's patches: Under active review, expected to merge in v6.19
Alejandro Lucero's patches: Under active review, expected to merge in v6.19/v6.20
v6.18 cherry-picks: Already merged upstream
ACPI IORT workaround: NVIDIA-specific, will evaluate upstream submission

Testing

Build Validation:

✅ Built successfully for ARM64 4K page size kernel
✅ Built successfully for ARM64 64K page size kernel

Config Verification:

All CXL configs enabled as expected:

CONFIG_CXL_BUS=y
CONFIG_CXL_PCI=y
CONFIG_CXL_MEM=y
CONFIG_CXL_PORT=y
CONFIG_CXL_REGION=y
CONFIG_CXL_RAS=y
CONFIG_CXL_FEATURES=y
CONFIG_SFC_CXL=y
CONFIG_PCIEAER_CXL=y
CONFIG_CXL_ACPI=m
CONFIG_CXL_PMEM=m
CONFIG_CXL_PMU=m
CONFIG_DEV_DAX_CXL=m

Runtime Testing:

Boot test on ARM64 system
CXL device enumeration test
CXL region creation test

Notes

CONFIG_CXL_BUS and CONFIG_CXL_PCI changed from tristate to bool by the Type-2 patches (intentional design change for built-in CXL support)
Kernel config annotations updated in debian.nvidia-6.17/config/annotations to reflect these changes

This reverts commit 0e06082. Signed-off-by: Jiandi An <jan@nvidia.com>

Use the string choice helper function str_plural() to simplify the code. Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250811122519.543554-1-zhao.xichao@vivo.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 22fb4ad) Signed-off-by: Jiandi An <jan@nvidia.com>

Replace ternary operator with str_enabled_disabled() helper to enhance code readability and consistency. [dj: Fix spelling in commit log and subject. ] Signed-off-by: Nai-Chen Cheng <bleach1827@gmail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250812-cxl-region-string-choices-v1-1-50200b0bc782@gmail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 733c4e9) Signed-off-by: Jiandi An <jan@nvidia.com>

The root decoder's HPA to SPA translation logic was implemented using a single function pointer. In preparation for additional per-decoder callbacks, convert this into a struct cxl_rd_ops and move the hpa_to_spa pointer into it. To avoid maintaining a static ops instance populated with mostly NULL pointers, allocate the ops structure dynamically only when a platform requires overrides (e.g. XOR interleave decoding). The setup can be extended as additional callbacks are added. Co-developed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/818530c82c351a9c0d3a204f593068dd2126a5a9.1754290144.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 524b2b7) Signed-off-by: Jiandi An <jan@nvidia.com>

When DPA->SPA translation was introduced, it included a helper that applied the XOR maps to do the CXL HPA -> SPA translation for XOR region interleaves. In preparation for adding SPA->DPA address translation, introduce the reverse callback. The root decoder callback is defined generically and not all usages may be self inverting like this XOR function. Add another root decoder callback that is the spa_to_hpa function. Update the existing cxl_xor_hpa_to_spa() with a name that reflects what it does without directionality: cxl_apply_xor_maps(), a generic parameter: addr replaces hpa, and code comments stating that the function supports the translation in either direction. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/79d9d72230c599cae94d7221781ead6392ae6d3f.1754290144.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit b83ee96) Signed-off-by: Jiandi An <jan@nvidia.com>

Add infrastructure to translate System Physical Addresses (SPA) to Device Physical Addresses (DPA) within CXL regions. This capability will be used by follow-on patches that add poison inject and clear operations at the region level. The SPA-to-DPA translation process follows these steps: 1. Apply root decoder transformations (SPA to HPA) if configured. 2. Extract the position in region interleave from the HPA offset. 3. Extract the DPA offset from the HPA offset. 4. Use position to find endpoint decoder. 5. Use endpoint decoder to find memdev and calculate DPA from offset. 6. Return the result - a memdev and a DPA. It is Step 1 above that makes this a driver level operation and not work we can push to user space. Rather than exporting the XOR maps for root decoders configured with XOR interleave, the driver performs this complex calculation for the user. Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13 Implementation Note: Device Decode Logic. These calculations mirror much of the logic introduced earlier in DPA to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to reverse the spec defined 'Device Decode Logic'. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/422f0e27742c6ca9a11f7cd83e6ba9fa1a8d0c74.1754290144.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit dc18117) Signed-off-by: Jiandi An <jan@nvidia.com>

The core functions that validate and send inject and clear commands to the memdev devices require holding both the dpa_rwsem and the region_rwsem. In preparation for another caller of these functions that must hold the locks upon entry, split the work into a locked and unlocked pair. Consideration was given to moving the locking to both callers, however, the existing caller is not in the core (mem.c) and cannot access the locks. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/1d601f586975195733984ca63d1b5789bbe8690f.1754290144.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 25a0207) Signed-off-by: Jiandi An <jan@nvidia.com>

Add CXL region debugfs attributes to inject and clear poison based on an offset into the region. These new interfaces allow users to operate on poison at the region level without needing to resolve Device Physical Addresses (DPA) or target individual memdevs. The implementation uses a new helper, region_offset_to_dpa_result() that applies decoder interleave logic, including XOR-based address decoding when applicable. Note that XOR decodes rely on driver internal xormaps which are not exposed to userspace. So, this support is not only a simplification of poison operations that could be done using existing per memdev operations, but also it enables this functionality for XOR interleaved regions for the first time. New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/: inject_poison and clear_poison. These are only exposed if all memdevs participating in the region support both inject and clear commands, ensuring consistent and reliable behavior across multi-device regions. If tracing is enabled, these operations are logged as cxl_poison events in /sys/kernel/tracing/trace. The ABI documentation warns users of the significant risks that come with using these capabilities. A CXL Maturity Map update shows this user flow is now supported. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/f3fd8628ab57ea79704fb2d645902cd499c066af.1754290144.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit c3dd676) Signed-off-by: Jiandi An <jan@nvidia.com>

…fset() 0day reported warnings of: drivers/cxl/core/region.c:3664:25: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] drivers/cxl/core/region.c:3671:37: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] Replace %#llx with %pr to emit resource_size_t arguments. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508160513.NAZ9i9rQ-lkp@intel.com/ Cc: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250818153953.3658952-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit e6a9530) Signed-off-by: Jiandi An <jan@nvidia.com>

Add clarification to comment for memory hotplug callback ordering as the current comment does not provide clear language on which callback happens first. Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 6512886) Signed-off-by: Jiandi An <jan@nvidia.com>

Add helper function node_update_perf_attrs() to allow update of node access coordinates computed by an external agent such as CXL. The helper allows updating of coordinates after the attribute being created by HMAT. Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit b57fc65) Signed-off-by: Jiandi An <jan@nvidia.com>

…ough HMAT The current implementation of CXL memory hotplug notifier gets called before the HMAT memory hotplug notifier. The CXL driver calculates the access coordinates (bandwidth and latency values) for the CXL end to end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL memory hotplug notifier writes the access coordinates to the HMAT target structs. Then the HMAT memory hotplug notifier is called and it creates the access coordinates for the node sysfs attributes. During testing on an Intel platform, it was found that although the newly calculated coordinates were pushed to sysfs, the sysfs attributes for the access coordinates showed up with the wrong initiator. The system has 4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are CXL nodes. The expectation is that node 2 would show up as a target to node 0: /sys/devices/system/node/node2/access0/initiators/node0 However it was observed that node 2 showed up as a target under node 1: /sys/devices/system/node/node2/access0/initiators/node1 The original intent of the 'ext_updated' flag in HMAT handling code was to stop HMAT memory hotplug callback from clobbering the access coordinates after CXL has injected its calculated coordinates and replaced the generic target access coordinates provided by the HMAT table in the HMAT target structs. However the flag is hacky at best and blocks the updates from other CXL regions that are onlined in the same node later on. Remove the 'ext_updated' flag usage and just update the access coordinates for the nodes directly without touching HMAT target data. The hotplug memory callback ordering is changed. Instead of changing CXL, move HMAT back so there's room for the levels rather than have CXL share the same level as SLAB_CALLBACK_PRI. The change will resulting in the CXL callback to be executed after the HMAT callback. With the change, the CXL hotplug memory notifier runs after the HMAT callback. The HMAT callback will create the node sysfs attributes for access coordinates. The CXL callback will write the access coordinates to the now created node sysfs attributes directly and will not pollute the HMAT target values. A nodemask is introduced to keep track if a node has been updated and prevents further updates. Fixes: 067353a ("cxl/region: Add memory hotplug notifier for cxl region") Cc: stable@vger.kernel.org Tested-by: Marc Herbert <marc.herbert@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 2e454fb) Signed-off-by: Jiandi An <jan@nvidia.com>

Remove deadcode since CXL no longer calls hmat_update_target_coordinates(). Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit e99ecbc) Signed-off-by: Jiandi An <jan@nvidia.com>

Fixed the following typo errors intersparsed ==> interspersed in Documentation/driver-api/cxl/platform/bios-and-efi.rst Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20250818175335.5312-1-rakuram.e96@gmail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit a414408) Signed-off-by: Jiandi An <jan@nvidia.com>

ACPICA commit 710745713ad3a2543dbfb70e84764f31f0e46bdc This has been renamed in more recent CXL specs, as type3 (memory expanders) can also use HDM-DB for device coherent memory. Link: acpica/acpica@7107457 Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250908160034.86471-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit c427290) Signed-off-by: Jiandi An <jan@nvidia.com>

…olution Add documentation on how to resolve conflicts between CXL Fixed Memory Windows, Platform Low Memory Holes, intermediate Switch and Endpoint Decoders. [dj]: Fixed inconsistent spacing after '.' [dj]: Fixed subject line from Alison. [dj]: Removed '::' before table from Bagas. Reviewed-by: Gregory Price <gourry@gourry.net> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit c5dca38) Signed-off-by: Jiandi An <jan@nvidia.com>

Add a helper to replace the open code detection of CXL device hierarchy root, or the host bridge. The helper will be used for delayed downstream port (dport) creation. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 4fde895) Signed-off-by: Jiandi An <jan@nvidia.com>

Refactor the code in reap_dports() out to provide a helper function that reaps a single dport. This will be used later in the cleanup path for allocating a dport. Renaming to del_port() and del_dports() to mirror devm_cxl_add_dport(). [dj] Fixed up subject per Robert Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 8330671) Signed-off-by: Jiandi An <jan@nvidia.com>

@DPORT

Add a cached copy of the hardware port-id list that is available at init before all @DPORT objects have been instantiated. Change is in preparation of delayed dport instantiation. Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 02edab6) Signed-off-by: Jiandi An <jan@nvidia.com>

Group the decoder setup code in switch and endpoint port probe into a single function for each to reduce the number of functions to be mocked in cxl_test. Introduce devm_cxl_switch_port_decoders_setup() and devm_cxl_endpoint_decoders_setup(). These two functions will be mocked instead with some functions optimized out since the mock version does not do anything. Remove devm_cxl_setup_hdm(), devm_cxl_add_passthrough_decoder(), and devm_cxl_enumerate_decoders() in cxl_test mock code. In turn, mock_cxl_add_passthrough_decoder() can be removed since cxl_test does not setup passthrough decoders. __wrap_cxl_hdm_decode_init() and __wrap_cxl_dvsec_rr_decode() can be removed as well since they only return 0 when called. [dj: drop 'struct cxl_port' forward declaration (Robert)] Suggested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 68d5d97) Signed-off-by: Jiandi An <jan@nvidia.com>

The current implementation enumerates the dports during the cxl_port driver probe. Without an endpoint connected, the dport may not be active during port probe. This scheme may prevent a valid hardware dport id to be retrieved and MMIO registers to be read when an endpoint is hot-plugged. Move the dport allocation and setup to behind memdev probe so the endpoint is guaranteed to be connected. In the original enumeration behavior, there are 3 phases (or 2 if no CXL switches) for port creation. cxl_acpi() creates a Root Port (RP) from the ACPI0017.N device. Through that it enumerates downstream ports composed of ACPI0016.N devices through add_host_bridge_dport(). Once done, it uses add_host_bridge_uport() to create the ports that enumerate the PCI RPs as the dports of these ports. Every time a port is created, the port driver is attached, cxl_switch_porbe_probe() is called and devm_cxl_port_enumerate_dports() is invoked to enumerate and probe the dports. The second phase is if there are any CXL switches. When the pci endpoint device driver (cxl_pci) calls probe, it will add a mem device and triggers the cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports() and attempts to discovery and create all the ports represent CXL switches. During this phase, a port is created per switch and the attached dports are also enumerated and probed. The last phase is creating endpoint port which happens for all endpoint devices. The new sequence is instead of creating all possible dports at initial port creation, defer port instantiation until a memdev beneath that dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize the creation and extension of ports with new dports as memory devices arrive. As part of this rework, switch decoder target list is amended at runtime as dports show up. While the decoders are allocated during the port driver probe, The decoders must also be updated since previously they were setup when all the dports are setup. Now every time a dport is setup per endpoint, the switch target listing need to be updated with new dport. A guard(rwsem_write) is used to update decoder targets. This is similar to when decoder_populate_target() is called and the decoder programming must be protected. Also the port registers are probed the first time when the first dport shows up. This ensures that the CXL link is established when the port registers are probed. [dj] Use ERR_CAST() (Jonathan) Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/ Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 4f06d81) Signed-off-by: Jiandi An <jan@nvidia.com>

devm_cxl_add_dport_by_dev() outside of cxl_test is done through PCI hierarchy. However with cxl_test, it needs to be done through the platform device hierarchy. Add the mock function for devm_cxl_add_dport_by_dev(). When cxl_core calls a cxl_core exported function and that function is mocked by cxl_test, the call chain causes a circular dependency issue. Dan provided a workaround to avoid this issue. Apply the method to changes from the late dport allocation changes in order to enable cxl-test. In cxl_core they are defined with "__" added in front of the function. A macro is used to define the original function names for when non-test version of the kernel is built. A bit of macros and typedefs are used to allow mocking of those functions in cxl_test. Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit d96eb90) Signed-off-by: Jiandi An <jan@nvidia.com>

…tup() With devm_cxl_switch_port_decoders_setup() being called within cxl_core instead of by the port driver probe, adjustments are needed to deal with circular symbol dependency when this function is being mock'd. Add the appropriate changes to get around the circular dependency. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 644685a) Signed-off-by: Jiandi An <jan@nvidia.com>

cxl_test uses mock functions for decoder enumaration. Add initialization of the cxld->target_map[] for cxl_test based decoders in the mock functions. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 87439b5) Signed-off-by: Jiandi An <jan@nvidia.com>

While cxl_switch_parse_cdat() is harmless to be run multiple times, it is not efficient in the current scheme where one dport is being updated at a time by the memdev probe path. Change the input parameter to the specific dport being updated to pick up the SSLBIS information for just that dport. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit d64035a) Signed-off-by: Jiandi An <jan@nvidia.com>

This patch moves the port register setup to when the first dport appears via the memdev probe path. At this point, the CXL link should be established and the register access is expected to succeed. This change addresses an error message observed when PCIe hotplug is enabled on an Intel platform. The error messages "cxl portN: Couldn't locate the CXL.cache and CXL.mem capability array header" is observed for the host bridge (CHBCR) during cxl_acpi driver probe. If the cxl_acpi module probe is running before the CXL link between the endpoint device and the RP is established, then the platform may not have exposed DVSEC ID 3 and/or DVSEC ID 7 blocks which will trigger the error message. This behavior is defined by the CXL spec r3.2 9.12.3 for RPs and DSPs, however the Intel platform also added this behavior to the host bridge. This change also needs the dport enumeration to be moved to the memdev probe path in order to address the issue. This change is not a wholly contained solution by itself. [dj: Add missing var init during port alloc] Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit f6ee249) Signed-off-by: Jiandi An <jan@nvidia.com>

port->nr_dports is used to represent how many dports added to the cxl port, it will increase in add_dport() when a new dport is being added to the cxl port, but it will not be reduced when a dport is removed from the cxl port. Currently, when the first dport is added to a cxl port, it will trigger component registers setup on the cxl port, the implementation is using port->nr_dports to confirm if the dport is the first dport. A corner case here is that adding dport could fail after port->nr_dports updating and before checking port->nr_dports for component registers setup. If the failure happens during the first dport attaching, it will cause that CXL subsystem has not chance to execute component registers setup for the cxl port. the failure flow like below: port->nr_dports = 0 dport 1 adding to the port: add_dport() # port->nr_dports: 1 failed on devm_add_action_or_reset() or sysfs_create_link() return error # port->nr_dports: 1 dport 2 adding to the port: add_dport() # port->nr_dports: 2 no failure skip component registers setup because of port->nr_dports is 2 The solution here is that moving component registers setup closer to add_dport(), so if add_dport() is executed correctly for the first dport, component registers setup on the port will be executed immediately after that. Fixes: f6ee249 ("cxl: Move port register setup to when first dport appear") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 02e7567) Signed-off-by: Jiandi An <jan@nvidia.com>

KASAN reports a stack-out-of-bounds access in validate_region_offset() while running the cxl-poison.sh unit test because the printk format specifier, %pr format, is not a match for the resource_size_t type of the variables. %pr expects struct resource pointers and attempts to dereference the structure fields, reading beyond the bounds of the stack variables. Since these messages emit an 'A exceeds B' type of message, keep the resource_size_t's and use the %pa specifier to be architecture safe. BUG: KASAN: stack-out-of-bounds in resource_string.isra.0+0xe9a/0x1690 [] Read of size 8 at addr ffff88800a7afb40 by task bash/1397 ... [] The buggy address belongs to stack of task bash/1397 [] and is located at offset 56 in frame: [] validate_region_offset+0x0/0x1c0 [cxl_core] Fixes: c3dd676 ("cxl/region: Add inject and clear poison by region offset") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> (cherry picked from commit 257c4b0) Signed-off-by: Jiandi An <jan@nvidia.com>

…x/pci_regs.h The CXL DVSECs are currently defined in cxl/core/cxlpci.h. These are not accessible to other subsystems. Move these to uapi/linux/pci_regs.h. Change DVSEC name formatting to follow the existing PCI format in pci_regs.h. The current format uses CXL_DVSEC_XYZ and the CXL defines must be changed to be PCI_DVSEC_CXL_XYZ to match existing pci_regs.h. Leave PCI_DVSEC_CXL_PORT* defines as-is because they are already defined and may be in use by userspace application(s). Update existing usage to match the name change. Update the inline documentation to refer to latest CXL spec version. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com> ---- Changes in v12->v13: - Add Dave Jiang's reviewed-by - Remove changes to existing PCI_DVSEC_CXL_PORT* defines. Update commit message. (Jonathan) Changes in v11 -> v12: - Change formatting to be same as existing definitions - Change GENMASK() -> __GENMASK() and BIT() to _BITUL() Changes in v10 -> v11: - New commit

CXL and AER drivers need the ability to identify CXL devices. Introduce set_pcie_cxl() with logic checking for CXL.mem or CXL.cache status in the CXL Flexbus DVSEC status register. The CXL Flexbus DVSEC presence is used because it is required for all the CXL PCIe devices.[1] Add boolean 'struct pci_dev::is_cxl' with the purpose to cache the CXL CXL.cache and CXl.mem status. In the case the device is an EP or USP, call set_pcie_cxl() on behalf of the parent downstream device. Once a device is created there is possibilty the parent training or CXL state was updated as well. This will make certain the correct parent CXL state is cached. Add function pcie_is_cxl() to return 'struct pci_dev::is_cxl'. [1] CXL 3.1 Spec, 8.1.1 PCIe Designated Vendor-Specific Extended Capability (DVSEC) ID Assignment, Table 8-2 Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

In preparation for CXL accelerator drivers that have a hard dependency on CXL capability initialization, arrange for the endpoint probe result to be conveyed to the caller of devm_cxl_add_memdev(). As it stands cxl_pci does not care about the attach state of the cxl_memdev because all generic memory expansion functionality can be handled by the cxl_core. For accelerators, that driver needs to know perform driver specific initialization if CXL is available, or exectute a fallback to PCIe only operation. By moving devm_cxl_add_memdev() to cxl_mem.ko it removes async module loading as one reason that a memdev may not be attached upon return from devm_cxl_add_memdev(). The diff is busy as this moves cxl_memdev_alloc() down below the definition of cxl_memdev_fops and introduces devm_cxl_memdev_add_or_reset() to preclude needing to export more symbols from the cxl_core. Signed-off-by: Dan Williams <dan.j.williams@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

…attach Make it so that upon return from devm_cxl_add_endpoint() that cxl_mem_probe() can assume that the endpoint has had a chance to complete cxl_port_probe(). I.e. cxl_port module loading has completed prior to device registration. MODULE_SOFTDEP() is not sufficient for this purpose, but a hard link-time dependency is reliable. Signed-off-by: Dan Williams <dan.j.williams@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

…ration Allow for a driver to pass a routine to be called in cxl_mem_probe() context. This ability mirrors the semantics of faux_device_create() and allows for the caller to run CXL-topology-attach dependent logic on behalf of the caller. This capability is needed for CXL accelerator device drivers that need to make decisions about enabling CXL dependent functionality in the device, or falling back to PCIe-only operation. The probe callback runs after the port topology is successfully attached for the given memdev. Signed-off-by: Dan Williams <dan.j.williams@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Differentiate CXL memory expanders (type 3) from CXL device accelerators (type 2) with a new function for initializing cxl_dev_state and a macro for helping accel drivers to embed cxl_dev_state inside a private struct. Move structs to include/cxl as the size of the accel driver private struct embedding cxl_dev_state needs to know the size of this struct. Use same new initialization with the type3 pci driver. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Add CXL initialization based on new CXL API for accel drivers and make it dependent on kernel CXL configuration. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Inside cxl/core/pci.c there are helpers for CXL PCIe initialization meanwhile cxl/pci_drv.c implements the functionality for a Type3 device initialization. Move helper functions from cxl/core/pci_drv.c to cxl/core/pci.c in order to be exported and shared with CXL Type2 device initialization. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Export cxl core functions for a Type2 driver being able to discover and map the device component registers. Use it in sfc driver cxl initialization. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing memdev state params which end up being used for DPA initialization. Allow a Type2 driver to initialize DPA simply by giving the size of its volatile hardware partition. Move related functions to memdev. Add sfc driver as the client. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Current cxl core is relying on a CXL_DEVTYPE_CLASSMEM type device when creating a memdev leading to problems when obtaining cxl_memdev_state references from a CXL_DEVTYPE_DEVMEM type. Modify check for obtaining cxl_memdev_state adding CXL_DEVTYPE_DEVMEM support. Make devm_cxl_add_memdev accessible from a accel driver. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Use cxl API for creating a cxl memory device using the type2 cxl_dev_state struct. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

…tted decoder A Type2 device configured by the BIOS can already have its HDM committed. Add a cxl_get_committed_decoder() function for cheking so after memdev creation. A CXL region should have been created during memdev initialization, therefore a Type2 driver can ask for such a region for working with the HPA. If the HDM is not committed, a Type2 driver will create the region after obtaining proper HPA and DPA space. Signed-off-by: Alejandro Lucero <alucerop@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

A CXL region struct contains the physical address to work with. Type2 drivers can create a CXL region but have not access to the related struct as it is defined as private by the kernel CXL core. Add a function for getting the cxl region range to be used for mapping such memory range by a Type2 driver. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Zhi Wang <zhiw@nvidia.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

…ators Add unregister_region() and cxl_decoder_detach() to the accelerator driver API for a clean exit. Signed-off-by: Alejandro Lucero <alucerop@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

…mware Check if device HDM is already committed during firmware/BIOS initialization. A CXL region should exist if so after memdev allocation/initialization. Get HPA from region and map it. Signed-off-by: Alejandro Lucero <alucerop@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

…enumeration CXL region creation involves allocating capacity from Device Physical Address (DPA) and assigning it to decode a given Host Physical Address (HPA). Before determining how much DPA to allocate the amount of available HPA must be determined. Also, not all HPA is created equal, some HPA targets RAM, some targets PMEM, some is prepared for device-memory flows like HDM-D and HDM-DB, and some is HDM-H (host-only). In order to support Type2 CXL devices, wrap all of those concerns into an API that retrieves a root decoder (platform CXL window) that fits the specified constraints and the capacity available for a new region. Add a complementary function for releasing the reference to such root decoder. Based on https://lore.kernel.org/linux-cxl/168592159290.1948938.13522227102445462976.stgit@dwillia2-xfh.jf.intel.com/ Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Use cxl api for getting HPA (Host Physical Address) to use from a CXL root decoder. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Region creation involves finding available DPA (device-physical-address) capacity to map into HPA (host-physical-address) space. In order to support CXL Type2 devices, define an API, cxl_request_dpa(), that tries to allocate the DPA memory the driver requires to operate.The memory requested should not be bigger than the max available HPA obtained previously with cxl_get_hpa_freespace(). Based on https://lore.kernel.org/linux-cxl/168592158743.1948938.7622563891193802610.stgit@dwillia2-xfh.jf.intel.com/ Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Use cxl api for getting DPA (Device Physical Address) to use through an endpoint decoder. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Current code is expecting Type3 or CXL_DECODER_HOSTONLYMEM devices only. Support for Type2 implies region type needs to be based on the endpoint type HDM-D[B] instead. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Zhi Wang <zhiw@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Davidlohr Bueso <daves@stgolabs.net> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Region creation based on Type3 devices is triggered from user space allowing memory combination through interleaving. In preparation for kernel driven region creation, that is Type2 drivers triggering region creation backed with its advertised CXL memory, factor out a common helper from the user-sysfs region setup for interleave ways. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Zhi Wang <zhiw@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Region creation based on Type3 devices is triggered from user space allowing memory combination through interleaving. In preparation for kernel driven region creation, that is Type2 drivers triggering region creation backed with its advertised CXL memory, factor out a common helper from the user-sysfs region setup forinterleave granularity. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Zhi Wang <zhiw@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Creating a CXL region requires userspace intervention through the cxl sysfs files. Type2 support should allow accelerator drivers to create such cxl region from kernel code. Adding that functionality and integrating it with current support for memory expanders. Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/ Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

By definition a type2 cxl device will use the host managed memory for specific functionality, therefore it should not be available to other uses. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <daves@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

Use cxl api for creating a region using the endpoint decoder related to a DPA range. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

A PIO buffer is a region of device memory to which the driver can write a packet for TX, with the device handling the transmit doorbell without requiring a DMA for getting the packet data, which helps reducing latency in certain exchanges. With CXL mem protocol this latency can be lowered further. With a device supporting CXL and successfully initialised, use the cxl region to map the memory range and use this mapping for PIO buffers. Add the disabling of those CXL-based PIO buffers if the callback for potential cxl endpoint removal by the CXL code happens. Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/) Signed-off-by: Jiandi An <jan@nvidia.com>

This flag is required to enable SMMU's nested translation feature, so as to run UVM test case in the guest OS. Because the IORT firmwares are not updated yet to enable the flag. Work it around by setting in the kernel. And there should not be any side effect. The IORT firmware should set this flag for PCI buses, to notify SMMU that all devices under the PCI bus are cache coherent. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> (cherry picked from commit 4dbbaa23374684cdcd4c934483e16b3ed2eeaa72 https://github.com/mmhonap/24.04_linux-nvidia-6.17-next-vfio-cxl/tree/24.04_linux-nvidia-6.17.9-vfio-cxl) Signed-off-by: Jiandi An <jan@nvidia.com>

…RAS support CONFIG_CXL_BUS: Changed to bool for CXL Type-2 device support CONFIG_CXL_PCI: Changed to bool for CXL Type-2 device support CONFIG_CXL_MEM: Changed to y due to CXL_BUS being bool CONFIG_CXL_PORT: Changed to y due to CXL_BUS being bool CONFIG_FWCTL: Selected by CXL_BUS when bool CONFIG_CXL_RAS: New config from Terry Bowman's patches for "Enable CXL PCIe Port Protocol Error handling and logging" CONFIG_CXL_RCH_RAS: New config from Terry Bowman's patches for "Enable CXL PCIe Port Protocol Error handling and logging" CONFIG_SFC_CXL: New config from Alejandro Lucero's patches for "Type2 device basic support" Signed-off-by: Jiandi An <jan@nvidia.com>

nirmoy · 2026-02-17T12:37:44Z

debian.nvidia-6.17/config/annotations

+CONFIG_CXL_RAS                                  note<'New config from Terry Bowman patches'>
+
+CONFIG_CXL_RCH_RAS                              policy<{'amd64': 'n', 'arm64': 'n'}>
+CONFIG_CXL_RCH_RAS                              note<'New config from Terry Bowman patches'>


The note should explain what the config is doing rather than who/which series introduced it.

nirmoy · 2026-02-17T13:09:03Z

include/uapi/linux/pci_regs.h

+#define   PCI_DVSEC_CXL_REG_LOCATOR_BIR_MASK			__GENMASK(2, 0)
+#define   PCI_DVSEC_CXL_REG_LOCATOR_BLOCK_ID_MASK		__GENMASK(15, 8)
+#define   PCI_DVSEC_CXL_REG_LOCATOR_BLOCK_OFF_LOW_MASK		__GENMASK(31, 16)



nit: Was it picked cleanly ? The patch looks same but I think this portion was moved. Please mention such changes if there was any.

clsotog · 2026-02-17T20:12:51Z

Apart of the comments from Nirmoy there are some patches that have like version after the Signed-off-by like these ones:
9b8f882
44ce9ff

Also for this commit has cherry-picked line from 1ce7465 but I am not sure if Canonical has access to this git tree.

JiandiAnNVIDIA and others added 30 commits February 16, 2026 01:21

Revert "NVIDIA: VR: SAUCE: cxl: add support for cxl reset"

432a7b6

This reverts commit 0e06082. Signed-off-by: Jiandi An <jan@nvidia.com>

Dan Williams and others added 27 commits February 16, 2026 03:00

nirmoy reviewed Feb 17, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[linux-nvidia-6.17-next] Add CXL Type-2 device support and CXL RAS error handling#323

[linux-nvidia-6.17-next] Add CXL Type-2 device support and CXL RAS error handling#323
JiandiAnNVIDIA wants to merge 80 commits intoNVIDIA:24.04_linux-nvidia-6.17-nextfrom
JiandiAnNVIDIA:cxl_2026-02-16

JiandiAnNVIDIA commented Feb 17, 2026

Uh oh!

nirmoy Feb 17, 2026

Uh oh!

nirmoy Feb 17, 2026

Uh oh!

clsotog commented Feb 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

13 participants

Conversation

JiandiAnNVIDIA commented Feb 17, 2026

Description

Key Features Added:

Justification

Source

Patch Breakdown (80 commits total):

Lore Links:

Upstream Status:

Testing

Build Validation:

Config Verification:

Runtime Testing:

Notes

Uh oh!

nirmoy Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

nirmoy Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

clsotog commented Feb 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

13 participants