[BUG] `SVR.fit` issues when prefetching enabled

This was discovered in #7834. For reasons unknown to me, `SVR.fit` doesn't always work when `rmm` is configured to use a `rmm.mr.PrefetchResourceAdaptor`. This adaptor allocates through the `mr` it wraps, then calls `cudaMemPrefetchAsync` to schedule a prefetch of the resulting pointer before returning it.
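For anyone unfamiliar with the adaptor, here is a rough pure-Python sketch of the allocate-then-prefetch pattern described above. It needs no GPU. `RecordingUpstream` and `PrefetchingAdaptorSketch` are hypothetical names made up for illustration, not rmm's API; the real adaptor would call `cudaMemPrefetchAsync` where this sketch only records an event.

```python
class RecordingUpstream:
    """Hypothetical stand-in for the wrapped (upstream) memory resource."""

    def __init__(self, log):
        self.log = log
        self._next_ptr = 0x1000  # fake address counter, not real memory

    def allocate(self, nbytes):
        ptr = self._next_ptr
        self._next_ptr += nbytes
        self.log.append(("allocate", ptr, nbytes))
        return ptr

    def deallocate(self, ptr, nbytes):
        self.log.append(("deallocate", ptr, nbytes))


class PrefetchingAdaptorSketch:
    """Hypothetical adaptor: forward to upstream, then schedule a prefetch."""

    def __init__(self, upstream, log):
        self.upstream = upstream
        self.log = log

    def allocate(self, nbytes):
        # 1. Delegate the actual allocation to the wrapped resource.
        ptr = self.upstream.allocate(nbytes)
        # 2. Schedule a prefetch of the new pointer (only recorded here).
        self.log.append(("prefetch", ptr, nbytes))
        # 3. Hand the pointer back to the caller.
        return ptr

    def deallocate(self, ptr, nbytes):
        # Deallocation is passed straight through to the wrapped resource.
        self.upstream.deallocate(ptr, nbytes)


log = []
mr = PrefetchingAdaptorSketch(RecordingUpstream(log), log)
ptr = mr.allocate(64)
mr.deallocate(ptr, 64)
events = [name for name, _, _ in log]
```

The point of the sketch is only the ordering: every allocation made through the adaptor is followed by a prefetch of that pointer, so any code path that allocates through rmm sees this extra step.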

I see two failure modes when enabling a `PrefetchResourceAdaptor` with `SVR`:

- `PrefetchResourceAdaptor(ManagedMemoryResource())`: the intercept is incorrect when every value in `y` is the same. Without prefetching you get the correct intercept, but with prefetching the intercept stays at 0, so the predictions are all 0.
- `PrefetchResourceAdaptor(CudaMemoryResource())`: the `fit` call errors with a failure inside CUB (`fill_n: failed inside CUB: cudaErrorInvalidDevice: invalid device ordinal`).

Using a `ManagedMemoryResource()` or `CudaMemoryResource()` without prefetching works fine. Since the SVM memory management code is a bit hokey (and doesn't use rmm everywhere IIUC), I suspect the issue is isolated to that code. Raising this to see if someone more versed in our C++ code can determine the cause.

The following simple pytest test covers all four combinations and demonstrates both issues:

```python
import numpy as np
import pytest
from cuml.svm import SVR
import rmm


@pytest.mark.parametrize("prefetch", [False, True])
@pytest.mark.parametrize("mode", ["cuda", "managed"])
def test_svr_constant(mode, prefetch):
    if mode == "cuda":
        mr = rmm.mr.CudaMemoryResource()
    elif mode == "managed":
        mr = rmm.mr.ManagedMemoryResource()

    if prefetch:
        mr = rmm.mr.PrefetchResourceAdaptor(mr)

    rmm.mr.set_current_device_resource(mr)

    X = np.eye(4)
    y = np.full(4, 1.234)
    m = SVR().fit(X, y)
    y_pred = m.predict(X)
    np.testing.assert_allclose(y, y_pred)
```

<details>

<summary> Output of `pytest test.py` </summary>

```
============================= test session starts ==============================
platform linux -- Python 3.13.12, pytest-8.4.2, pluggy-1.6.0
benchmark: 5.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jcristharif/Code/cuml
configfile: pyproject.toml
plugins: xdist-3.8.0, cases-3.9.1, benchmark-5.2.3, cov-7.0.0, hypothesis-6.151.9
collected 4 items

test.py .F.F                                                             [100%]

=================================== FAILURES ===================================
_________________________ test_svr_constant[cuda-True] _________________________

mode = 'cuda', prefetch = True

    @pytest.mark.parametrize("prefetch", [False, True])
    @pytest.mark.parametrize("mode", ["cuda", "managed"])
    def test_svr_constant(mode, prefetch):
        if mode == "cuda":
            mr = rmm.mr.CudaMemoryResource()
        elif mode == "managed":
            mr = rmm.mr.ManagedMemoryResource()
    
        if prefetch:
            mr = rmm.mr.PrefetchResourceAdaptor(mr)
    
        rmm.mr.set_current_device_resource(mr)
    
        X = np.eye(4)
        y = np.full(4, 1.234)
>       m = SVR().fit(X, y)
            ^^^^^^^^^^^^^^^

test.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../miniforge3/envs/cuml-dev/lib/python3.13/site-packages/cuml/internals/outputs.py:417: in inner
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
../../miniforge3/envs/cuml-dev/lib/python3.13/site-packages/cuml/svm/svr.py:194: in fit
    self._fit(X, y, sample_weight)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   RuntimeError: fill_n: failed inside CUB: cudaErrorInvalidDevice: invalid device ordinal

cuml/svm/svm_base.pyx:470: RuntimeError
_______________________ test_svr_constant[managed-True] ________________________

mode = 'managed', prefetch = True

    @pytest.mark.parametrize("prefetch", [False, True])
    @pytest.mark.parametrize("mode", ["cuda", "managed"])
    def test_svr_constant(mode, prefetch):
        if mode == "cuda":
            mr = rmm.mr.CudaMemoryResource()
        elif mode == "managed":
            mr = rmm.mr.ManagedMemoryResource()
    
        if prefetch:
            mr = rmm.mr.PrefetchResourceAdaptor(mr)
    
        rmm.mr.set_current_device_resource(mr)
    
        X = np.eye(4)
        y = np.full(4, 1.234)
        m = SVR().fit(X, y)
        y_pred = m.predict(X)
>       np.testing.assert_allclose(y, y_pred)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       
E       Mismatched elements: 4 / 4 (100%)
E       Max absolute difference among violations: 1.234
E       Max relative difference among violations: inf
E        ACTUAL: array([1.234, 1.234, 1.234, 1.234])
E        DESIRED: array([0., 0., 0., 0.])

test.py:24: AssertionError
=========================== short test summary info ============================
FAILED test.py::test_svr_constant[cuda-True] - RuntimeError: fill_n: failed i...
FAILED test.py::test_svr_constant[managed-True] - AssertionError: 
========================= 2 failed, 2 passed in 1.88s ==========================
```

</details>


[BUG] `SVR.fit` issues when prefetching enabled #7842
