feat: Update mlflow adapter to be compatible with latest 3.x versions by hemajv · Pull Request #144 · eval-hub/eval-hub-sdk

hemajv · 2026-06-11T19:55:14Z

What and why

Updates the MLflow adapter to be compatible with MLflow 3.x servers.

Closes #134

Type

Testing

Tests added or updated
Tested manually

cc @ruivieira @suppathak

Summary by CodeRabbit

Release Notes

New Features
- Improved artifact uploads for ODH workspace-style URIs, including smarter server path construction and fallback handling.
Bug Fixes
- Now treats any non-2xx response from the MLflow server as an error.
- Improved trace retrieval and materialization behavior for broader compatibility.
Tests
- Expanded unit coverage for trace parsing across multiple response shapes.
- Added validations for artifact server path handling across supported URI formats.
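
The "any non-2xx response is an error" rule from the Bug Fixes note can be sketched as follows. This is only an illustration: `MlflowHttpError` and `ensure_success` are hypothetical names, and this page does not show which exception type or HTTP client the adapter actually uses.

```python
class MlflowHttpError(RuntimeError):
    """Hypothetical error type; the adapter's real exception class is not shown here."""

    def __init__(self, status_code: int, body: str) -> None:
        # Truncate the body so a large error page does not flood the message.
        super().__init__(f"MLflow server returned HTTP {status_code}: {body[:200]}")
        self.status_code = status_code
        self.body = body


def ensure_success(status_code: int, body: str = "") -> None:
    """Raise unless the status code falls in the 2xx range."""
    if not 200 <= status_code < 300:
        raise MlflowHttpError(status_code, body)
```

Under this rule a redirect such as 302 is reported as a failure in the same way as a 404 or 500, instead of being parsed as if it were a successful payload.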

coderabbitai · 2026-06-11T19:55:31Z

Walkthrough

MLflow adapter changes update HTTP error handling, artifact upload path resolution, trace response parsing, and trace retrieval/materialization calls for MLflow 3.x compatibility.

Changes

MLflow 3.x Trace API Compatibility

| Layer / File(s) | Summary |
| --- | --- |
| HTTP errors and artifact paths `src/evalhub/adapter/mlflow.py`, `tests/unit/test_mlflow_traces.py` | HTTP handling now raises on any non-2xx response; artifact upload path resolution uses ODH workspace-style URIs when present, otherwise falls back to experiment and run identifiers. Tests cover ODH, upstream run-id, and HTTP-proxied artifact URIs. |
| Trace parsing normalization `src/evalhub/adapter/mlflow.py`, `tests/unit/test_mlflow_traces.py` | Trace parsing now normalizes list-or-dict metadata and accepts trace, trace_info, and flat search envelopes while extracting request, experiment, status, tag, metadata, and data fields. Tests cover flat and trace_info-wrapped responses. |
| Trace API calls `src/evalhub/adapter/mlflow.py`, `tests/unit/test_mlflow_traces.py` | `TracesNamespace.get` now calls `/traces/{id}/info`, `materialize` filters by `attribute.run_id`, and search parsing is validated with flat results and pagination. |
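
The trace parsing normalization described above can be illustrated with a small sketch. The helper names `normalize_kv` and `unwrap_info` are hypothetical and are not the adapter's `_parse_trace`; the input shapes follow the test fixtures in this PR, and the nesting of `trace_info` inside a `trace` envelope is an assumption.

```python
from __future__ import annotations

from typing import Any


def normalize_kv(value: Any) -> dict[str, str]:
    """Accept a {key: value} dict or a list of {"key": ..., "value": ...} entries."""
    if isinstance(value, dict):
        return {str(k): str(v) for k, v in value.items()}
    result: dict[str, str] = {}
    for entry in value or []:
        # Skip malformed entries instead of failing the whole parse.
        if isinstance(entry, dict) and "key" in entry:
            result[str(entry["key"])] = str(entry.get("value", ""))
    return result


def unwrap_info(raw: dict[str, Any]) -> dict[str, Any]:
    """Return the dict that holds the trace fields, whichever envelope wraps it."""
    for key in ("trace", "trace_info"):
        inner = raw.get(key)
        if isinstance(inner, dict):
            # Assumption: a "trace" envelope may itself contain "trace_info".
            return unwrap_info(inner) if key == "trace" else inner
    # Flat search rows carry the fields at the top level.
    return raw
```

With helpers like these, a flat search row and a `trace_info`-wrapped response reduce to the same field dictionary before the request, status, tag, and metadata values are extracted.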

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

eval-hub/eval-hub-sdk#129: Introduces the MLflow trace subsystem that this PR updates for MLflow 3.x response shapes and endpoints.

Suggested reviewers

mariusdanciu
williamcaban
ppadashe-psp

Poem

🐰 Hop, hop through MLflow's tracey maze,
New paths and envelopes brighten the haze.
Info at /info, filters now align,
Artifact trails follow the right design.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes check | ⚠️ Warning | The artifact upload path and generic HTTP error-handling changes are broader than the trace-compatibility issue scope. | Move the artifact-upload and _handle error-handling changes into a separate PR or document them as part of this issue's scope. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 52.63% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description includes the problem, linked issue, type, and testing info, matching the template well enough. |
| Linked Issues check | ✅ Passed | The trace get/search/materialize changes address the MLflow 3.x compatibility issues described in `#134`. |
| Title check | ✅ Passed | The title accurately summarizes the main change: MLflow adapter compatibility updates for MLflow 3.x. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/evalhub/adapter/mlflow.py (1)

849-878: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Hydrate each trace before writing files.

materialize() still serializes the objects returned by search(). On the MLflow 3.x path, those rows are metadata-only, so trace.data stays empty and the exported JSON still lacks spans — the exact behavior issue #134 calls out. Fetch each request_id through self.get(..., experiment_id) before writing the file.

💡 Suggested fix

             for trace in traces:
+                if trace.info.request_id:
+                    trace = self.get(trace.info.request_id, experiment_id)
                 tid = re.sub(r"[^a-zA-Z0-9_\-]", "_", trace.info.request_id)
                 if not tid:
                     tid = uuid.uuid4().hex
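
A standalone sketch of the same hydration idea, for readers who want to run it outside the adapter. `TraceSketch`, `hydrate_and_name`, and the `fetch` callable are hypothetical stand-ins for the adapter's trace type, the `materialize()` loop, and `self.get(...)`; only the sanitizing regex and the `uuid` fallback are taken from the suggested fix.

```python
from __future__ import annotations

import re
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class TraceSketch:
    """Hypothetical stand-in for the adapter's trace object."""

    request_id: str
    data: dict[str, Any] = field(default_factory=dict)


def hydrate_and_name(
    rows: list[TraceSketch],
    fetch: Callable[[str], TraceSketch],
) -> dict[str, TraceSketch]:
    """Map a filesystem-safe file stem to the hydrated trace for each search row."""
    out: dict[str, TraceSketch] = {}
    for row in rows:
        # Search rows may be metadata-only, so re-fetch when an id is available.
        trace = fetch(row.request_id) if row.request_id else row
        # Same sanitizing pattern as the suggested fix; fall back to a random name.
        stem = re.sub(r"[^a-zA-Z0-9_\-]", "_", trace.request_id) or uuid.uuid4().hex
        out[stem] = trace
    return out
```

One trade-off worth noting: hydrating every row costs one extra request per trace, which the review comment accepts in exchange for exported files that contain span data.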

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/evalhub/adapter/mlflow.py` around lines 849 - 878, The traces returned by
self.search(...) are metadata-only on MLflow 3.x, so before writing each file
call self.get(request_id, experiment_id=experiment_id) to hydrate the full trace
(replace or merge trace.data with the hydrated result) — update the loop that
iterates over traces from search() to fetch the full trace via
self.get(trace.info.request_id, experiment_id=experiment_id) and then use that
hydrated trace when building trace_dict and writing to file; ensure you still
fall back to the uuid when request_id is empty and preserve the existing
filename logic (references: search, get, trace.info, trace.data).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/test_mlflow_traces.py`:
- Around line 77-132: Add pytest unit markers to the new tests in
tests/unit/test_mlflow_traces.py so they participate in marker-based selection:
either decorate each test function (test_parse_trace_flat_search_response,
test_parse_trace_info_endpoint_response,
test_artifact_server_path_odh_workspace,
test_artifact_server_path_upstream_run_ids,
test_artifact_server_path_http_proxied) with `@pytest.mark.unit`, or set
pytestmark = [pytest.mark.unit] at the module top; ensure pytest is imported if
adding decorators.

---

Outside diff comments:
In `@src/evalhub/adapter/mlflow.py`:
- Around line 849-878: The traces returned by self.search(...) are metadata-only
on MLflow 3.x, so before writing each file call self.get(request_id,
experiment_id=experiment_id) to hydrate the full trace (replace or merge
trace.data with the hydrated result) — update the loop that iterates over traces
from search() to fetch the full trace via self.get(trace.info.request_id,
experiment_id=experiment_id) and then use that hydrated trace when building
trace_dict and writing to file; ensure you still fall back to the uuid when
request_id is empty and preserve the existing filename logic (references:
search, get, trace.info, trace.data).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 88318ad6-77ae-437c-8207-4e309ad86bf9

📥 Commits

Reviewing files that changed from the base of the PR and between 162fa69 and 014bfdb.

📒 Files selected for processing (2)

src/evalhub/adapter/mlflow.py
tests/unit/test_mlflow_traces.py

coderabbitai · 2026-06-11T19:59:35Z

+def test_parse_trace_flat_search_response() -> None:
+    """Trace search returns trace fields at the top level, not wrapped in ``info``."""
+    raw = {
+        "request_id": "tr-abc123",
+        "experiment_id": "1",
+        "timestamp_ms": 1700000000000,
+        "execution_time_ms": 45,
+        "status": "OK",
+        "tags": [{"key": "framework", "value": "langgraph"}],
+        "request_metadata": [{"key": "source", "value": "test"}],
+    }
+    trace = _parse_trace(raw)
+    assert trace.info.request_id == "tr-abc123"
+    assert trace.info.experiment_id == "1"
+    assert trace.info.status == "OK"
+    assert trace.info.tags == {"framework": "langgraph"}
+    assert trace.info.request_metadata == {"source": "test"}
+
+
+def test_parse_trace_info_endpoint_response() -> None:
+    raw = {
+        "trace_info": {
+            "request_id": "tr-xyz",
+            "experiment_id": "2",
+            "timestamp_ms": 100,
+            "execution_time_ms": 10,
+            "status": "OK",
+            "tags": [],
+            "request_metadata": [],
+        }
+    }
+    trace = _parse_trace(raw)
+    assert trace.info.request_id == "tr-xyz"
+    assert trace.info.experiment_id == "2"
+
+
+def test_artifact_server_path_odh_workspace() -> None:
+    uri = "mlflow-artifacts:/workspaces/ws/1/run-abc/artifacts"
+    path = MlflowClient._artifact_server_path(uri, "results/out.json")
+    assert path == "/api/2.0/mlflow-artifacts/artifacts/1/run-abc/artifacts/results/out.json"
+
+
+def test_artifact_server_path_upstream_run_ids() -> None:
+    uri = "/private/tmp/mlflow/artifacts/6/run-abc/artifacts"
+    path = MlflowClient._artifact_server_path(
+        uri, "results/out.json", experiment_id="6", run_id="run-abc"
+    )
+    assert path == (
+        "/api/2.0/mlflow-artifacts/artifacts/6/run-abc/artifacts/results/out.json"
+    )
+
+
+def test_artifact_server_path_http_proxied() -> None:
+    uri = "http://localhost:5000/api/2.0/mlflow-artifacts/artifacts/1/run-abc/artifacts"
+    path = MlflowClient._artifact_server_path(uri, "out.json")
+    assert path == "/api/2.0/mlflow-artifacts/artifacts/1/run-abc/artifacts/out.json"
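
For orientation, here is one way a path resolver could satisfy the three artifact tests above. It is a sketch under stated assumptions, not the adapter's `MlflowClient._artifact_server_path`: the function name, the order of the branches, and the rule for stripping the `workspaces/<name>` prefix are inferred from the test inputs and expected outputs only.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Prefix taken from the expected values in the tests above.
API_PREFIX = "/api/2.0/mlflow-artifacts/artifacts"


def artifact_server_path(
    artifact_uri: str,
    artifact_path: str,
    experiment_id: str | None = None,
    run_id: str | None = None,
) -> str:
    """Hypothetical resolver mapping an artifact URI to a server upload path."""
    rel = artifact_path.lstrip("/")
    parsed = urlparse(artifact_uri)

    # HTTP-proxied URI: the server path is already embedded in the URL.
    if parsed.scheme in ("http", "https") and parsed.path.startswith(API_PREFIX):
        return f"{parsed.path.rstrip('/')}/{rel}"

    # mlflow-artifacts URI, optionally prefixed with workspaces/<name> (ODH style).
    if parsed.scheme == "mlflow-artifacts":
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) >= 2 and parts[0] == "workspaces":
            parts = parts[2:]
        return "/".join([API_PREFIX, *parts, rel])

    # Fallback: build the path from experiment and run identifiers.
    if experiment_id and run_id:
        return f"{API_PREFIX}/{experiment_id}/{run_id}/artifacts/{rel}"

    raise ValueError(f"cannot derive an artifact server path from {artifact_uri!r}")
```

The local-filesystem URI in the second test carries no usable server path, which is why that case depends on the explicit `experiment_id` and `run_id` arguments.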


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the required pytest markers to these new tests.

These additions are unmarked, so they will not participate in the repo's marker-based test selection. Add @pytest.mark.unit here, or set pytestmark once at module scope if this whole file is a unit suite.

💡 Suggested fix

 import pytest

 from evalhub.adapter.mlflow import (
     MlflowClient,
     TracesNamespace,
     _parse_trace,
 )
+
+pytestmark = [pytest.mark.unit]

As per coding guidelines, tests/**/*.py: Mark tests with pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.adapter, @pytest.mark.e2e.

Also applies to: 138-160

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_mlflow_traces.py` around lines 77 - 132, Add pytest unit
markers to the new tests in tests/unit/test_mlflow_traces.py so they participate
in marker-based selection: either decorate each test function
(test_parse_trace_flat_search_response,
test_parse_trace_info_endpoint_response,
test_artifact_server_path_odh_workspace,
test_artifact_server_path_upstream_run_ids,
test_artifact_server_path_http_proxied) with `@pytest.mark.unit`, or set
pytestmark = [pytest.mark.unit] at the module top; ensure pytest is imported if
adding decorators.

Source: Coding guidelines

ruivieira

LGTM, thanks @hemajv

Update mlflow adapter to be compatible with latest 3.x versions

014bfdb

coderabbitai Bot reviewed Jun 11, 2026

ruivieira self-assigned this Jun 12, 2026

ruivieira added the community-pr External contributor PR label Jun 12, 2026

ruivieira added this to EvalHub Roadmap Jun 12, 2026

github-project-automation Bot moved this to Todo in EvalHub Roadmap Jun 12, 2026

ruivieira self-requested a review June 29, 2026 14:06

ruivieira moved this from Todo to In Progress in EvalHub Roadmap Jun 29, 2026

ruivieira added 2 commits June 29, 2026 17:47

Merge branch 'upstream-main' into pr-144

0462e7a

chore: apply pre-commit fixes (ruff-format, end-of-file-fixer)

7727239

ruivieira changed the title ~~Update mlflow adapter to be compatible with latest 3.x versions~~ feat: Update mlflow adapter to be compatible with latest 3.x versions Jun 29, 2026

ruivieira approved these changes Jun 29, 2026

ruivieira merged commit e6bd150 into eval-hub:main Jun 29, 2026
7 of 8 checks passed

github-project-automation Bot moved this from In Progress to Done in EvalHub Roadmap Jun 29, 2026

ruivieira merged 3 commits into eval-hub:main from hemajv:update-mlflow
