[Docs] Convert Sphinx docs to Fern#2411

Draft
msarahan wants to merge 9 commits into rapidsai:main from msarahan:codex/migrate-docs-to-fern

Conversation

@msarahan
Contributor

@msarahan msarahan commented May 19, 2026

Summary

Related to rapidsai/build-infra#364 and follows the rapidsai/cuvs#2067 Fern migration pattern for the non-API documentation surface.

  • add Fern docs configuration, build wrapper, migrated pages, and docs navigation
  • generate API reference pages through the existing Sphinx/Breathe/autodoc pipeline instead of a handwritten source parser
  • restore the old API wrapper pages under fern/sphinx/source, generate Doxygen XML, run Sphinx with sphinx-markdown-builder, and copy rendered Markdown into Fern pages at build time
  • replace the Sphinx docs CI flow with Fern validation through fern/build_docs.sh
  • keep the minimal Doxygen/Sphinx/Breathe/autodoc docs dependencies required for API extraction and regenerate conda environment files
  • add build.sh docs, Fern CODEOWNERS, and PR change-detection updates
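As a rough illustration of the build-time copy step described in the third bullet, the sketch below copies rendered Markdown from a Sphinx output tree into a Fern pages tree. The function name `copy_rendered_markdown` and both directory arguments are hypothetical; the actual logic in fern/build_docs.sh and fern/scripts/generate_api_reference.py may be organized differently.

```python
from pathlib import Path
import shutil


def copy_rendered_markdown(sphinx_out: Path, fern_pages: Path) -> list[Path]:
    """Copy every rendered .md file into the Fern pages tree.

    Hypothetical helper: directory names and layout are assumptions,
    not taken from the PR.
    """
    copied = []
    for src in sorted(sphinx_out.rglob("*.md")):
        # Preserve the relative layout of the Sphinx output directory.
        dest = fern_pages / src.relative_to(sphinx_out)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
        copied.append(dest)
    return copied
```

Returning the list of copied paths makes it easy for a unit test to check the result without a docs environment installed.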

Testing

  • python -m pytest fern/scripts/tests
  • python -m py_compile fern/scripts/generate_api_reference.py fern/scripts/tests/test_generate_api_reference.py fern/scripts/tests/test_build_docs.py
  • pre-commit run --all-files

Local ./fern/build_docs.sh check and ./build.sh docs stop before Fern validation because this Mac does not have the docs environment installed (doxygen, Sphinx, and importable RMM packages). The generated docs path is now covered by unit tests and is expected to run in the RAPIDS docs environment from dependencies.yaml.

@msarahan msarahan requested review from a team as code owners May 19, 2026 22:49
@msarahan msarahan added the doc (Documentation) and improvement (Improvement / enhancement to an existing function) labels May 19, 2026
@coderabbitai

coderabbitai Bot commented May 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Replace Sphinx with Fern: update CI/workflows and build scripts, change conda/dependency manifests, add Fern site/config and build helper, add an API-reference generator with tests, add many generated Fern C++/Python pages, and remove old Sphinx docs.

Changes

Documentation System Migration

| Layer / File(s) | Summary |
| --- | --- |
| All migration changes: multiple files across repo (CI, build, conda, fern/, docs/, python/) | Consolidated changes implementing the migration from Sphinx to Fern: CODEOWNERS and workflow exclusions; build.sh and CI ci/build_docs.sh updated to run Fern; conda environments and dependencies.yaml updated (remove Sphinx/Breathe, add Node.js>=18 and tooling); add fern/docs.yml and fern/fern.config.json; add fern/build_docs.sh, fern/scripts/generate_api_reference.py and tests; add many generated fern/pages/* C++ and Python API pages; remove prior docs/* Sphinx artifacts; small test and docstring fixes. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

improvement

Suggested reviewers

  • wence-
  • shrshi
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 1.59% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title '[Docs] Convert Sphinx docs to Fern' clearly describes the main change: migrating documentation from Sphinx to Fern. It is specific, concise, and directly reflects the primary objective of the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly explains the migration from Sphinx to Fern documentation, including configuration changes, CI updates, and testing performed. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
fern/pages/cpp_api/cpp-api-namespaces.md (2)

21-39: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Duplicate heading: differentiate class declaration from constructor.

The heading "Exec Policy" appears twice (lines 21 and 31), which hinders navigation. Consider using distinct headings such as "Exec Policy Class" and "Exec Policy Constructor", or use sub-headings (e.g., ####) for the constructor.

The same issue occurs with "Exec Policy Nosync" at lines 51 and 61.

Note: Since this is a generated file, the fix should be applied in fern/scripts/generate_api_reference.py to distinguish between class/type declarations and their member functions (constructors).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/cpp_api/cpp-api-namespaces.md` around lines 21 - 39, The generated
markdown has duplicate headings like "Exec Policy" and "Exec Policy Nosync" for
both the type/class declaration and its constructor; update
fern/scripts/generate_api_reference.py so the generator emits distinct headings
for declarations vs members (e.g., "Exec Policy Class" or "Exec Policy
Constructor", or render the constructor as a deeper sub-heading such as ####)
when producing names for class/type declarations and member
functions/constructors (ensure the logic that builds headings for symbols such
as Exec Policy and Exec Policy Nosync differentiates declaration_kind and
member_kind and appends or adjusts heading level accordingly).
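One way to picture the heading fix is a small helper that appends a kind suffix to the display name. This is only a sketch: `heading_for` and the kind labels ("class", "struct", "constructor") are hypothetical and do not come from generate_api_reference.py.

```python
def heading_for(display_name: str, kind: str, level: int = 3) -> str:
    """Build a Markdown heading that separates a type from its constructor.

    The `kind` labels are hypothetical; the real generator may track
    declaration and member kinds under different names.
    """
    suffixes = {"class": "Class", "struct": "Struct", "constructor": "Constructor"}
    suffix = suffixes.get(kind)
    # Kinds without a suffix (for example free functions) keep the plain name.
    title = f"{display_name} {suffix}" if suffix else display_name
    return f"{'#' * level} {title}"
```

With this shape, the two "Exec Policy" entries would render as "### Exec Policy Class" and "### Exec Policy Constructor", and passing `level=4` for members gives the deeper sub-heading variant.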

47-55: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct file reference and fix truncated documentation descriptions in cpp-api-utilities.md.

The documentation strings are truncated mid-sentence at lines 49, 59, and 79 in cpp-api-utilities.md:

  • Line 49: "Determine at runtime if the CUDA driver supports the stream-ordered" (missing subject of check)
  • Line 59: "Check whether the specified cudaMemAllocationHandleType is supported on the present" (missing target device/driver)
  • Line 79: "Determine at runtime if the CUDA driver/runtime supports the stream-ordered" (missing subject of check)

Since this is a generated file, the truncation likely originates in the generator script (fern/scripts/generate_api_reference.py). Update it to preserve complete documentation strings without cutting them off mid-phrase.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/cpp_api/cpp-api-namespaces.md` around lines 47 - 55, The generated
docs show truncated sentences in cpp-api-utilities.md (phrases ending
"stream-ordered" and "present" coming from the generator), so update the
generator script generate_api_reference.py to stop slicing or truncating
description strings: locate the code that formats or splits the source docstring
(look for operations like [:N], splitlines(), fixed-width formatting or column
truncation) and change it to emit the full description (use the raw/whole
docstring, join wrapped lines only for display without dropping words, or
increase the length limit), then regenerate the docs so the full sentences
(e.g., the CUDA "stream-ordered" support checks and the
cudaMemAllocationHandleType target device/driver sentence) are preserved.
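The truncation described above is what one would see if only the first physical line of a `@brief` paragraph were kept. A minimal sketch of joining the whole paragraph follows; `extract_brief` is a hypothetical name, and the approach (strip comment decoration, then read up to the next blank line or Doxygen command) is an assumption about what a fix could look like, not the generator's actual code.

```python
import re


def extract_brief(comment: str) -> str:
    """Return the full @brief paragraph of a /** ... */ block as one line."""
    body = []
    for raw in comment.splitlines():
        # Drop the comment delimiters and the leading "*" decoration.
        line = re.sub(r"^\s*(/\*\*+|\*/|\*)\s?", "", raw).rstrip()
        line = line.removesuffix("*/").rstrip()
        body.append(line)
    text = "\n".join(body)
    # The paragraph ends at a blank line, the next command, or end of text.
    match = re.search(
        r"[@\\]brief\s+(.*?)(?:\n\s*\n|\n\s*[@\\]\w+|\Z)", text, re.DOTALL
    )
    if match is None:
        return ""
    # Join wrapped lines with single spaces so no words are lost.
    return " ".join(match.group(1).split())
```

Applied to a comment whose brief wraps after "stream-ordered", this returns the complete sentence instead of stopping at the line break.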
🧹 Nitpick comments (2)
fern/scripts/generate_api_reference.py (1)

283-285: ⚡ Quick win

Disambiguate overload headings to eliminate duplicate ### titles (MD024).

Using only entry.name as the heading text creates repeated headings for overloaded symbols across generated pages.

Suggested patch
 def render_cpp_entry(entry: CppEntry) -> list[str]:
-    lines = [f"### {entry.name}", ""]
+    lines = [f"### {entry.name} (`{entry.source.name}:{entry.line}`)", ""]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/scripts/generate_api_reference.py` around lines 283 - 285, The generated
C++ entry headings use only entry.name which causes duplicate "###" titles for
overloads; update render_cpp_entry to disambiguate headings by including a
unique identifier such as the function signature or parameter list (e.g., append
entry.signature or formatted parameter types) or an overload index from CppEntry
so each heading becomes "### {entry.name} — {entry.signature}" (or similar)
ensuring overloads produce unique Markdown headings; locate render_cpp_entry and
CppEntry usage to pick the available field (signature/prototype/params) and
update the heading generation accordingly.
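A sketch of the signature-based variant is shown below. It assumes each entry can be reduced to a (name, signature) pair; whether `CppEntry` actually exposes a signature field is not confirmed here, and `disambiguate_headings` is a hypothetical helper.

```python
from collections import Counter


def disambiguate_headings(entries: list[tuple[str, str]]) -> list[str]:
    """Render one '###' heading per (name, signature) pair.

    Names that occur once keep the plain heading. Repeated names get the
    signature appended, plus a counter if the signature itself repeats.
    """
    totals = Counter(name for name, _ in entries)
    seen = Counter()
    headings = []
    for name, signature in entries:
        if totals[name] == 1:
            heading = f"### {name}"
        else:
            heading = f"### {name} (`{signature}`)"
        seen[heading] += 1
        if seen[heading] > 1:
            # Identical signatures still need distinct headings.
            heading = f"{heading} [{seen[heading]}]"
        headings.append(heading)
    return headings
```

Compared with appending the source file and line number, this keeps headings stable when unrelated lines are added to the header, at the cost of longer titles.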
fern/scripts/tests/test_generate_api_reference.py (1)

59-64: ⚡ Quick win

Add regression assertions for leaked Doxygen tokens in generated signatures.

This test currently misses the malformed-output class of failures where `* @brief` / `*/` leaks into fenced C++ signatures.

Suggested patch
     generated_text = "\n".join(
         path.read_text(encoding="utf-8")
         for path in [cpp_index, cpp_memory, python_rmm, python_mr]
     )
     assert ".. automodule::" not in generated_text
     assert "```{doxygengroup}" not in generated_text
+    assert "* @brief" not in generated_text
+    assert "*/" not in generated_text

As per coding guidelines **/*test*.{py,cpp,c,h,hpp}: Make sure to update unit tests when implementing features or bug-fixes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/scripts/tests/test_generate_api_reference.py` around lines 59 - 64, The
test currently checks for leaked Doxygen tokens only for ".. automodule::" and
"{doxygengroup}", but it misses C++ comment fragments that can leak into fenced
signatures; update the test in test_generate_api_reference (where generated_text
is built from cpp_index, cpp_memory, python_rmm, python_mr) to also assert that
"* `@brief`" and "*/" are not present in generated_text so malformed C++ doc
comments are caught (i.e., add assertions that "* `@brief`" not in generated_text
and "*/" not in generated_text).
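If the list of forbidden tokens keeps growing, the individual assertions could be folded into one helper that reports which tokens leaked. The sketch below is an illustration only; `find_leaked_tokens` and `LEAKED_TOKENS` are hypothetical names, and the literal searched for is `* @brief` without the backticks the review bot adds around `@` symbols.

```python
# Tokens that should never survive into generated Markdown pages.
# The fenced-directive token is built with "`" * 3 to keep this block readable.
LEAKED_TOKENS = (
    ".. automodule::",
    "`" * 3 + "{doxygengroup}",
    "* @brief",
    "*/",
)


def find_leaked_tokens(text: str, tokens: tuple[str, ...] = LEAKED_TOKENS) -> list[str]:
    """Return every forbidden token that appears in `text`, in tuple order."""
    return [token for token in tokens if token in text]
```

A test can then assert `find_leaked_tokens(generated_text) == []`, which prints the offending tokens on failure.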
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@fern/build_docs.sh`:
- Around line 46-52: The fallback branch that sets FERN_CMD=("npx" "--yes"
"fern-api") should pin the package to a tested version or prefer CI-provided
FERN_CLI; change the else branch so FERN_CMD uses "npx --yes
fern-api@<TESTED_VERSION>" (or document/require CI to export FERN_CLI) instead
of the unpinned "fern-api" to ensure reproducible builds; update the string in
the FERN_CMD assignment and any README/CI docs to reflect the chosen pinned
version.

In `@fern/pages/cpp_api/cpp-api-thrust-integrations.md`:
- Around line 21-39: The generated docs duplicate the "### Exec Policy" (and
"### Exec Policy Nosync") headings for the class and its constructor, causing
navigation ambiguity; update fern/scripts/generate_api_reference.py to detect
when a heading corresponds to a constructor/initializer and emit a distinct
heading (e.g., "### Exec Policy Constructor" or include the signature) for the
exec_policy constructor instead of reusing the class heading—apply the same
change for exec_policy_nosync and any other class+constructor pairs by matching
class names like exec_policy and their constructor signatures and altering the
heading generation logic accordingly.

In `@fern/pages/cpp_api/cpp-api-utilities.md`:
- Around line 77-85: Update the truncated description for
runtime_async_managed_alloc so it reads fully: "Determine at runtime if the CUDA
driver/runtime supports the stream-ordered managed memory pools" (or equivalent
wording that includes "stream-ordered managed memory pools"); edit the markdown
section that documents the struct runtime_async_managed_alloc and replace the
incomplete phrase "Determine at runtime if the CUDA driver/runtime supports the
stream-ordered" with the complete phrase while keeping the existing source
attribution to runtime_capabilities.hpp.
- Around line 57-65: The markdown generator is truncating multi-line doc
comments (causing the description for export_handle_type to end prematurely);
update the extraction logic that reads comments from
cpp/include/rmm/detail/runtime_capabilities.hpp so it collects the entire
multi-line comment block (not just the first line) and preserves newline
boundaries and sentences when emitting the description for struct
export_handle_type; specifically modify the parser to continue reading until the
comment terminator (or blank line) and concatenate lines (or preserve line
breaks) into the generated markdown field instead of taking only the first line.
- Around line 47-55: The generated markdown for runtime_async_alloc is
truncating the doc comment at line breaks; update the generator that extracts
comments for the symbol runtime_async_alloc (source:
cpp/include/rmm/detail/runtime_capabilities.hpp) to capture multi-line
documentation blocks rather than stopping at the first newline—modify the parser
to read full comment blocks (e.g., accumulate consecutive comment lines or parse
/** ... */ / /// blocks entirely) and use the complete comment text when
emitting the markdown so the full sentence "Determine at runtime if the CUDA
driver supports the stream-ordered memory allocator functions." is preserved.

In `@fern/pages/python_api/python-api-rmm-mr.md`:
- Around line 187-193: Duplicate level-3 headings like `get_allocated_bytes` and
`pool_handle` cause ambiguous anchors; update each duplicate heading to include
Fern's custom slug override (e.g., change the second occurrences of
`get_allocated_bytes` and `pool_handle` in python-api-rmm-mr.md and the
duplicates in python-api-rmm-pylibrmm.md to use unique slugs such as `###
\`get_allocated_bytes\` [`#get-allocated-bytes-limiting`]` and `###
\`pool_handle\` [`#pool-handle-cuda-async`]`) so each heading has a stable, unique
anchor while preserving the visible title.

In `@fern/pages/python_api/python-api-rmm-pylibrmm.md`:
- Around line 109-123: The docs contain duplicate level-3 headings (e.g.,
copy_to_host, to_device, get_allocated_bytes) which produce non-unique anchors;
update the Markdown headings to disambiguate each overload and same-named
methods in different classes — for example rename the two copy_to_host headings
to "copy_to_host (no buffer)" and "copy_to_host (with buffer)", rename duplicate
to_device headings similarly, and for get_allocated_bytes include the class name
(e.g., "get_allocated_bytes — DeviceAllocator") or move these method docs under
their parent class headings so each generated anchor is unique and stable;
adjust the headings wherever the symbols copy_to_host, to_device, and
get_allocated_bytes appear in this file.

In `@fern/pages/python_api/python-api-rmm-statistics.md`:
- Around line 113-128: The docstring wrongly repeats the same function name:
change the duplicate "push_statistics()" to "pop_statistics()" in the Statistics
context manager docstring so it reads "using `push_statistics()` and
`pop_statistics()`"; locate the text in the statistics context implementation in
python/rmm/rmm/statistics.py (around the context enter/exit description) and
update the wording, then regenerate the docs.

In `@fern/scripts/generate_api_reference.py`:
- Around line 250-264: The loop that collects declaration lines overreads
because it doesn't skip interior block-comment lines and doesn't treat one-line
brace pairs as complete; add an in_block_comment flag and update the loop around
processing suffix.splitlines() (variables: raw_line, stripped, in_block_comment,
lines, brace_depth) so that lines between "/*" and "*/" (including leading "*"
lines like "* `@brief`") are skipped, toggling in_block_comment when encountering
"/*" or "*/", and add a completion check that treats a line containing both "{"
and "}" (a one-line definition) as a finished declaration in addition to the
existing stripped.endswith(";") and brace_depth logic to prevent swallowing
adjacent declarations.
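The last item above can be sketched as a standalone loop. This is a simplified stand-in for the generator's code, written under assumptions: `collect_declaration` is a hypothetical name, the input is the source text that follows a doc comment, and a block comment that shares a line with code is not handled.

```python
def collect_declaration(source: str) -> list[str]:
    """Collect the lines of the first declaration in `source`, skipping comments."""
    lines = []
    brace_depth = 0
    in_block_comment = False
    for raw_line in source.splitlines():
        stripped = raw_line.strip()
        if in_block_comment:
            # Interior lines such as "* @brief ..." are skipped until "*/".
            if "*/" in stripped:
                in_block_comment = False
            continue
        if stripped.startswith("/*"):
            # A comment that opens and closes on one line needs no state.
            in_block_comment = "*/" not in stripped
            continue
        if not stripped or stripped.startswith("//"):
            continue
        lines.append(stripped)
        brace_depth += stripped.count("{") - stripped.count("}")
        # A one-line definition such as "int f() { return 1; }" is complete.
        one_line_body = "{" in stripped and "}" in stripped and brace_depth == 0
        if (
            stripped.endswith(";")
            or one_line_body
            or (stripped.endswith("}") and brace_depth == 0)
        ):
            break
    return lines
```

The one-line-body check is what stops the loop from reading past `int value() const { return 1; }` into the doc comment of the next declaration.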


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3c10e2dc-b94e-4b45-b49b-9d193b6f24a5

📥 Commits

Reviewing files that changed from the base of the PR and between 3cd0279 and 0940df6.

⛔ Files ignored due to path filters (1)
  • docs/_static/RAPIDS-logo-purple.png is excluded by !**/*.png
📒 Files selected for processing (54)
  • .github/CODEOWNERS
  • .github/workflows/pr.yaml
  • build.sh
  • ci/build_docs.sh
  • conda/environments/all_cuda-129_arch-aarch64.yaml
  • conda/environments/all_cuda-129_arch-x86_64.yaml
  • conda/environments/all_cuda-132_arch-aarch64.yaml
  • conda/environments/all_cuda-132_arch-x86_64.yaml
  • dependencies.yaml
  • docs/Makefile
  • docs/conf.py
  • docs/cpp/cuda_device_management.md
  • docs/cpp/cuda_streams.md
  • docs/cpp/data_containers.md
  • docs/cpp/errors.md
  • docs/cpp/index.md
  • docs/cpp/memory_resources/index.md
  • docs/cpp/memory_resources/memory_resource_adaptors.md
  • docs/cpp/memory_resources/memory_resources.md
  • docs/cpp/rmm_namespace.md
  • docs/cpp/thrust_integrations.md
  • docs/cpp/utilities.md
  • docs/index.md
  • docs/python/allocators.md
  • docs/python/index.md
  • docs/python/librmm.md
  • docs/python/mr.md
  • docs/python/pylibrmm.md
  • docs/python/rmm.md
  • docs/python/statistics.md
  • fern/build_docs.sh
  • fern/docs.yml
  • fern/fern.config.json
  • fern/pages/cpp_api/cpp-api-cuda-device-management.md
  • fern/pages/cpp_api/cpp-api-cuda-streams.md
  • fern/pages/cpp_api/cpp-api-data-containers.md
  • fern/pages/cpp_api/cpp-api-errors.md
  • fern/pages/cpp_api/cpp-api-memory-resource-adaptors.md
  • fern/pages/cpp_api/cpp-api-memory-resources.md
  • fern/pages/cpp_api/cpp-api-namespaces.md
  • fern/pages/cpp_api/cpp-api-thrust-integrations.md
  • fern/pages/cpp_api/cpp-api-utilities.md
  • fern/pages/cpp_api/index.md
  • fern/pages/index.md
  • fern/pages/python_api/index.md
  • fern/pages/python_api/python-api-rmm-allocators.md
  • fern/pages/python_api/python-api-rmm-librmm.md
  • fern/pages/python_api/python-api-rmm-mr.md
  • fern/pages/python_api/python-api-rmm-pylibrmm.md
  • fern/pages/python_api/python-api-rmm-statistics.md
  • fern/pages/python_api/python-api-rmm.md
  • fern/pages/user_guide/guide.md
  • fern/scripts/generate_api_reference.py
  • fern/scripts/tests/test_generate_api_reference.py
💤 Files with no reviewable changes (21)
  • docs/cpp/index.md
  • docs/python/pylibrmm.md
  • docs/cpp/thrust_integrations.md
  • docs/python/rmm.md
  • docs/cpp/utilities.md
  • docs/python/librmm.md
  • docs/cpp/rmm_namespace.md
  • docs/cpp/memory_resources/memory_resources.md
  • docs/python/statistics.md
  • docs/cpp/memory_resources/memory_resource_adaptors.md
  • docs/cpp/memory_resources/index.md
  • docs/python/index.md
  • docs/cpp/data_containers.md
  • docs/Makefile
  • docs/python/allocators.md
  • docs/cpp/cuda_device_management.md
  • docs/cpp/errors.md
  • docs/conf.py
  • docs/cpp/cuda_streams.md
  • docs/python/mr.md
  • docs/index.md

Comment thread fern/build_docs.sh
Comment thread fern/pages/cpp_api/cpp-api-thrust-integrations.md Outdated
Comment thread fern/pages/cpp_api/cpp-api-utilities.md Outdated
Comment thread fern/pages/cpp_api/cpp-api-utilities.md Outdated
Comment thread fern/pages/cpp_api/cpp-api-utilities.md Outdated
Comment thread fern/pages/python_api/python-api-rmm-mr.md Outdated
Comment thread fern/pages/python_api/python-api-rmm-pylibrmm.md Outdated
Comment thread fern/pages/python_api/python-api-rmm-statistics.md Outdated
Comment thread fern/scripts/generate_api_reference.py Outdated
@msarahan msarahan marked this pull request as draft May 19, 2026 22:57
@copy-pr-bot

copy-pr-bot Bot commented May 19, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
fern/scripts/generate_api_reference.py (1)

284-292: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize extracted C++ declaration lines before storing them.

Current extraction keeps inline // comments and opening-body { tokens, which produces malformed snippets in generated pages. This shows up as signatures containing comment text and partial definitions.

Suggested fix
-        lines.append(stripped)
-        brace_depth += stripped.count("{") - stripped.count("}")
+        candidate = re.sub(r"//.*$", "", stripped).rstrip()
+        if not candidate:
+            line += 1
+            continue
+        if "{" in candidate and not candidate.endswith("{}"):
+            candidate = candidate.split("{", 1)[0].rstrip()
+            if not candidate:
+                line += 1
+                continue
+        lines.append(candidate)
+        brace_depth += candidate.count("{") - candidate.count("}")
         if (
-            stripped.endswith(";")
-            or stripped.endswith("{}")
-            or ("{" in stripped and "}" in stripped and brace_depth == 0)
-            or (stripped.endswith("}") and brace_depth == 0)
-            or (brace_depth > 0 and "{" in stripped)
+            candidate.endswith(";")
+            or candidate.endswith("{}")
+            or (candidate.endswith("}") and brace_depth == 0)
         ):
             break
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/scripts/generate_api_reference.py` around lines 284 - 292, In
generate_api_reference.py, when building declaration lines (the block that
updates lines.append(stripped) and computes brace_depth), sanitize each stripped
string before storing: strip trailing/leading whitespace, remove any inline C++
single-line comments (// and everything after), drop stray opening-body "{"
tokens appended to declarations, and collapse multiple spaces so signatures stay
clean; then recompute brace_depth from the sanitized text and use that sanitized
value in the subsequent checks (the same variables referenced in the diff:
stripped and brace_depth) so only proper declarations/signatures are appended.
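The per-line part of the suggested fix can be isolated into a small function, which also makes it easy to unit test. The sketch below follows the `re.sub` and `split("{", 1)` steps from the suggested patch; the function name `sanitize_declaration_line` is hypothetical, and the comment stripping is naive (a `//` inside a string literal would also be cut).

```python
import re


def sanitize_declaration_line(stripped: str) -> str:
    """Drop a trailing // comment and an opening function body from one line."""
    candidate = re.sub(r"//.*$", "", stripped).rstrip()
    if "{" in candidate and not candidate.endswith("{}"):
        # Keep only the signature that precedes the opening brace.
        candidate = candidate.split("{", 1)[0].rstrip()
    # Collapse runs of whitespace so signatures render consistently.
    return re.sub(r"\s+", " ", candidate).strip()
```

An empty return value signals that the line held nothing but a comment or a body, so the caller can skip it before updating `brace_depth`.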
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@fern/scripts/generate_api_reference.py`:
- Around line 642-646: In convert_doxygen_text the regex replacement that strips
Doxygen commands (currently replacing r"[@\\]\w+\b" with "") can concatenate
neighboring words; change that replacement to insert a single space instead and
then normalize whitespace (collapse multiple spaces/newlines and strip) after
all replacements so words remain separated; keep the existing handling for
brief/return and parameter name quoting (function convert_doxygen_text) and
ensure the final return still removes braces and calls .strip() after the
whitespace normalization.

In `@fern/scripts/tests/test_generate_api_reference.py`:
- Around line 100-103: The test is asserting a hard-coded source line number
("line 58") which is brittle; update the assertion in
test_generate_api_reference.py to check the generated heading shape without a
fixed line number by matching a numeric placeholder instead. Locate the
assertion that inspects python_pylibrmm.read_text(encoding="utf-8") for the
string containing "### `copy_to_host` (DeviceBuffer, line 58)" and replace it
with a check that the text contains the heading for copy_to_host and
DeviceBuffer and that the "line <digits>" portion is numeric (e.g., use a regex
match or assert the substring "### `copy_to_host` (DeviceBuffer, line " is
present and followed by digits). Ensure you reference the same symbols:
copy_to_host, DeviceBuffer, and python_pylibrmm.read_text when making the
change.
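The pattern-based check for the heading could look like the sketch below. The heading text is taken from the review comment; the helper name `has_copy_to_host_heading` and the module-level regex are illustrative choices, not code from the test file.

```python
import re

# Match the generated heading while accepting any numeric line number.
HEADING_RE = re.compile(
    r"^### `copy_to_host` \(DeviceBuffer, line \d+\)$", re.MULTILINE
)


def has_copy_to_host_heading(text: str) -> bool:
    """Return True if the page contains the copy_to_host heading in the expected shape."""
    return HEADING_RE.search(text) is not None
```

In the test this would replace the exact-string comparison with `assert has_copy_to_host_heading(python_pylibrmm.read_text(encoding="utf-8"))`, so edits that shift source lines no longer break it.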


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4a6249b5-c65a-47ca-be8b-537e8384a72c

📥 Commits

Reviewing files that changed from the base of the PR and between 0940df6 and 82d5527.

📒 Files selected for processing (16)
  • fern/build_docs.sh
  • fern/pages/cpp_api/cpp-api-cuda-device-management.md
  • fern/pages/cpp_api/cpp-api-cuda-streams.md
  • fern/pages/cpp_api/cpp-api-data-containers.md
  • fern/pages/cpp_api/cpp-api-errors.md
  • fern/pages/cpp_api/cpp-api-memory-resource-adaptors.md
  • fern/pages/cpp_api/cpp-api-memory-resources.md
  • fern/pages/cpp_api/cpp-api-thrust-integrations.md
  • fern/pages/cpp_api/cpp-api-utilities.md
  • fern/pages/python_api/python-api-rmm-mr.md
  • fern/pages/python_api/python-api-rmm-pylibrmm.md
  • fern/pages/python_api/python-api-rmm-statistics.md
  • fern/scripts/generate_api_reference.py
  • fern/scripts/tests/test_build_docs.py
  • fern/scripts/tests/test_generate_api_reference.py
  • python/rmm/rmm/statistics.py
✅ Files skipped from review due to trivial changes (9)
  • fern/scripts/tests/test_build_docs.py
  • fern/pages/cpp_api/cpp-api-cuda-streams.md
  • fern/pages/cpp_api/cpp-api-cuda-device-management.md
  • python/rmm/rmm/statistics.py
  • fern/pages/python_api/python-api-rmm-statistics.md
  • fern/pages/cpp_api/cpp-api-utilities.md
  • fern/pages/python_api/python-api-rmm-pylibrmm.md
  • fern/pages/cpp_api/cpp-api-data-containers.md
  • fern/pages/python_api/python-api-rmm-mr.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • fern/pages/cpp_api/cpp-api-thrust-integrations.md

Comment thread fern/scripts/generate_api_reference.py Outdated
Comment thread fern/scripts/tests/test_generate_api_reference.py
@msarahan msarahan added the non-breaking (Non-breaking change) label and removed the improvement (Improvement / enhancement to an existing function) label May 20, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@fern/pages/cpp_api/cpp-api-data-containers.md`:
- Around line 804-805: The sentence contains a malformed inline code span around
rmm::device_uvector causing broken Markdown; fix it by ensuring the destructor
reference is properly wrapped as inline code (e.g., `rmm::device_uvector`
destructor) or by rephrasing to use plain text; update the sentence that
mentions rmm::device_uvector, resize(), shrink_to_fit(), and destructor so all
backticks are balanced and the identifiers render correctly.

In `@fern/scripts/generate_api_reference.py`:
- Around line 158-159: The LINE_COMMENT_RE currently matches Doxygen trailing
comments like "///<" or "//!<", which document the previous declaration; update
the regex used in LINE_COMMENT_RE to exclude those backward-attached markers
(e.g. add a negative lookahead that rejects a '<' immediately after the "///" or
"//!" prefix, so "///<" and "//!<" are not matched), and apply the same change to
the related regexes referenced around lines 287-291 so those forward parsing
patterns also ignore backward-attached Doxygen comments.
- Around line 718-722: When encountering an unsupported Doxygen command in the
is_doxygen_command branch the code currently calls collect_doxygen_text(lines,
index, "") which throws away any inline payload on the same line; change the
flow so the inline remainder is preserved by passing the current paragraph text
(or concatenating paragraph[0]) into collect_doxygen_text instead of the empty
string, then append the merged result to rendered via append_doxygen_paragraph;
update the logic around is_doxygen_command, append_doxygen_paragraph, and
collect_doxygen_text to pass and/or combine the existing paragraph string with
the collected text rather than discarding it.
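
The first finding for this file (excluding backward-attached Doxygen markers) might look roughly like the sketch below. The actual `LINE_COMMENT_RE` pattern is not shown in this review, so the regex here is a hypothetical stand-in, and `forward_doc_text` is an illustrative helper that does not exist in the PR.

```python
import re

# Hypothetical stand-in for LINE_COMMENT_RE; the real pattern in
# generate_api_reference.py is not shown in this review.
# "//" followed by "/" or "!" starts a Doxygen line comment; the negative
# lookahead (?!<) rejects the trailing forms "///<" and "//!<", which
# document the previous declaration rather than the next one.
LINE_COMMENT_RE = re.compile(r"^\s*//[/!](?!<)\s?(?P<text>.*)$")


def forward_doc_text(line):
    """Return the comment text for a forward-attached Doxygen line, else None."""
    match = LINE_COMMENT_RE.match(line)
    return match.group("text") if match else None
```

With this shape, plain `//` comments and trailing `///<` comments both fall through to `None`, so only comments that document the following declaration are collected.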

In `@fern/scripts/tests/test_generate_api_reference.py`:
- Line 104: Replace the brittle exact-string assertion that expects "### Device
Buffer Constructor (device_buffer.hpp:116)" with a pattern-based check against
data_text so the test matches any numeric line number; specifically, change the
assertion referencing "### Device Buffer Constructor (device_buffer.hpp:116)" to
use a regex search that looks for "### Device Buffer Constructor
(device_buffer.hpp:" followed by one or more digits and a closing ")" (e.g., a
re.search on data_text for that pattern) so source edits that change line
numbers won't break the test.
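
A minimal sketch of the pattern-based check described in the last finding, assuming `data_text` holds the generated Markdown for the data containers page. The sample string and the `has_constructor_heading` helper are illustrative only and are not taken from the PR's test file.

```python
import re

# Hypothetical sample of generated Markdown; in the real test, data_text
# would be read from the generated page.
data_text = "### Device Buffer Constructor (device_buffer.hpp:116)\n"

# Parentheses and the dot are escaped so they match literally; \d+ accepts
# any line number instead of pinning the test to 116.
HEADING_RE = re.compile(
    r"### Device Buffer Constructor \(device_buffer\.hpp:\d+\)"
)


def has_constructor_heading(text):
    """Return True if the heading appears with any numeric line number."""
    return HEADING_RE.search(text) is not None


assert has_constructor_heading(data_text)
```

The check still fails if the heading text or file name changes, which is the part of the output the test is meant to protect.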

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 09375a27-036d-4eb7-ab29-d15f8c44d7dc

📥 Commits

Reviewing files that changed from the base of the PR and between 43b1b47 and d1696b2.

📒 Files selected for processing (10)
  • fern/pages/cpp_api/cpp-api-cuda-device-management.md
  • fern/pages/cpp_api/cpp-api-cuda-streams.md
  • fern/pages/cpp_api/cpp-api-data-containers.md
  • fern/pages/cpp_api/cpp-api-errors.md
  • fern/pages/cpp_api/cpp-api-memory-resource-adaptors.md
  • fern/pages/cpp_api/cpp-api-memory-resources.md
  • fern/pages/cpp_api/cpp-api-thrust-integrations.md
  • fern/pages/cpp_api/cpp-api-utilities.md
  • fern/scripts/generate_api_reference.py
  • fern/scripts/tests/test_generate_api_reference.py
✅ Files skipped from review due to trivial changes (1)
  • fern/pages/cpp_api/cpp-api-thrust-integrations.md

Comment thread fern/pages/cpp_api/cpp-api-data-containers.md Outdated
Comment thread fern/scripts/generate_api_reference.py Outdated
Comment thread fern/scripts/generate_api_reference.py Outdated
Comment thread fern/scripts/tests/test_generate_api_reference.py Outdated
@wence-
Contributor

wence- commented May 20, 2026

The "autogenerated" docs from the API documentation are hilariously bad. No Python object has a docstring. Almost all of the docstrings of the C++ classes have gone missing. There is no cross-linking. It looks like the approach to API code generation here is to write our own rst and Doxygen implementations. This seems insane.

@msarahan
Contributor Author

@wence- Thanks for pointing out those shortcomings. I noticed the large amount of missing content, but you're right to call out that (partially) reimplementing doxygen and other tools is a fragile foundation at best. I'm going back to the drawing board on how the old files get migrated. As I'm doing this, I am comparing the existing docs with the new ones. I will post screenshots like this for any discrepancies that are debatable.

[Screenshot 2026-05-19 at 10 18 45 PM]

Labels

doc Documentation non-breaking Non-breaking change
