LLM Obs: Move hallucination detection evaluation doc by gsvigruha · Pull Request #35309 · DataDog/documentation

gsvigruha · 2026-03-16T18:03:26Z

What does this PR do? What is the motivation?

LLM Obs: Move hallucination detection evaluation doc
Remove the hallucination limitations, which no longer apply now that this evaluation is a template
Remove large sections of the managed evaluations page that no longer make sense

Merge instructions

Merge readiness:

Ready for merge

github-actions · 2026-03-16T18:07:53Z

Preview links (active after the `build_preview` check completes)

New or renamed files

https://docs-staging.datadoghq.com/gergely.svigruha/templetized-hallu-detection/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/connect_to_account

Modified Files

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c8484e4cfa

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

content/en/llm_observability/evaluations/managed_evaluations/_index.md

chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 0afc7c05ce

content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md

content/en/llm_observability/evaluations/managed_evaluations/_index.md

chatgpt-codex-connector

💡 Codex Review

Reviewed commit: d43571c900

chatgpt-codex-connector · 2026-03-16T20:16:10Z

content/en/llm_observability/evaluations/managed_evaluations/_index.md

-{{< /tabs >}}
-
-If your LLM provider restricts IP addresses, you can obtain the required IP ranges by visiting [Datadog's IP ranges documentation][2], selecting your `Datadog Site`, pasting the `GET` URL into your browser, and copying the `webhooks` section.
+Learn more about the [compatibility requirements][2].


Preserve removed BYOK anchor target

Removing the Connect your LLM provider account section also removed the #connect-your-llm-provider-account anchor, but content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md still links to /llm_observability/evaluations/managed_evaluations#connect-your-llm-provider-account ([2]). After this change, users following that custom-evaluation setup link are dropped at the top of the managed page with no matching section, so the provider-connection step is no longer reachable from the documented flow.
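A minimal sketch of how a broken anchor like this one could be caught before merge. It assumes heading anchors are formed by lowercasing the heading text, dropping punctuation, and replacing spaces with hyphens, which only approximates what the site generator does. The function names `heading_anchors` and `missing_fragments` are hypothetical and are not part of the docs tooling.

```python
import re


def heading_anchors(markdown: str) -> set[str]:
    """Collect approximate anchor slugs for ATX headings (lines starting with #)."""
    anchors = set()
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if not match:
            continue
        slug = match.group(1).lower()
        slug = re.sub(r"[^\w\s-]", "", slug)      # drop punctuation (assumed rule)
        slug = re.sub(r"\s+", "-", slug.strip())  # spaces become hyphens
        anchors.add(slug)
    return anchors


def missing_fragments(target_markdown: str, links: list[str]) -> list[str]:
    """Return the links whose #fragment has no matching heading in the target page."""
    anchors = heading_anchors(target_markdown)
    missing = []
    for link in links:
        _, _, fragment = link.partition("#")
        if fragment and fragment not in anchors:
            missing.append(link)
    return missing


# Illustrative page bodies: the heading text comes from the review comment,
# the rest of each page is omitted.
page_before = "## Connect your LLM provider account\n"
page_after = "## Supported managed evaluations\n"
link = "/llm_observability/evaluations/managed_evaluations#connect-your-llm-provider-account"

print(missing_fragments(page_before, [link]))  # anchor still present
print(missing_fragments(page_after, [link]))   # anchor removed, link reported
```

Running a check like this against every page that links into a modified file would flag the custom-evaluation link as soon as the section heading is removed.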


joepeeples

Approved with a couple small edit suggestions, thanks!

joepeeples · 2026-03-16T20:26:48Z

content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md


+## Estimated token usage
+
+You can monitor the token usage of your LLM evaluations using [this dashboard][8].


Suggested change

-You can monitor the token usage of your LLM evaluations using [this dashboard][8].
+You can monitor the token usage of your LLM evaluations using the [LLM Evaluations Token Usage dashboard][8].

...t/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md

…_evaluations/_index.md Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>

chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 6db9d84b86

content/en/llm_observability/evaluations/managed_evaluations/_index.md

...t/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md

….com:DataDog/documentation into gergely.svigruha/templetized-hallu-detection

chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 6713bbbe86

...ent/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/connect_to_account.md

chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 282d759948

chatgpt-codex-connector · 2026-03-16T21:49:59Z

content/en/llm_observability/evaluations/managed_evaluations/_index.md

+- [Language Mismatch][3] - Flags responses that are written in a different language than the user’s input
+- [Sensitive Data Scanning][4] - Flags the presence of sensitive or regulated information in model inputs or outputs


Reconcile managed evaluation scope in this page

This new “Supported managed evaluations” list now limits managed evaluations to Language Mismatch and Sensitive Data Scanning, but the overview text in the same page still says managed evaluations include sentiment, topic relevancy, toxicity, failure to answer, and hallucination. That contradiction leaves readers with incompatible setup expectations (for example, looking for evaluations that are no longer listed as supported), so the page should be made internally consistent.


chatgpt-codex-connector

💡 Codex Review

Reviewed commit: c2a424c636

content/en/llm_observability/evaluations/evaluation_compatibility.md

* Add secret ID notes (#35272)
* add notes
* small edit
* Update MCP docs: recommend custom connectors for Claude Desktop & claude.ai (#35285)
* Update MCP docs: recommend custom connectors for Claude Desktop & claude.ai
  The local binary is no longer needed for Claude Desktop or claude.ai; both now support custom connectors with the remote MCP URL natively. Replaces the stdio/binary setup instructions with a link to the Claude help center guide.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Simplify tab title to just "Claude" to cover all Claude products
  Addresses PR feedback: custom connectors work across Claude (web), Claude Desktop, and Claude Cowork, so "Claude" covers them all.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Say "including Claude Cowork" instead of "including Claude Desktop"
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  ---------
  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Remove preview feature notice from prompt optimization (#35288)
  Removed preview feature notice for Prompt Optimization.
* [DDSQL-1503] Follow-up on dd.logs() description (#35295)
* Update dd.logs description
* Fix spacing
* [MLObs] adding clarification notes about the metrics (#35248)
* adding clairification notes about the metrics
* remove typo newline
* explain the metrics are only generated for certain keys
* [DOCS-13590] Add Fusion setup guide (#35059)
* [DOCS-13590] Add Fusion setup guide
* [DOCS-13590] Update preview callout
* [DOCS-13590] Update preview callout text
* [DOCS-13590] Add validation section
* [DOCS-13590] Add US1-FED site support banner to Oracle Fusion integration setup guide
* [DOCS-13590] Incorporate cswatt's feedback
* [DOCS-13590] Remove ORA_FND_READ_ONLY_ACCESS_ABSTRACT permission
* Remove MCP Server Preview form alert from VS Code & Cursor extension docs (#35303)
  Remove 'The Datadog MCP Server is in Preview. Complete this form to request access.' from both VS Code and Cursor tabs on the IDE plugins page.
  Made-with: Cursor
  Co-authored-by: Sumedha Mehta <sumedha.mehta@datadoghq.com>
* [DOCS-13433] Fix valid tag characters to include commas (#35249)
  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Docs13590/fusion integration ga (#35315)
* [DOCS-13590] Remove preview banner and make doc public
* [DOCS-13590] Add Oracle Fusion integration setup guide
* [DOCS-13642] Add US1-FED port restriction note to log forwarding docs (#35313)
  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Update Go Live Debugger page with eBPF limitations (#35310)
* [DOCS-12531] Update integration developers getting started guide (#34741)
* Rewrite requirements and getting-started
* Update links
* Make Vale corrections
* Apply suggestions from code review
  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
  Co-authored-by: Dominic Medina <115744456+dd-dominic@users.noreply.github.com>
  Co-authored-by: Eva Parish <eva.parish@datadoghq.com>
  ---------
  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
  Co-authored-by: Dominic Medina <115744456+dd-dominic@users.noreply.github.com>
* [DOCS-13670] Standardize buffer section in destination docs (#35267)
* [DOCS-13670] Standardize buffer section in destination docs
  Replace destination_buffer_numbered with destination_buffer shortcode.
* updates
* small edit
* small edit
* add for splunk hec
* Translation Pipeline PR (#35291)
* Translated file updates
* Translated file updates
* Translated file updates
* fix erroneously translated `tab` shortcodes
* fix malformed link syntax
  ---------
  Co-authored-by: webops-guacbot[bot] <214537265+webops-guacbot[bot]@users.noreply.github.com>
  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
* Add assets to support the Cdocs stepper (not in use yet) (#35312)
* Sketch in stepper styles
* Tweak styles
* Check off completed steps
* Flesh out example steps
* Make steps searchable
* Nudge elements
* Update example step
* Tweak stepper behavior
* Use a green checkmark circle to mark completed tasks
* Tweak button wording
* Tweak wording
* Tweak stepper line width
* Tweak appearance
* Improve focus visibility
* Improve accessibility
* Improve accessibility
* Tweak checkmark
* Tweak button text size
* Tweak loading behavior
* Button tweaks
* Tweaks
* Update demo markup
* [wip] Incorporate feedback
* Make the clicked step the active step
* Prevent step titles from being hidden under the sticky menu
* Tweak reset behavior
* Style expand/collapse buttons as links
* Improve responsiveness
* Tweak styles
* Tweak icons
* Tweak spacing
* Fix stepper icon URLs
* Tone down expand/collapse toggle styling (#35284)
  Reduce visual weight of the expand all / collapse toggle so it reads as a quiet utility control rather than competing with step titles.
  - font-size: 16px → 14px
  - font-weight: 600 → 500
  - text-transform: uppercase → none (sentence case)
  - Add subtle letter-spacing
* Tweaks
* Delete stepper demo file
* Revert changes in package.json
* Implement Codex feedback
* Fix bug
* Update assets/styles/components/_collapsible-section.scss
  Co-authored-by: StefonSimmons <57869435+StefonSimmons@users.noreply.github.com>
  ---------
  Co-authored-by: Brett Blue <84536271+brett0000FF@users.noreply.github.com>
  Co-authored-by: StefonSimmons <57869435+StefonSimmons@users.noreply.github.com>
* LLM Obs: Move hallucination detection evaluation doc (#35309)
* move hallucination doc
* tweaks
* add back screenshot
* remove usused code
* fixlinks
* Update content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md
  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
* add back account
* links
* fix title
* more fixes
  ---------
  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>

---------

Co-authored-by: May Lee <may.lee@datadoghq.com>
Co-authored-by: Reilly Wood <163153147+rgwood-dd@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Charles Jacquet <charles.jacquet@datadoghq.com>
Co-authored-by: Mariana Dutra <88353514+mariddc@users.noreply.github.com>
Co-authored-by: Xinyuan Guo <xinyuan.guo@datadoghq.com>
Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
Co-authored-by: sumedham <87997309+sumedham@users.noreply.github.com>
Co-authored-by: Sumedha Mehta <sumedha.mehta@datadoghq.com>
Co-authored-by: Rosa Trieu <107086888+rtrieu@users.noreply.github.com>
Co-authored-by: Esther Kim <esther.kim@datadoghq.com>
Co-authored-by: ajwerner <awerner32@gmail.com>
Co-authored-by: Eva Parish <eva.parish@datadoghq.com>
Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
Co-authored-by: Dominic Medina <115744456+dd-dominic@users.noreply.github.com>
Co-authored-by: webops-guacbot[bot] <214537265+webops-guacbot[bot]@users.noreply.github.com>
Co-authored-by: Jen Gilbert <jen.gilbert@datadoghq.com>
Co-authored-by: Brett Blue <84536271+brett0000FF@users.noreply.github.com>
Co-authored-by: StefonSimmons <57869435+StefonSimmons@users.noreply.github.com>
Co-authored-by: Gergely Svigruha <gsvigruha@users.noreply.github.com>

move hallucination doc

e3fe6ff

tweaks

c8484e4

gsvigruha changed the title ~~move hallucination doc~~ LLM Obs: Move hallucination detection evaluation doc Mar 16, 2026

gsvigruha marked this pull request as ready for review March 16, 2026 18:50

gsvigruha requested a review from a team as a code owner March 16, 2026 18:50

add back screenshot

05be54c

chatgpt-codex-connector bot reviewed Mar 16, 2026

content/en/llm_observability/evaluations/managed_evaluations/_index.md (outdated, resolved)

remove usused code

0afc7c0

joepeeples self-assigned this Mar 16, 2026

chatgpt-codex-connector bot reviewed Mar 16, 2026

content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md (outdated, resolved)

content/en/llm_observability/evaluations/managed_evaluations/_index.md (outdated, resolved)

fixlinks

d43571c

chatgpt-codex-connector bot reviewed Mar 16, 2026


joepeeples approved these changes Mar 16, 2026


Update content/en/llm_observability/evaluations/custom_llm_as_a_judge…

6db9d84

…_evaluations/_index.md Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>

chatgpt-codex-connector bot reviewed Mar 16, 2026

content/en/llm_observability/evaluations/managed_evaluations/_index.md (resolved)

...t/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md (resolved)

gsvigruha added 3 commits March 16, 2026 17:08

add back account

5981743

Merge branch 'gergely.svigruha/templetized-hallu-detection' of github…

003dc83

….com:DataDog/documentation into gergely.svigruha/templetized-hallu-detection

links

6713bbb

chatgpt-codex-connector bot reviewed Mar 16, 2026

...ent/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/connect_to_account.md (resolved)

fix title

282d759

chatgpt-codex-connector bot reviewed Mar 16, 2026


more fixes

c2a424c

chatgpt-codex-connector bot reviewed Mar 16, 2026

content/en/llm_observability/evaluations/evaluation_compatibility.md (resolved)

gsvigruha merged commit 2202029 into master Mar 17, 2026
19 checks passed

gsvigruha deleted the gergely.svigruha/templetized-hallu-detection branch March 17, 2026 00:21

