From dafdfb8433715eaafa45a1c2e8b5dc8578432ef7 Mon Sep 17 00:00:00 2001
From: Charles Jacquet
Date: Mon, 16 Mar 2026 18:01:43 -0400
Subject: [PATCH 1/3] Enhance documentation for advanced experiment runs

Updated description to specify running experiments on a subset of the dataset.
Added a new section on running experiments on a subset of the dataset with examples.
---
 .../experiments/advanced_runs.md | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/content/en/llm_observability/experiments/advanced_runs.md b/content/en/llm_observability/experiments/advanced_runs.md
index fe4e91df67e..db8b6db2671 100644
--- a/content/en/llm_observability/experiments/advanced_runs.md
+++ b/content/en/llm_observability/experiments/advanced_runs.md
@@ -1,10 +1,22 @@
 ---
 title: Advanced Experiment Runs
-description: Run experiments multiple times to account for model variability and automate experiment execution in CI/CD pipelines.
+description: Run experiments multiple times to account for model variability on a subset of the dataset, and automate experiment execution in CI/CD pipelines.
 ---
 
 This page discusses advanced topics in running experiments, including [multiple experiment runs](#multiple-runs) and [setting up experiments in CI/CD](#setting-up-your-experiment-in-cicd).
 
+## Run an Experiment on a subset of the Dataset
+
+First, add tags to your dataset records. They can be a unique identifier (e.g `name:test_use_case_1`) or represent a property of the scenario (e.g `difficulty:hard`).
+
+Then, use the `tags` argument of `LLMObs.pull_dataset()` to filter down the dataset to the records you want to run an Experiment on.
+
+Example
+```
+LLMObs.pull_dataset(dataset_name="my-dataset", tags=["env:prod", "version:1.0"])
+```
+Finally, run the Experiment as usual.
+
 ## Multiple runs
 
 You can run the same experiment multiple times to account for model non-determinism. You can use the [LLM Observability Python SDK][1] or [Experiments API][2] to specify how many iterations to run; subsequently, each dataset record is executed that many times using the same tasks and evaluators.
@@ -215,4 +227,4 @@ jobs:
 
 [1]: /llm_observability/instrumentation/sdk?tab=python
 [2]: /llm_observability/experiments/api
-[3]: https://app.datadoghq.com/llm/experiments
\ No newline at end of file
+[3]: https://app.datadoghq.com/llm/experiments

From 8f5f4d5c1480f7fc5f522b7061dd74edbef9f6a4 Mon Sep 17 00:00:00 2001
From: Charles Jacquet
Date: Tue, 17 Mar 2026 12:36:59 -0400
Subject: [PATCH 2/3] Update content/en/llm_observability/experiments/advanced_runs.md

Co-authored-by: Ida Adjivon <65119712+iadjivon@users.noreply.github.com>
---
 content/en/llm_observability/experiments/advanced_runs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/en/llm_observability/experiments/advanced_runs.md b/content/en/llm_observability/experiments/advanced_runs.md
index db8b6db2671..a2fb844fb4b 100644
--- a/content/en/llm_observability/experiments/advanced_runs.md
+++ b/content/en/llm_observability/experiments/advanced_runs.md
@@ -5,7 +5,7 @@ description: Run experiments multiple times to account for model variability on
 
 This page discusses advanced topics in running experiments, including [multiple experiment runs](#multiple-runs) and [setting up experiments in CI/CD](#setting-up-your-experiment-in-cicd).
 
-## Run an Experiment on a subset of the Dataset
+## Run an Experiment on a subset of the dataset
 
 First, add tags to your dataset records. They can be a unique identifier (e.g `name:test_use_case_1`) or represent a property of the scenario (e.g `difficulty:hard`).
 

From d837ce563178acea686eea1b252f1a599bc31527 Mon Sep 17 00:00:00 2001
From: Charles Jacquet
Date: Tue, 17 Mar 2026 12:37:06 -0400
Subject: [PATCH 3/3] Update content/en/llm_observability/experiments/advanced_runs.md

Co-authored-by: Ida Adjivon <65119712+iadjivon@users.noreply.github.com>
---
 content/en/llm_observability/experiments/advanced_runs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/en/llm_observability/experiments/advanced_runs.md b/content/en/llm_observability/experiments/advanced_runs.md
index a2fb844fb4b..92a93fe6795 100644
--- a/content/en/llm_observability/experiments/advanced_runs.md
+++ b/content/en/llm_observability/experiments/advanced_runs.md
@@ -7,7 +7,7 @@ This page discusses advanced topics in running experiments, including [multiple
 
 ## Run an Experiment on a subset of the dataset
 
-First, add tags to your dataset records. They can be a unique identifier (e.g `name:test_use_case_1`) or represent a property of the scenario (e.g `difficulty:hard`).
+First, add tags to your dataset records. These tags can be unique identifiers (for example, `name:test_use_case_1`) or represent properties of the scenario (for example, `difficulty:hard`).
 
 Then, use the `tags` argument of `LLMObs.pull_dataset()` to filter down the dataset to the records you want to run an Experiment on.
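
Note on the tag filter described in this patch: the sketch below illustrates, in plain Python, what selecting a subset of records by tag could look like. It is not the SDK's implementation and does not call `LLMObs.pull_dataset()`. The record layout, the `select_by_tags` helper, and the sample data are all hypothetical, and the patch does not state whether multiple tags are combined with AND or OR; the sketch assumes a record must carry every requested tag.

```python
from typing import Dict, Iterable, List

# Hypothetical in-memory records. Real dataset records live in Datadog and
# would be fetched with LLMObs.pull_dataset(), which is not called here.
RECORDS: List[Dict] = [
    {"input": "q1", "tags": ["name:test_use_case_1", "difficulty:hard"]},
    {"input": "q2", "tags": ["difficulty:easy"]},
    {"input": "q3", "tags": ["difficulty:hard"]},
]


def select_by_tags(records: Iterable[Dict], tags: List[str]) -> List[Dict]:
    """Keep records that carry every requested tag (assumed AND semantics)."""
    wanted = set(tags)
    return [r for r in records if wanted.issubset(r.get("tags", []))]


# A property tag selects a group of records.
hard = select_by_tags(RECORDS, ["difficulty:hard"])

# Adding a unique identifier tag narrows the selection to one record.
single = select_by_tags(RECORDS, ["name:test_use_case_1", "difficulty:hard"])
```

Under these assumptions, a property tag such as `difficulty:hard` picks out a group of scenarios, while a unique identifier such as `name:test_use_case_1` isolates a single record, which matches the two tagging styles the new section describes.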