diff --git a/content/en/llm_observability/experiments/advanced_runs.md b/content/en/llm_observability/experiments/advanced_runs.md
index fe4e91df67e..db8b6db2671 100644
--- a/content/en/llm_observability/experiments/advanced_runs.md
+++ b/content/en/llm_observability/experiments/advanced_runs.md
@@ -1,10 +1,22 @@
 ---
 title: Advanced Experiment Runs
-description: Run experiments multiple times to account for model variability and automate experiment execution in CI/CD pipelines.
+description: Run experiments on a subset of a dataset, run them multiple times to account for model variability, and automate experiment execution in CI/CD pipelines.
 ---
 
 This page discusses advanced topics in running experiments, including [multiple experiment runs](#multiple-runs) and [setting up experiments in CI/CD](#setting-up-your-experiment-in-cicd).
 
+## Run an experiment on a subset of the dataset
+
+First, add tags to your dataset records. A tag can be a unique identifier (for example, `name:test_use_case_1`) or represent a property of the scenario (for example, `difficulty:hard`).
+
+Then, use the `tags` argument of `LLMObs.pull_dataset()` to filter the dataset down to the records you want to run the experiment on.
+
+For example:
+```python
+LLMObs.pull_dataset(dataset_name="my-dataset", tags=["env:prod", "version:1.0"])
+```
+Finally, run the experiment as usual.
+
 ## Multiple runs
 
 You can run the same experiment multiple times to account for model non-determinism. You can use the [LLM Observability Python SDK][1] or [Experiments API][2] to specify how many iterations to run; subsequently, each dataset record is executed that many times using the same tasks and evaluators.
@@ -215,4 +227,4 @@ jobs:
 
 [1]: /llm_observability/instrumentation/sdk?tab=python
 [2]: /llm_observability/experiments/api
-[3]: https://app.datadoghq.com/llm/experiments
\ No newline at end of file
+[3]: https://app.datadoghq.com/llm/experiments
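
The tag filtering added above can be pictured with a small, self-contained sketch. This is not the SDK's implementation; it only illustrates the semantics this page assumes, namely that a record is selected when it carries every tag passed to `pull_dataset()`. The function and record names below are hypothetical.

```python
# Local illustration of the assumed tag-filtering semantics of
# LLMObs.pull_dataset(..., tags=[...]): keep a record only if it carries
# every requested tag. Names here are illustrative, not part of the SDK.

def filter_records_by_tags(records, tags):
    """Return the records whose tag list contains all requested tags."""
    wanted = set(tags)
    return [r for r in records if wanted.issubset(r.get("tags", []))]

records = [
    {"input": "case 1", "tags": ["difficulty:hard", "env:prod"]},
    {"input": "case 2", "tags": ["difficulty:easy", "env:prod"]},
    {"input": "case 3", "tags": ["difficulty:hard"]},
]

# Single tag: matches case 1 and case 3.
hard_cases = filter_records_by_tags(records, ["difficulty:hard"])
print([r["input"] for r in hard_cases])  # → ['case 1', 'case 3']

# Multiple tags: a record must match all of them, so only case 1 remains.
hard_prod = filter_records_by_tags(records, ["difficulty:hard", "env:prod"])
print([r["input"] for r in hard_prod])  # → ['case 1']
```

If the SDK treats multiple tags as an OR rather than an AND, the subset would instead be the union of records matching any tag; check the behavior against your own dataset before relying on it.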