diff --git a/docs/examples.md b/docs/examples.md
index 045d2ae..4d7dca8 100644
--- a/docs/examples.md
+++ b/docs/examples.md
@@ -8,4 +8,5 @@ Generated documentation for real-world Nextflow pipelines.
 | rnaseq-nf | Simple RNA-seq pipeline | [HTML](examples/rnaseq-nf/index.html) | [Markdown](examples/rnaseq-nf/markdown/index.md) | [Table](examples/rnaseq-nf/table/README.md) | [JSON](examples/rnaseq-nf/json/pipeline-api.json) | [YAML](examples/rnaseq-nf/yaml/pipeline-api.yaml) |
 | rnavar | nf-core/rnavar — RNA variant calling | [HTML](examples/rnavar/index.html) | [Markdown](examples/rnavar/markdown/index.md) | [Table](examples/rnavar/table/README.md) | [JSON](examples/rnavar/json/pipeline-api.json) | [YAML](examples/rnavar/yaml/pipeline-api.yaml) |
 | sarek | nf-core/sarek — variant calling & annotation | [HTML](examples/sarek/index.html) | [Markdown](examples/sarek/markdown/index.md) | [Table](examples/sarek/table/README.md) | [JSON](examples/sarek/json/pipeline-api.json) | [YAML](examples/sarek/yaml/pipeline-api.yaml) |
+| nf-fgsv | Fulcrum Genomics structural variant calling | [HTML](examples/nf-fgsv/index.html) | [Markdown](examples/nf-fgsv/markdown/index.md) | [Table](examples/nf-fgsv/table/README.md) | [JSON](examples/nf-fgsv/json/pipeline-api.json) | [YAML](examples/nf-fgsv/yaml/pipeline-api.yaml) |
 | wf-metagenomics | Oxford Nanopore metagenomics workflow | [HTML](examples/wf-metagenomics/index.html) | [Markdown](examples/wf-metagenomics/markdown/index.md) | [Table](examples/wf-metagenomics/table/README.md) | [JSON](examples/wf-metagenomics/json/pipeline-api.json) | [YAML](examples/wf-metagenomics/yaml/pipeline-api.yaml) |
diff --git a/docs/examples/nf-fgsv/index.html b/docs/examples/nf-fgsv/index.html
new file mode 100644
index 0000000..0b8b4e3
--- /dev/null
+++ b/docs/examples/nf-fgsv/index.html
@@ -0,0 +1,3218 @@
+ + + + + + nf-fgsv workflow parameters + + + + + + + +
+ + + + + + + Organization avatar + +

nf-fgsv workflow parameters

+ + v0.1.0 + +
+ + +
+ + + + + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + + +
+ + + + +
+

nf-fgsv workflow parameters

+ + +

Nextflow workflow for running fgsv.

+ + + +
+
Note

This repository is primarily for testing the latest Nextflow features and the workflow is relatively simple.

+

This is a Nextflow workflow for running fgsv on a BAM file to gather evidence for structural variation via breakpoint detection.

+

Set up Environment

+

Make sure Pixi and Docker are installed.

+

The environment for this analysis is in pixi.toml and is named nf-fgsv.

+

To install:

+
pixi install
+
+ +

Run the Workflow

+

To save on typing, a pixi task is available which aliases nextflow run main.nf.

+
pixi run \
+    nf-workflow \
+        -profile "local,docker" \
+        --input tests/data/basic_input.tsv
+
+ +

Available Execution Profiles

+

Several default profiles are available:

+
    +
  • local limits the resources used by the workflow to (hopefully) reasonable levels
  • +
  • docker specifies docker containers should be used for process execution
  • +
  • linux adds --user root to docker runOptions
  • +
+
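As a sketch, profiles are combined by passing a single comma-separated `-profile` value (standard Nextflow behaviour; this exact three-profile combination is illustrative rather than taken from the README, which shows `local,docker`):

```console
pixi run \
    nf-workflow \
        -profile "local,docker,linux" \
        --input tests/data/basic_input.tsv
```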

Inputs

+

A full description of input parameters is available via the workflow `--help` parameter: `pixi run nf-workflow --help`.

+

The required columns in the --input samplesheet are:

+ + + + + + + + + + + + + + + + + + + + +
FieldTypeDescription
sampleString, no whitespaceSample identifier
bamAbsolute pathPath to the BAM file for this sample (may be an AWS S3 path)
+
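For illustration, a minimal tab-separated samplesheet with these two required columns might look as follows (sample names and BAM paths are hypothetical):

```tsv
sample	bam
sampleA	/data/bams/sampleA.bam
sampleB	s3://my-bucket/bams/sampleB.bam
```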

Parameter files

+

If using more than a few parameters, consider saving them in a YAML format file (e.g. tests/integration/params.yml).

+
pixi run \
+    nf-workflow \
+        -profile "local,docker" \
+        -params-file my_params.yml
+
+ +
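As a sketch, a `my_params.yml` for this workflow could hold its documented `--input` parameter (note that workflow parameters are written without the leading `--` in a params file):

```yaml
input: tests/data/basic_input.tsv
```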

Outputs

+

The output directory can be specified using the -output-dir Nextflow parameter. +The default output directory is results/. +-output-dir cannot be specified in a params.yml file, because it is a Nextflow parameter rather than a workflow parameter. +It must be specified on the command line or in a nextflow.config file.

+
pixi run \
+    nf-workflow \
+        -profile "local,docker" \
+        --input tests/data/basic_input.tsv \
+        -output-dir results
+
+ +
+Click to toggle output directory structure + + + +
results
+└── {sample_name}
+    ├── {sample_name}_sorted.bam                # Coordinate-sorted BAM file
+    ├── {sample_name}_svpileup.txt              # SvPileup breakpoint candidates
+    ├── {sample_name}_svpileup.bam              # BAM with SV tags from SvPileup
+    ├── {sample_name}_svpileup.aggregate.txt    # Aggregated/merged breakpoint pileups
+    └── {sample_name}_svpileup.aggregate.bedpe  # Aggregated pileups in BEDPE format
+
+ + + +
+ +

Output File Descriptions

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
FileDescription
*_sorted.bamInput BAM sorted by genomic coordinates using samtools sort
*_svpileup.txtCandidate structural variant breakpoints identified by fgsv SvPileup
*_svpileup.bamBAM file with SV-related tags added by SvPileup
*_svpileup.aggregate.txtMerged breakpoint pileups from fgsv AggregateSvPileup
*_svpileup.aggregate.bedpeAggregated pileups converted to BEDPE format
+

Contributing

+

See our Contributing Guide for development and testing guidelines.

+
+ +

Authors

+
    + +
  • Fulcrum Genomics
  • + +
+ +
+ + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + \ No newline at end of file diff --git a/docs/examples/nf-fgsv/json/pipeline-api.json b/docs/examples/nf-fgsv/json/pipeline-api.json new file mode 100644 index 0000000..ed90cb0 --- /dev/null +++ b/docs/examples/nf-fgsv/json/pipeline-api.json @@ -0,0 +1,153 @@ +{ + "pipeline": { + "name": "nf-fgsv workflow parameters", + "description": "Nextflow workflow for running fgsv.", + "version": "0.1.0", + "repository": "https://github.com/fulcrumgenomics/nf-fgsv", + "authors": [ + "Fulcrum Genomics" + ], + "readme_content": "> [!NOTE] \n> This repository is primarily for testing the latest Nextflow features and the workflow is relatively simple.\n\nThis is a Nextflow workflow for running [fgsv](https://github.com/fulcrumgenomics/fgsv) on a BAM file to gather evidence for structural variation via breakpoint detection.\n\n## Set up Environment\n\nMake sure [Pixi](https://pixi.sh/latest/#installation) and [Docker](https://docs.docker.com/engine/install/) are installed.\n\nThe environment for this analysis is in [pixi.toml](pixi.toml) and is named `nf-fgsv`.\n\nTo install:\n\n```console\npixi install\n```\n\n## Run the Workflow\n\nTo save on typing, a pixi task is available which aliases `nextflow run main.nf`.\n\n```console\npixi run \\\n nf-workflow \\\n -profile \"local,docker\" \\\n --input tests/data/basic_input.tsv\n```\n\n## Available Execution Profiles\n\nSeveral default profiles are available:\n\n* `local` limits the resources used by the workflow to (hopefully) reasonable levels\n* `docker` specifies docker containers should be used for process execution\n* `linux` adds `--user root` to docker `runOptions`\n\n## Inputs\n\nA full description of input parameters is available using the workflow `--help` parameter, `pixi run nf-workflow --help`.\n\nThe required columns in the `--input` samplesheet are:\n\n| Field | Type | Description |\n|-------|------|-------------|\n| `sample` | String, no whitespace | Sample identifier |\n| `bam` | Absolute path | Path to the 
BAM file for this sample (may be an AWS S3 path) |\n\n### Parameter files\n\nIf using more than a few parameters, consider saving them in a YAML format file (e.g. [tests/integration/params.yml](tests/integration/params.yml)).\n\n```console\npixi run \\\n nf-workflow \\\n -profile \"local,docker\" \\\n -params-file my_params.yml\n```\n\n## Outputs\n\nThe output directory can be specified using the `-output-dir` Nextflow parameter.\nThe default output directory is `results/`.\n`-output-dir` cannot be specified in a `params.yml` file, because it is a Nextflow parameter rather than a workflow parameter.\nIt must be specified on the command line or in a `nextflow.config` file.\n\n```console\npixi run \\\n nf-workflow \\\n -profile \"local,docker\" \\\n --input tests/data/basic_input.tsv \\\n -output-dir results\n```\n\n
\nClick to toggle output directory structure\n\n\n```console\nresults\n└── {sample_name}\n ├── {sample_name}_sorted.bam # Coordinate-sorted BAM file\n ├── {sample_name}_svpileup.txt # SvPileup breakpoint candidates\n ├── {sample_name}_svpileup.bam # BAM with SV tags from SvPileup\n ├── {sample_name}_svpileup.aggregate.txt # Aggregated/merged breakpoint pileups\n └── {sample_name}_svpileup.aggregate.bedpe # Aggregated pileups in BEDPE format\n```\n\n
\n\n### Output File Descriptions\n\n| File | Description |\n|------|-------------|\n| `*_sorted.bam` | Input BAM sorted by genomic coordinates using samtools sort |\n| `*_svpileup.txt` | Candidate structural variant breakpoints identified by [fgsv SvPileup](https://github.com/fulcrumgenomics/fgsv) |\n| `*_svpileup.bam` | BAM file with SV-related tags added by SvPileup |\n| `*_svpileup.aggregate.txt` | Merged breakpoint pileups from [fgsv AggregateSvPileup](https://github.com/fulcrumgenomics/fgsv) |\n| `*_svpileup.aggregate.bedpe` | Aggregated pileups converted to [BEDPE format](https://bedtools.readthedocs.io/en/latest/content/general-usage.html#bedpe-format) |\n\n## Contributing\n\nSee our [Contributing Guide](docs/CONTRIBUTING.md) for development and testing guidelines.\n\n\n" + }, + "inputs": [ + { + "name": "input", + "type": "string", + "description": "Path to tab-separated file containing information about the samples in the experiment.", + "required": true, + "format": "file-path", + "pattern": ".*.tsv$", + "group": "Main workflow parameters" + } + ], + "config_params": [], + "workflows": [ + { + "name": "", + "docstring": "", + "file": "main.nf", + "line": 19, + "end_line": 42, + "inputs": [], + "outputs": [], + "calls": [], + "is_entry": true, + "source_url": "https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/main.nf#L19-L42" + } + ], + "processes": [ + { + "name": "AGGREGATE_SV_PILEUP_TO_BEDPE", + "docstring": "Convert aggregated SvPileup output to BEDPE format using fgsv\nAggregateSvPileupToBedPE.", + "file": "modules/aggregate_sv_pileup_to_bedpe.nf", + "line": 11, + "end_line": 31, + "inputs": [ + { + "name": "val(meta), path(txt)", + "type": "tuple", + "description": "`meta`: Map containing sample information (must include 'id'); `txt`: Aggregated SvPileup output file", + "qualifier": "Tuple" + } + ], + "outputs": [ + { + "name": "val(meta), file(\"*_svpileup.aggregate.bedpe\")", + "type": "tuple", + 
"description": "Tuple of meta and BEDPE format output file", + "emit": "bedpe" + } + ], + "directives": {}, + "source_url": "https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/aggregate_sv_pileup_to_bedpe.nf#L11-L31" + }, + { + "name": "AGGREGATE_SV_PILEUP", + "docstring": "Aggregate and merge pileups that are likely to support the same breakpoint\nusing fgsv AggregateSvPileup.", + "file": "modules/aggregate_sv_pileup.nf", + "line": 12, + "end_line": 33, + "inputs": [ + { + "name": "val(meta), path(bam), path(txt)", + "type": "tuple", + "description": "`meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file; `txt`: SvPileup breakpoint output file", + "qualifier": "Tuple" + } + ], + "outputs": [ + { + "name": "val(meta), file(\"*_svpileup.aggregate.txt\")", + "type": "tuple", + "description": "Tuple of meta and aggregated SvPileup output file", + "emit": "txt" + } + ], + "directives": {}, + "source_url": "https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/aggregate_sv_pileup.nf#L12-L33" + }, + { + "name": "COORDINATE_SORT", + "docstring": "Sort a BAM file by genomic coordinates using samtools sort.", + "file": "modules/coordinate_sort.nf", + "line": 10, + "end_line": 31, + "inputs": [ + { + "name": "val(meta), path(bam)", + "type": "tuple", + "description": "`meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file to be sorted", + "qualifier": "Tuple" + } + ], + "outputs": [ + { + "name": "val(meta), file(\"*_sorted.bam\")", + "type": "tuple", + "description": "Tuple of meta and coordinate-sorted BAM file", + "emit": "bam" + } + ], + "directives": {}, + "source_url": "https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/coordinate_sort.nf#L10-L31" + }, + { + "name": "SV_PILEUP", + "docstring": "Detect structural variant evidence from a BAM file using fgsv SvPileup.", + "file": 
"modules/sv_pileup.nf", + "line": 11, + "end_line": 33, + "inputs": [ + { + "name": "val(meta), path(bam)", + "type": "tuple", + "description": "`meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file", + "qualifier": "Tuple" + } + ], + "outputs": [ + { + "name": "val(meta), file(\"*_svpileup.txt\")", + "type": "tuple", + "description": "Tuple of meta and SvPileup breakpoint output file", + "emit": "txt" + }, + { + "name": "val(meta), file(\"*_svpileup.bam\")", + "type": "tuple", + "description": "Tuple of meta and SvPileup BAM file", + "emit": "bam" + } + ], + "directives": {}, + "source_url": "https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/sv_pileup.nf#L11-L33" + } + ], + "functions": [], + "generated_by": { + "nf_docs_version": "0.2.0", + "generated_at": "2026-03-03T19:13:49.416189+00:00", + "nextflow_url": "https://nextflow.io", + "nf_docs_url": "https://github.com/ewels/nf-docs" + } +} \ No newline at end of file diff --git a/docs/examples/nf-fgsv/markdown/index.md b/docs/examples/nf-fgsv/markdown/index.md new file mode 100644 index 0000000..88cd9cf --- /dev/null +++ b/docs/examples/nf-fgsv/markdown/index.md @@ -0,0 +1,30 @@ +# nf-fgsv workflow parameters + +**Version:** 0.1.0 + +Nextflow workflow for running fgsv. + +## Documentation + +- [Pipeline Inputs](inputs.md) - Input parameters and options +- [Workflows](workflows.md) - Pipeline workflows +- [Processes](processes.md) - Process definitions + +## Summary + +- **Inputs:** 1 parameters +- **Workflows:** 1 +- **Processes:** 4 + +## Authors + +- Fulcrum Genomics + +## Links + +- [Repository](https://github.com/fulcrumgenomics/nf-fgsv) + +--- + +*This pipeline was built with [Nextflow](https://nextflow.io). 
+Documentation generated by [nf-docs](https://github.com/ewels/nf-docs) v0.2.0 on 2026-03-03 19:14:06 UTC.* diff --git a/docs/examples/nf-fgsv/markdown/inputs.md b/docs/examples/nf-fgsv/markdown/inputs.md new file mode 100644 index 0000000..c3bb522 --- /dev/null +++ b/docs/examples/nf-fgsv/markdown/inputs.md @@ -0,0 +1,20 @@ +# Pipeline Inputs + +This page documents all input parameters for the pipeline. + +## Main workflow parameters + +### `--input` {#input} + +**Type:** `string` | **Required** | **Format:** `file-path` + +Path to tab-separated file containing information about the samples in the experiment. + +**Pattern:** `.*.tsv$` + + + +--- + +*This pipeline was built with [Nextflow](https://nextflow.io). +Documentation generated by [nf-docs](https://github.com/ewels/nf-docs) v0.2.0 on 2026-03-03 19:14:06 UTC.* diff --git a/docs/examples/nf-fgsv/markdown/processes.md b/docs/examples/nf-fgsv/markdown/processes.md new file mode 100644 index 0000000..693201f --- /dev/null +++ b/docs/examples/nf-fgsv/markdown/processes.md @@ -0,0 +1,94 @@ +# Processes + +This page documents all processes in the pipeline. + +## Contents + +- [AGGREGATE_SV_PILEUP_TO_BEDPE](#aggregate-sv-pileup-to-bedpe) +- [AGGREGATE_SV_PILEUP](#aggregate-sv-pileup) +- [COORDINATE_SORT](#coordinate-sort) +- [SV_PILEUP](#sv-pileup) + +## AGGREGATE_SV_PILEUP_TO_BEDPE {#aggregate-sv-pileup-to-bedpe} + +*Defined in `modules/aggregate_sv_pileup_to_bedpe.nf:11`* + +Convert aggregated SvPileup output to BEDPE format using fgsv +AggregateSvPileupToBedPE. 
+ +### Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(txt)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `txt`: Aggregated SvPileup output file | + +### Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| `val(meta), file("*_svpileup.aggregate.bedpe")` | `tuple` | `bedpe` | Tuple of meta and BEDPE format output file | + + +## AGGREGATE_SV_PILEUP {#aggregate-sv-pileup} + +*Defined in `modules/aggregate_sv_pileup.nf:12`* + +Aggregate and merge pileups that are likely to support the same breakpoint +using fgsv AggregateSvPileup. + +### Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(bam), path(txt)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file; `txt`: SvPileup breakpoint output file | + +### Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| `val(meta), file("*_svpileup.aggregate.txt")` | `tuple` | `txt` | Tuple of meta and aggregated SvPileup output file | + + +## COORDINATE_SORT {#coordinate-sort} + +*Defined in `modules/coordinate_sort.nf:10`* + +Sort a BAM file by genomic coordinates using samtools sort. + +### Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(bam)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file to be sorted | + +### Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| `val(meta), file("*_sorted.bam")` | `tuple` | `bam` | Tuple of meta and coordinate-sorted BAM file | + + +## SV_PILEUP {#sv-pileup} + +*Defined in `modules/sv_pileup.nf:11`* + +Detect structural variant evidence from a BAM file using fgsv SvPileup. 
+ +### Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(bam)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file | + +### Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| `val(meta), file("*_svpileup.txt")` | `tuple` | `txt` | Tuple of meta and SvPileup breakpoint output file | +| `val(meta), file("*_svpileup.bam")` | `tuple` | `bam` | Tuple of meta and SvPileup BAM file | + + +--- + +*This pipeline was built with [Nextflow](https://nextflow.io). +Documentation generated by [nf-docs](https://github.com/ewels/nf-docs) v0.2.0 on 2026-03-03 19:14:06 UTC.* diff --git a/docs/examples/nf-fgsv/markdown/workflows.md b/docs/examples/nf-fgsv/markdown/workflows.md new file mode 100644 index 0000000..a14b179 --- /dev/null +++ b/docs/examples/nf-fgsv/markdown/workflows.md @@ -0,0 +1,19 @@ +# Workflows + +This page documents all workflows in the pipeline. + +## Contents + +- [(entry)](#entry) *(entry point)* + +## (entry) {#entry} + +**Entry workflow** + +*Defined in `main.nf:19`* + + +--- + +*This pipeline was built with [Nextflow](https://nextflow.io). +Documentation generated by [nf-docs](https://github.com/ewels/nf-docs) v0.2.0 on 2026-03-03 19:14:06 UTC.* diff --git a/docs/examples/nf-fgsv/table/README.md b/docs/examples/nf-fgsv/table/README.md new file mode 100644 index 0000000..ab74682 --- /dev/null +++ b/docs/examples/nf-fgsv/table/README.md @@ -0,0 +1,83 @@ + +# nf-fgsv workflow parameters + +**Version:** 0.1.0 · Nextflow workflow for running fgsv. + +## Inputs + +### Main workflow parameters + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| `--input` | Path to tab-separated file containing information about the samples in the experiment. 
| `string` | n/a | yes | + +## Workflows + +| Name | Description | Entry | +|------|-------------|:-----:| +| *(entry)* | n/a | yes | + +## Processes + +| Name | Description | +|------|-------------| +| `AGGREGATE_SV_PILEUP_TO_BEDPE` | Convert aggregated SvPileup output to BEDPE format using fgsv AggregateSvPileupToBedPE. | +| `AGGREGATE_SV_PILEUP` | Aggregate and merge pileups that are likely to support the same breakpoint using fgsv AggregateSvPileup. | +| `COORDINATE_SORT` | Sort a BAM file by genomic coordinates using samtools sort. | +| `SV_PILEUP` | Detect structural variant evidence from a BAM file using fgsv SvPileup. | + +### `AGGREGATE_SV_PILEUP_TO_BEDPE` Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(txt)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `txt`: Aggregated SvPileup output file | + +### `AGGREGATE_SV_PILEUP_TO_BEDPE` Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| `val(meta), file("*_svpileup.aggregate.bedpe")` | `tuple` | `bedpe` | Tuple of meta and BEDPE format output file | + +### `AGGREGATE_SV_PILEUP` Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(bam), path(txt)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file; `txt`: SvPileup breakpoint output file | + +### `AGGREGATE_SV_PILEUP` Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| `val(meta), file("*_svpileup.aggregate.txt")` | `tuple` | `txt` | Tuple of meta and aggregated SvPileup output file | + +### `COORDINATE_SORT` Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(bam)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file to be sorted | + +### `COORDINATE_SORT` Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| 
`val(meta), file("*_sorted.bam")` | `tuple` | `bam` | Tuple of meta and coordinate-sorted BAM file | + +### `SV_PILEUP` Inputs + +| Name | Type | Description | +|------|------|-------------| +| `val(meta), path(bam)` | `tuple` | `meta`: Map containing sample information (must include 'id'); `bam`: Input BAM file | + +### `SV_PILEUP` Outputs + +| Name | Type | Emit | Description | +|------|------|------|-------------| +| `val(meta), file("*_svpileup.txt")` | `tuple` | `txt` | Tuple of meta and SvPileup breakpoint output file | +| `val(meta), file("*_svpileup.bam")` | `tuple` | `bam` | Tuple of meta and SvPileup BAM file | + +--- + +*This pipeline was built with [Nextflow](https://nextflow.io). +Documentation generated by [nf-docs](https://github.com/ewels/nf-docs) v0.2.0 on 2026-03-03 22:49:08 UTC.* + + diff --git a/docs/examples/nf-fgsv/yaml/pipeline-api.yaml b/docs/examples/nf-fgsv/yaml/pipeline-api.yaml new file mode 100644 index 0000000..b7a24db --- /dev/null +++ b/docs/examples/nf-fgsv/yaml/pipeline-api.yaml @@ -0,0 +1,144 @@ +pipeline: + name: nf-fgsv workflow parameters + description: Nextflow workflow for running fgsv. 
+ version: 0.1.0 + repository: https://github.com/fulcrumgenomics/nf-fgsv + authors: + - Fulcrum Genomics + readme_content: "> [!NOTE] \n> This repository is primarily for testing the latest Nextflow features\ + \ and the workflow is relatively simple.\n\nThis is a Nextflow workflow for running [fgsv](https://github.com/fulcrumgenomics/fgsv)\ + \ on a BAM file to gather evidence for structural variation via breakpoint detection.\n\n## Set up\ + \ Environment\n\nMake sure [Pixi](https://pixi.sh/latest/#installation) and [Docker](https://docs.docker.com/engine/install/)\ + \ are installed.\n\nThe environment for this analysis is in [pixi.toml](pixi.toml) and is named `nf-fgsv`.\n\ + \nTo install:\n\n```console\npixi install\n```\n\n## Run the Workflow\n\nTo save on typing, a pixi\ + \ task is available which aliases `nextflow run main.nf`.\n\n```console\npixi run \\\n nf-workflow\ + \ \\\n -profile \"local,docker\" \\\n --input tests/data/basic_input.tsv\n```\n\n##\ + \ Available Execution Profiles\n\nSeveral default profiles are available:\n\n* `local` limits the\ + \ resources used by the workflow to (hopefully) reasonable levels\n* `docker` specifies docker containers\ + \ should be used for process execution\n* `linux` adds `--user root` to docker `runOptions`\n\n##\ + \ Inputs\n\nA full description of input parameters is available using the workflow `--help` parameter,\ + \ `pixi run nf-workflow --help`.\n\nThe required columns in the `--input` samplesheet are:\n\n| Field\ + \ | Type | Description |\n|-------|------|-------------|\n| `sample` | String, no whitespace | Sample\ + \ identifier |\n| `bam` | Absolute path | Path to the BAM file for this sample (may be an AWS S3 path)\ + \ |\n\n### Parameter files\n\nIf using more than a few parameters, consider saving them in a YAML\ + \ format file (e.g. 
[tests/integration/params.yml](tests/integration/params.yml)).\n\n```console\n\ + pixi run \\\n nf-workflow \\\n -profile \"local,docker\" \\\n -params-file my_params.yml\n\ + ```\n\n## Outputs\n\nThe output directory can be specified using the `-output-dir` Nextflow parameter.\n\ + The default output directory is `results/`.\n`-output-dir` cannot be specified in a `params.yml` file,\ + \ because it is a Nextflow parameter rather than a workflow parameter.\nIt must be specified on the\ + \ command line or in a `nextflow.config` file.\n\n```console\npixi run \\\n nf-workflow \\\n \ + \ -profile \"local,docker\" \\\n --input tests/data/basic_input.tsv \\\n -output-dir\ + \ results\n```\n\n
\nClick to toggle output directory structure\n\n\n```console\n\ + results\n└── {sample_name}\n ├── {sample_name}_sorted.bam # Coordinate-sorted BAM\ + \ file\n ├── {sample_name}_svpileup.txt # SvPileup breakpoint candidates\n ├──\ + \ {sample_name}_svpileup.bam # BAM with SV tags from SvPileup\n ├── {sample_name}_svpileup.aggregate.txt\ + \ # Aggregated/merged breakpoint pileups\n └── {sample_name}_svpileup.aggregate.bedpe # Aggregated\ + \ pileups in BEDPE format\n```\n\n
\n\n### Output File Descriptions\n\n| File | Description\ + \ |\n|------|-------------|\n| `*_sorted.bam` | Input BAM sorted by genomic coordinates using samtools\ + \ sort |\n| `*_svpileup.txt` | Candidate structural variant breakpoints identified by [fgsv SvPileup](https://github.com/fulcrumgenomics/fgsv)\ + \ |\n| `*_svpileup.bam` | BAM file with SV-related tags added by SvPileup |\n| `*_svpileup.aggregate.txt`\ + \ | Merged breakpoint pileups from [fgsv AggregateSvPileup](https://github.com/fulcrumgenomics/fgsv)\ + \ |\n| `*_svpileup.aggregate.bedpe` | Aggregated pileups converted to [BEDPE format](https://bedtools.readthedocs.io/en/latest/content/general-usage.html#bedpe-format)\ + \ |\n\n## Contributing\n\nSee our [Contributing Guide](docs/CONTRIBUTING.md) for development and testing\ + \ guidelines.\n\n\n" +inputs: +- name: input + type: string + description: Path to tab-separated file containing information about the samples in the experiment. + required: true + format: file-path + pattern: .*.tsv$ + group: Main workflow parameters +config_params: [] +workflows: +- name: '' + docstring: '' + file: main.nf + line: 19 + end_line: 42 + inputs: [] + outputs: [] + calls: [] + is_entry: true + source_url: https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/main.nf#L19-L42 +processes: +- name: AGGREGATE_SV_PILEUP_TO_BEDPE + docstring: 'Convert aggregated SvPileup output to BEDPE format using fgsv + + AggregateSvPileupToBedPE.' 
+ file: modules/aggregate_sv_pileup_to_bedpe.nf + line: 11 + end_line: 31 + inputs: + - name: val(meta), path(txt) + type: tuple + description: '`meta`: Map containing sample information (must include ''id''); `txt`: Aggregated SvPileup + output file' + qualifier: Tuple + outputs: + - name: val(meta), file("*_svpileup.aggregate.bedpe") + type: tuple + description: Tuple of meta and BEDPE format output file + emit: bedpe + directives: {} + source_url: https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/aggregate_sv_pileup_to_bedpe.nf#L11-L31 +- name: AGGREGATE_SV_PILEUP + docstring: 'Aggregate and merge pileups that are likely to support the same breakpoint + + using fgsv AggregateSvPileup.' + file: modules/aggregate_sv_pileup.nf + line: 12 + end_line: 33 + inputs: + - name: val(meta), path(bam), path(txt) + type: tuple + description: '`meta`: Map containing sample information (must include ''id''); `bam`: Input BAM file; + `txt`: SvPileup breakpoint output file' + qualifier: Tuple + outputs: + - name: val(meta), file("*_svpileup.aggregate.txt") + type: tuple + description: Tuple of meta and aggregated SvPileup output file + emit: txt + directives: {} + source_url: https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/aggregate_sv_pileup.nf#L12-L33 +- name: COORDINATE_SORT + docstring: Sort a BAM file by genomic coordinates using samtools sort. 
+ file: modules/coordinate_sort.nf + line: 10 + end_line: 31 + inputs: + - name: val(meta), path(bam) + type: tuple + description: '`meta`: Map containing sample information (must include ''id''); `bam`: Input BAM file + to be sorted' + qualifier: Tuple + outputs: + - name: val(meta), file("*_sorted.bam") + type: tuple + description: Tuple of meta and coordinate-sorted BAM file + emit: bam + directives: {} + source_url: https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/coordinate_sort.nf#L10-L31 +- name: SV_PILEUP + docstring: Detect structural variant evidence from a BAM file using fgsv SvPileup. + file: modules/sv_pileup.nf + line: 11 + end_line: 33 + inputs: + - name: val(meta), path(bam) + type: tuple + description: '`meta`: Map containing sample information (must include ''id''); `bam`: Input BAM file' + qualifier: Tuple + outputs: + - name: val(meta), file("*_svpileup.txt") + type: tuple + description: Tuple of meta and SvPileup breakpoint output file + emit: txt + - name: val(meta), file("*_svpileup.bam") + type: tuple + description: Tuple of meta and SvPileup BAM file + emit: bam + directives: {} + source_url: https://github.com/fulcrumgenomics/nf-fgsv/blob/73e7adf5d96a2031f89519bbb378d8e19954aeb7/modules/sv_pileup.nf#L11-L33 +functions: [] diff --git a/src/nf_docs/extractor.py b/src/nf_docs/extractor.py index fed260c..e4224df 100644 --- a/src/nf_docs/extractor.py +++ b/src/nf_docs/extractor.py @@ -38,7 +38,13 @@ WorkflowInput, WorkflowOutput, ) -from nf_docs.nf_parser import parse_process_hover, parse_workflow_hover +from nf_docs.nf_parser import ( + RETURN_KEY_PREFIX, + RETURN_KEY_UNNAMED, + enrich_outputs_from_source, + parse_process_hover, + parse_workflow_hover, +) from nf_docs.progress import ( ExtractionPhase, ProgressCallbackType, @@ -586,7 +592,15 @@ def _process_symbol( # Determine what kind of symbol this is based on parsed type or LSP kind if symbol_type == "process" or kind == SymbolKind.METHOD: 
process = self._create_process_from_signature( - name, signature, docstring, relative_path, line + 1, end_line + 1, source_url + name, + signature, + docstring, + relative_path, + line + 1, + end_line + 1, + source_url, + source_path=file_path, + param_docs=param_docs, ) if process and not any(p.name == process.name for p in pipeline.processes): # Apply meta.yml data if available (for modules) @@ -638,11 +652,40 @@ def _create_process_from_signature( line: int, end_line: int = 0, source_url: str = "", + source_path: Path | None = None, + param_docs: dict[str, str] | None = None, ) -> Process | None: - """Create a Process by parsing the LSP signature.""" + """Create a Process by parsing the LSP signature. + + When the LSP hover includes Groovydoc ``@param`` / ``@return`` tags, + their descriptions are applied to the matching inputs and outputs. + If the LSP returns no docstring (common with typed Nextflow syntax), + the Groovydoc is parsed directly from the ``.nf`` source file. + """ + if param_docs is None: + param_docs = {} + + # Read the source file once — shared by Groovydoc parsing and output enrichment. + source_text: str | None = None + if source_path: + try: + source_text = source_path.read_text(encoding="utf-8") + except OSError as e: + logger.debug(f"Could not read source file {source_path}: {e}") + + # If the LSP returned no param docs, try to parse them from the source file. + # The LSP may return the free-text docstring but strip @param/@return tags, + # or (for typed processes) return no docstring at all. 
+ if source_text is not None and not param_docs: + source_docstring, source_params = _parse_groovydoc_from_source(source_text, name) + if source_params: + param_docs = source_params + if not docstring and source_docstring: + docstring = source_docstring + process = Process( name=name, - docstring=docstring, # Actual Groovydoc documentation + docstring=docstring, file=file_path, line=line, end_line=end_line, @@ -650,22 +693,38 @@ def _create_process_from_signature( ) # Use the nf_parser to extract inputs/outputs from the signature - # The signature is in the format: - # process NAME { - # input: - # tuple val(meta), path(reads) - # - # output: - # tuple val(meta), path("*.html"), emit: html - # } parsed = parse_process_hover(f"```nextflow\n{signature}\n```") if parsed: + # Enrich bare-name outputs from the source file when the LSP + # only returns emit names (common with typed Nextflow syntax) + outputs = parsed.outputs + if source_text is not None and any(not o.type for o in outputs): + outputs = enrich_outputs_from_source(outputs, source_text, name) + for inp in parsed.inputs: + # Look up @param description by matching input component names + description = _find_param_description(inp.name, param_docs) process.inputs.append( - ProcessInput(name=inp.name, type=inp.type, qualifier=inp.qualifier) + ProcessInput( + name=inp.name, + type=inp.type, + qualifier=inp.qualifier, + description=description, + ) + ) + for out in outputs: + # Look up @return description by emit name + description = param_docs.get(f"{RETURN_KEY_PREFIX}{out.emit}", "") + if not description: + description = param_docs.get(RETURN_KEY_UNNAMED, "") + process.outputs.append( + ProcessOutput( + name=out.name, + type=out.type, + emit=out.emit, + description=description, + ) ) - for out in parsed.outputs: - process.outputs.append(ProcessOutput(name=out.name, type=out.type, emit=out.emit)) return process @@ -728,7 +787,7 @@ def _create_function_from_signature( line=line, end_line=end_line, 
source_url=source_url, - return_description=param_docs.get("_return", ""), + return_description=param_docs.get(RETURN_KEY_UNNAMED, ""), ) # Parse function signature: def name(param1: Type, param2: Type) -> ReturnType @@ -759,3 +818,186 @@ def _create_function_from_signature( ) return function + + +def _find_param_description(input_name: str, param_docs: dict[str, str]) -> str: + """Find a ``@param`` description that matches a parsed input name. + + Input names may be composite (e.g. ``val(meta), path(bam)``) so we check + each component name against the ``param_docs`` keys. For a tuple input + the descriptions of all matching components are joined. + + Args: + input_name: The parsed input name (e.g. ``"val(meta), path(bam)"`` or ``"reads"``). + param_docs: Dict mapping param names to their Groovydoc descriptions. + + Returns: + The matching description, or an empty string if none found. + """ + if not param_docs: + return "" + + # Direct match (simple inputs like "reads") + if input_name in param_docs: + return param_docs[input_name] + + # For composite/tuple inputs, extract component names and look them up + # Matches val(meta), path(bam), file(reads), env(x), etc. + component_names = re.findall(r"(?:val|path|file|env)\((\w+)\)", input_name) + if not component_names: + return "" + + descriptions = [] + for comp_name in component_names: + if comp_name in param_docs: + descriptions.append(f"`{comp_name}`: {param_docs[comp_name]}") + + return "; ".join(descriptions) if descriptions else "" + + +def _parse_groovydoc_from_source( + source: str, + process_name: str, +) -> tuple[str, dict[str, str]]: + """Parse a Groovydoc comment from ``.nf`` source text. + + When the Nextflow LSP does not return a docstring (common with typed + processes), this function extracts the ``/** ... */`` comment block + preceding the process definition. + + Supports two documentation styles: + + 1. 
Standard Groovydoc ``@param`` / ``@return`` tags:: + + /** @param meta Sample metadata map + * @return txt Output text file + */ + + 2. Bullet-list ``Inputs:`` / ``Outputs:`` sections:: + + /** Inputs: + * - - meta: sample metadata map + * - bam: input BAM file + * Outputs: + * - - txt: output text file + */ + + Args: + source: The ``.nf`` source file contents. + process_name: Name of the process to locate. + + Returns: + Tuple of ``(docstring, param_docs)`` where *docstring* is the free-text + description and *param_docs* maps param/return names to descriptions. + Returns ``("", {})`` if no Groovydoc is found. + """ + + # Find the process declaration, then look backwards for the nearest /** ... */ + proc_pattern = re.compile( + r"process\s+" + re.escape(process_name) + r"\s*\{", + ) + proc_match = proc_pattern.search(source) + if not proc_match: + return "", {} + + # Search the text before the process for the last /** ... */ comment + preceding = source[: proc_match.start()] + # Find all /** ... */ blocks and take the last one (closest to the process) + comment_matches = list(re.finditer(r"/\*\*(.*?)\*/", preceding, re.DOTALL)) + if not comment_matches: + return "", {} + + comment_body = comment_matches[-1].group(1) + return _parse_groovydoc_comment(comment_body) + + +def _parse_groovydoc_comment(comment_body: str) -> tuple[str, dict[str, str]]: + """Parse the body of a ``/** ... */`` Groovydoc comment. + + Handles both ``@param``/``@return`` tags and ``Inputs:``/``Outputs:`` + bullet-list sections. + + Args: + comment_body: The text between ``/**`` and ``*/``. + + Returns: + Tuple of ``(docstring, param_docs)``. 
+ """ + lines = comment_body.split("\n") + doc_lines: list[str] = [] + params: dict[str, str] = {} + current_section = "description" + current_param = "" + + for raw_line in lines: + # Strip leading whitespace and * characters (Groovydoc format) + line = raw_line.strip() + if line.startswith("*"): + line = line[1:].strip() + + if not line: + if current_section == "description" and doc_lines: + doc_lines.append("") # preserve paragraph breaks + continue + + # --- Standard @param / @return tags --- + + # Check for @param tag + param_match = re.match(r"@param\s+(\w+)\s*(.*)", line) + if param_match: + current_section = "param" + current_param = param_match.group(1) + params[current_param] = param_match.group(2).strip() + continue + + # Check for @return tag (named: @return name desc) + return_named = re.match(r"@returns?\s+(\w+)\s+(.*)", line) + if return_named: + current_section = "return" + current_param = f"{RETURN_KEY_PREFIX}{return_named.group(1)}" + params[current_param] = return_named.group(2).strip() + continue + + # Check for @return tag (unnamed: @return desc) + return_unnamed = re.match(r"@returns?\s*(.*)", line) + if return_unnamed: + current_section = "return" + current_param = RETURN_KEY_UNNAMED + params[RETURN_KEY_UNNAMED] = return_unnamed.group(1).strip() + continue + + # --- Bullet-list Inputs: / Outputs: sections --- + + # Check for section headers + if re.match(r"Inputs?:", line, re.IGNORECASE): + current_section = "input_bullets" + continue + if re.match(r"Outputs?:", line, re.IGNORECASE): + current_section = "output_bullets" + continue + + # Check for bullet items: "- name: description" or "- - name: description" + bullet_match = re.match(r"-\s*-?\s*(\w+)\s*:\s*(.*)", line) + if bullet_match and current_section in ("input_bullets", "output_bullets"): + name = bullet_match.group(1) + desc = bullet_match.group(2).strip() + if current_section == "input_bullets": + current_param = name + params[name] = desc + else: + current_param = 
f"{RETURN_KEY_PREFIX}{name}" + params[current_param] = desc + continue + + # Handle continuation lines + if current_section == "description": + # Skip the process name if it's the first line (e.g. "/** SV_PILEUP") + if not doc_lines and re.match(r"^[A-Z_0-9]+$", line): + continue + doc_lines.append(line) + elif current_section in ("param", "return", "input_bullets", "output_bullets"): + if current_param and line and not line.startswith("@"): + params[current_param] += " " + line + + docstring = "\n".join(doc_lines).strip() + return docstring, params diff --git a/src/nf_docs/lsp_client.py b/src/nf_docs/lsp_client.py index 8f251dd..618afca 100644 --- a/src/nf_docs/lsp_client.py +++ b/src/nf_docs/lsp_client.py @@ -24,6 +24,7 @@ import httpx from nf_docs.nextflow_env import get_isolated_env +from nf_docs.nf_parser import RETURN_KEY_PREFIX, RETURN_KEY_UNNAMED from nf_docs.progress import ( ExtractionPhase, ProgressCallbackType, @@ -872,18 +873,27 @@ def parse_hover_content(hover: dict[str, Any] | None) -> tuple[str, str, dict[st params[current_param] = param_match.group(2).strip() continue - # Check for @return tag - return_match = re.match(r"@returns?\s*(.*)", line_stripped) + # Check for @return tag (supports both named and unnamed forms) + # Named: @return txt Description of output + # Unnamed: @return Description of single return value + return_match = re.match(r"@returns?\s+(\w+)\s+(.*)", line_stripped) if return_match: current_section = "return" - params["_return"] = return_match.group(1).strip() + current_param = f"{RETURN_KEY_PREFIX}{return_match.group(1)}" + params[current_param] = return_match.group(2).strip() + continue + return_unnamed = re.match(r"@returns?\s*(.*)", line_stripped) + if return_unnamed: + current_section = "return" + current_param = RETURN_KEY_UNNAMED + params[RETURN_KEY_UNNAMED] = return_unnamed.group(1).strip() continue # Handle continuation lines if current_section == "description": if line_stripped and not line_stripped.startswith("@"): 
doc_lines.append(line_stripped) - elif current_section == "param" and current_param: + elif current_section in ("param", "return") and current_param: if line_stripped and not line_stripped.startswith("@"): params[current_param] += " " + line_stripped diff --git a/src/nf_docs/nf_parser.py b/src/nf_docs/nf_parser.py index 1701879..a91ebb8 100644 --- a/src/nf_docs/nf_parser.py +++ b/src/nf_docs/nf_parser.py @@ -5,9 +5,29 @@ to extract inputs, outputs, and other metadata. """ +import logging import re from dataclasses import dataclass, field +# Traditional Nextflow input qualifiers that should not be treated as typed variable names. +# Used to prevent false matches like "each: sample" being parsed as typed input "each" of type "sample". +_TRADITIONAL_QUALIFIERS = frozenset({"val", "path", "file", "env", "stdin", "tuple", "each"}) + +# Key prefix used in param_docs dicts to distinguish @return entries from @param entries. +RETURN_KEY_PREFIX = "_return_" +RETURN_KEY_UNNAMED = "_return" + +# Regexes for extracting input/output sections from a process body. +# The terminators cover every Nextflow process section keyword. 
+_INPUT_SECTION_RE = re.compile(
+    r"input:\s*(.*?)(?:output:|topic:|when:|script:|shell:|exec:|stub:|\})", re.DOTALL
+)
+_OUTPUT_SECTION_RE = re.compile(
+    r"output:\s*(.*?)(?:topic:|when:|script:|shell:|exec:|stub:|\})", re.DOTALL
+)
+
+logger = logging.getLogger(__name__)
+
+
 @dataclass
 class ParsedInput:
@@ -84,13 +104,13 @@ def parse_process_hover(hover_text: str) -> ParsedProcess | None:
     process = ParsedProcess(name=name_match.group(1))

     # Extract input section
-    input_match = re.search(r"input:\s*(.*?)(?:output:|script:|shell:|exec:|\})", code, re.DOTALL)
+    input_match = _INPUT_SECTION_RE.search(code)
     if input_match:
         input_block = input_match.group(1)
         process.inputs = _parse_input_declarations(input_block)

     # Extract output section
-    output_match = re.search(r"output:\s*(.*?)(?:script:|shell:|exec:|when:|\})", code, re.DOTALL)
+    output_match = _OUTPUT_SECTION_RE.search(code)
     if output_match:
         output_block = output_match.group(1)
         process.outputs = _parse_output_declarations(output_block)
@@ -118,11 +138,49 @@
 def _parse_single_input(line: str) -> ParsedInput | None:
-    """Parse a single input declaration line."""
+    """Parse a single input declaration line.
+ + Handles both traditional DSL2 syntax and typed syntax: + - Traditional: ``tuple val(meta), path(reads)`` + - Typed tuple: ``(meta, bam): Tuple`` + - Typed simple: ``x: Integer``, ``bam: Path`` + """ line = line.strip() if not line: return None + # Handle typed destructured tuple: (meta, bam): Tuple + typed_tuple_match = re.match(r"\(([^)]+)\)\s*:\s*Tuple\s*<([^>]+)>", line) + if typed_tuple_match: + names_str = typed_tuple_match.group(1) + types_str = typed_tuple_match.group(2) + names = [n.strip() for n in names_str.split(",")] + types = [t.strip() for t in types_str.split(",")] + # Build descriptive components pairing names with their types + parts = [] + for i, name in enumerate(names): + if i < len(types): + type_name = types[i] + # Map Nextflow types to traditional qualifiers for display + qualifier = _typed_to_qualifier(type_name) + parts.append(f"{qualifier}({name})") + else: + parts.append(f"val({name})") + return ParsedInput( + name=", ".join(parts), + type="tuple", + qualifier=f"Tuple<{types_str}>", + ) + + # Handle typed simple input: x: Integer, bam: Path + # Exclude traditional Nextflow qualifiers that could false-match (e.g. "each: sample") + typed_simple_match = re.match(r"(\w+)\s*:\s*(\w+)$", line) + if typed_simple_match and typed_simple_match.group(1) not in _TRADITIONAL_QUALIFIERS: + name = typed_simple_match.group(1) + type_name = typed_simple_match.group(2) + qualifier = _typed_to_qualifier(type_name) + return ParsedInput(name=name, type=qualifier, qualifier=type_name) + # Handle tuple inputs: tuple val(meta), path(reads) if line.startswith("tuple"): # Extract components @@ -154,6 +212,24 @@ def _parse_single_input(line: str) -> ParsedInput | None: return None +def _typed_to_qualifier(type_name: str) -> str: + """Map a Nextflow type annotation to a traditional qualifier name. + + The Nextflow LSP may represent unknown or unresolved types as ``?``. + + Args: + type_name: The type annotation (e.g. ``Path``, ``Map``, ``Integer``, ``?``). 
+ + Returns: + The corresponding traditional qualifier (``val``, ``path``, etc.). + """ + path_types = {"Path", "File"} + if type_name in path_types: + return "path" + # Everything else maps to val (Map, Integer, String, Boolean, ?, etc.) + return "val" + + def _parse_output_declarations(block: str) -> list[ParsedOutput]: """Parse output declarations from an output block.""" outputs = [] @@ -174,11 +250,30 @@ def _parse_output_declarations(block: str) -> list[ParsedOutput]: def _parse_single_output(line: str) -> ParsedOutput | None: - """Parse a single output declaration line.""" + """Parse a single output declaration line. + + Handles both traditional DSL2 syntax and typed/named assignment syntax: + - Traditional: ``tuple val(meta), path("*.html"), emit: html`` + - Named assignment: ``txt = tuple(meta, file("*_svpileup.txt"))`` + - Named simple: ``bam = file("*.bam")`` + - Topic publish: ``>> 'topic_name'`` + """ line = line.strip() if not line: return None + # Skip topic publish lines: >> 'topic_name' + if line.startswith(">>"): + return None + + # Handle named assignment outputs: name = tuple(meta, file("*_svpileup.txt")) + # or: name = file("*.bam"), name = val(something) + named_match = re.match(r"(\w+)\s*=\s*(.+)", line) + if named_match: + emit_name = named_match.group(1) + rhs = named_match.group(2).strip() + return _parse_named_output(emit_name, rhs) + # Extract emit name if present emit_match = re.search(r"emit:\s*(\w+)", line) emit_name = emit_match.group(1) if emit_match else "" @@ -205,9 +300,92 @@ def _parse_single_output(line: str) -> ParsedOutput | None: ) return ParsedOutput(name=name, type=output_type, emit=emit_name, optional=optional) + # Handle bare emit names from typed outputs (LSP returns just the name, e.g. 
"txt", "bam") + bare_name_match = re.match(r"(\w+)$", line) + if bare_name_match: + name = bare_name_match.group(1) + return ParsedOutput(name=name, type="", emit=name) + return None +def _parse_tuple_components(inner: str) -> list[str]: + """Parse the components inside a ``tuple(...)`` expression. + + Handles mixed content where some elements are qualified + (``file("...")``, ``val(x)``, ``path("...")``) and others are bare + names (``meta``, ``bam``). Bare names are wrapped as ``val(name)``. + + Args: + inner: The content inside the outer ``tuple()`` parentheses, + e.g. ``'meta, file("*_svpileup.txt")'``. + + Returns: + List of formatted component strings like ``["val(meta)", "file(\\"*_svpileup.txt\\")"]``. + """ + parts: list[str] = [] + # Match qualified components and their positions + qualifier_re = re.compile(r"(val|path|file|env)\s*\(([^)]+)\)") + last_end = 0 + + for m in qualifier_re.finditer(inner): + # Any text between the last match end and this match start may contain bare names + gap = inner[last_end : m.start()] + for bare in re.findall(r"\b(\w+)\b", gap): + parts.append(f"val({bare})") + parts.append(f"{m.group(1)}({m.group(2)})") + last_end = m.end() + + # Handle any trailing bare names after the last qualified component + gap = inner[last_end:] + for bare in re.findall(r"\b(\w+)\b", gap): + parts.append(f"val({bare})") + + # Fallback: if nothing was parsed, split by comma and wrap as val() + if not parts: + names = [n.strip() for n in inner.split(",")] + parts = [f"val({n})" for n in names if n] + + return parts + + +def _parse_named_output(emit_name: str, rhs: str) -> ParsedOutput: + """Parse the right-hand side of a named output assignment. + + Handles patterns like: + - ``tuple(meta, file("*_svpileup.txt"))`` + - ``file("*.bam")`` + - ``val(something)`` + - ``path("versions.yml")`` + + Args: + emit_name: The variable name (used as the emit name). + rhs: The right-hand side expression after ``=``. 
+ + Returns: + A ParsedOutput with the emit name and parsed type/components. + """ + rhs = rhs.strip() + + # Handle tuple(...): tuple(meta, file("*_svpileup.txt")) + tuple_match = re.match(r"tuple\s*\((.+)\)\s*$", rhs) + if tuple_match: + inner = tuple_match.group(1) + parts = _parse_tuple_components(inner) + name = ", ".join(parts) + return ParsedOutput(name=name, type="tuple", emit=emit_name) + + # Handle simple: file("*.bam"), path("versions.yml"), val(x) + simple_match = re.match(r"(val|path|file|env|stdout)\s*\(([^)]*)\)", rhs) + if simple_match: + output_type = simple_match.group(1) + name = simple_match.group(2).strip().strip("'\"") if simple_match.group(2) else output_type + return ParsedOutput(name=name, type=output_type, emit=emit_name) + + # Fallback: treat entire rhs as the name + return ParsedOutput(name=rhs, type="val", emit=emit_name) + + def parse_workflow_hover(hover_text: str) -> ParsedWorkflow | None: """ Parse a workflow definition from LSP hover content. @@ -284,3 +462,79 @@ def is_code_block(text: str) -> bool: return True return False + + +def enrich_outputs_from_source( + outputs: list[ParsedOutput], + source: str, + process_name: str, +) -> list[ParsedOutput]: + """Enrich bare-name outputs with full declarations parsed from ``.nf`` source text. + + When the Nextflow LSP returns bare emit names for typed process outputs + (e.g. ``txt``, ``bam`` instead of ``txt = tuple(meta, file("*.txt"))``), + this function parses the source text to find the named-assignment + output declarations and provide full type and component information. + + If a bare output's emit name matches a named assignment in the source, the + bare output is replaced with the richer parsed version. Outputs that are + already fully qualified (have a non-empty ``type``) are left untouched. + + Args: + outputs: The outputs parsed from the LSP hover (may contain bare names). + source: The ``.nf`` source file contents. + process_name: Name of the process to locate in the source. 
+ + Returns: + A new list of ``ParsedOutput`` objects, enriched where possible. + """ + # Only enrich if there are bare outputs (type is empty) + if not any(not o.type for o in outputs): + return outputs + + # Find the process block in source + proc_pattern = re.compile(rf"process\s+{re.escape(process_name)}\s*\{{", re.MULTILINE) + proc_match = proc_pattern.search(source) + if not proc_match: + logger.debug(f"Could not find process {process_name} in source") + return outputs + + # Extract the process body (find matching closing brace) + proc_start = proc_match.start() + brace_depth = 0 + proc_body = "" + for i in range(proc_match.end() - 1, len(source)): + if source[i] == "{": + brace_depth += 1 + elif source[i] == "}": + brace_depth -= 1 + if brace_depth == 0: + proc_body = source[proc_start : i + 1] + break + + if not proc_body: + return outputs + + # Extract the output section from the source process body + output_match = _OUTPUT_SECTION_RE.search(proc_body) + if not output_match: + return outputs + + # Parse the source output declarations + source_outputs = _parse_output_declarations(output_match.group(1)) + + # Build a lookup by emit name + source_by_emit: dict[str, ParsedOutput] = {} + for out in source_outputs: + if out.emit: + source_by_emit[out.emit] = out + + # Replace bare outputs with enriched versions + enriched = [] + for out in outputs: + if not out.type and out.emit in source_by_emit: + enriched.append(source_by_emit[out.emit]) + else: + enriched.append(out) + + return enriched diff --git a/src/nf_docs/renderers/html.py b/src/nf_docs/renderers/html.py index 047ecd9..cd36de8 100644 --- a/src/nf_docs/renderers/html.py +++ b/src/nf_docs/renderers/html.py @@ -254,12 +254,20 @@ def _slugify(text: str) -> str: return slug.strip("-") -def _add_heading_anchors(html: str) -> str: - """ - Add IDs and anchor links to headings in HTML. +def _add_heading_anchors(html: str, prefix: str = "") -> str: + """Add IDs and anchor links to headings in HTML. 
+    Transforms ``<h2>Title</h2>`` to::
+
+        <h2 id="title">
+            <a href="#title">Title</a>
+        </h2>
+
-    Transforms <h2>Title</h2> to:
-    <h2 id="title">Title</h2>
+ Args: + html: The HTML string to process. + prefix: Optional prefix for generated IDs (e.g. ``"readme-"``). + Used to avoid collisions with page-level section IDs when + the headings come from embedded content like a README. """ heading_pattern = re.compile(r"<(h[1-6])>(.+?)", re.IGNORECASE | re.DOTALL) @@ -271,7 +279,7 @@ def replace_heading(match: re.Match[str]) -> str: # Extract text content for the slug (strip any HTML tags) text_content = re.sub(r"<[^>]+>", "", content) - slug = _slugify(text_content) + slug = f"{prefix}{_slugify(text_content)}" # Handle duplicate IDs by appending a number if slug in seen_ids: @@ -289,12 +297,18 @@ def replace_heading(match: re.Match[str]) -> str: return heading_pattern.sub(replace_heading, html) -def md_to_html(text: str | None, add_anchors: bool = False) -> Markup: +def md_to_html( + text: str | None, + add_anchors: bool = False, + heading_id_prefix: str = "", +) -> Markup: """Convert markdown text to HTML, safe for Jinja templates. Args: - text: Markdown text to convert - add_anchors: If True, add IDs and anchor links to headings + text: Markdown text to convert. + add_anchors: If True, add IDs and anchor links to headings. + heading_id_prefix: Prefix for heading IDs (e.g. ``"readme-"``). + Only used when *add_anchors* is True. """ if not text: return Markup("") @@ -321,13 +335,17 @@ def md_to_html(text: str | None, add_anchors: bool = False) -> Markup: html = _process_github_alerts(html) # Optionally add heading anchors if add_anchors: - html = _add_heading_anchors(html) + html = _add_heading_anchors(html, prefix=heading_id_prefix) return Markup(html) def md_to_html_with_anchors(text: str | None) -> Markup: - """Convert markdown text to HTML with heading anchors.""" - return md_to_html(text, add_anchors=True) + """Convert markdown text to HTML with heading anchors. + + README heading IDs are prefixed with ``readme-`` to avoid collisions + with page-level section IDs (e.g. ``inputs``, ``processes``). 
+ """ + return md_to_html(text, add_anchors=True, heading_id_prefix="readme-") def add_word_breaks(text: str | None) -> Markup: diff --git a/src/nf_docs/templates/html.html b/src/nf_docs/templates/html.html index f483d52..c0f58b1 100644 --- a/src/nf_docs/templates/html.html +++ b/src/nf_docs/templates/html.html @@ -459,6 +459,10 @@ background-color: rgba(239, 68, 68, 0.2); /* red with opacity */ color: rgb(252 165 165); /* red-300 */ } + html.dark .bg-yellow-100 { + background-color: rgba(234, 179, 8, 0.2); /* yellow with opacity */ + color: rgb(253 224 71); /* yellow-300 */ + } html.dark .bg-green-100 { background-color: rgba(34, 197, 94, 0.2); /* green with opacity */ color: rgb(134 239 172); /* green-300 */ @@ -1472,6 +1476,12 @@
Pipeline Inputs
> {{ inp.format }} + {% endif %} {% if inp.pattern %} + + Pattern: {{ inp.pattern }} + {% endif %} {% if inp.required %} Inputs (take) >{{ inp.name|wbr }} - {{ inp.description or '-' }} + + {% if inp.description %} +
{{ inp.description|markdown }}
+ {% else %}-{% endif %}
+
 {% endfor %}
 {% endif %}
@@ -1708,7 +1722,11 @@
Outputs (emit)
>{{ out.name|wbr }}
- {{ out.description or '-' }}
+
+ {% if out.description %}
+ {{ out.description|markdown }}
+ {% else %}-{% endif %}
+
 {% endfor %}
 {% endif %}
@@ -1964,7 +1982,11 @@
Inputs
>{{ inp.type|wbr }}
- {{ inp.description or '-' }}
+
+ {% if inp.description %}
+ {{ inp.description|markdown }}
+ {% else %}-{% endif %}
+
 {% endfor %}
 {% endif %}
@@ -2040,7 +2062,11 @@
Outputs
> {% else %}-{% endif %}
- {{ out.description or '-' }}
+
+ {% if out.description %}
+ {{ out.description|markdown }}
+ {% else %}-{% endif %}
+
 {% endfor %}
 {% endif %}
@@ -2182,7 +2208,11 @@
Parameters
>{{ p.name|wbr }}
- {{ p.description or '-' }}
+
+ {% if p.description %}
+ {{ p.description|markdown }}
+ {% else %}-{% endif %}
+
 {% if p.default is not none %}
oklab,var(--color-black)50%,transparent)}}.bg-green-100{background-color:var(--color-green-100)}.bg-primary{background-color:var(--color-primary)}.bg-primary-light{background-color:var(--color-primary-light)}.bg-purple-100{background-color:var(--color-purple-100)}.bg-red-100{background-color:var(--color-red-100)}.bg-slate-50{background-color:var(--color-slate-50)}.bg-slate-100{background-color:var(--color-slate-100)}.bg-slate-200{background-color:var(--color-slate-200)}.bg-white{background-color:var(--color-white)}.p-1\.5{padding:calc(var(--spacing)*1.5)}.p-3{padding:calc(var(--spacing)*3)}.p-4{padding:calc(var(--spacing)*4)}.p-6{padding:calc(var(--spacing)*6)}.px-1\.5{padding-inline:calc(var(--spacing)*1.5)}.px-2{padding-inline:calc(var(--spacing)*2)}.px-3{padding-inline:calc(var(--spacing)*3)}.px-4{padding-inline:calc(var(--spacing)*4)}.py-0\.5{padding-block:calc(var(--spacing)*.5)}.py-1{padding-block:calc(var(--spacing)*1)}.py-1\.5{padding-block:calc(var(--spacing)*1.5)}.py-2{padding-block:calc(var(--spacing)*2)}.py-8{padding-block:calc(var(--spacing)*8)}.pt-3{padding-top:calc(var(--spacing)*3)}.pt-4{padding-top:calc(var(--spacing)*4)}.pt-14{padding-top:calc(var(--spacing)*14)}.pb-2{padding-bottom:calc(var(--spacing)*2)}.pb-6{padding-bottom:calc(var(--spacing)*6)}.text-center{text-align:center}.text-left{text-align:left}.font-mono{font-family:var(--font-mono)}.font-sans{font-family:var(--font-sans)}.text-2xl{font-size:var(--text-2xl);line-height:var(--tw-leading,var(--text-2xl--line-height))}.text-3xl{font-size:var(--text-3xl);line-height:var(--tw-leading,var(--text-3xl--line-height))}.text-lg{font-size:var(--text-lg);line-height:var(--tw-leading,var(--text-lg--line-height))}.text-sm{font-size:var(--text-sm);line-height:var(--tw-leading,var(--text-sm--line-height))}.text-xl{font-size:var(--text-xl);line-height:var(--tw-leading,var(--text-xl--line-height))}.text-xs{font-size:var(--text-xs);line-height:var(--tw-leading,var(--text-xs--line-height))}.leading-relaxed{
--tw-leading:var(--leading-relaxed);line-height:var(--leading-relaxed)}.font-bold{--tw-font-weight:var(--font-weight-bold);font-weight:var(--font-weight-bold)}.font-medium{--tw-font-weight:var(--font-weight-medium);font-weight:var(--font-weight-medium)}.font-normal{--tw-font-weight:var(--font-weight-normal);font-weight:var(--font-weight-normal)}.font-semibold{--tw-font-weight:var(--font-weight-semibold);font-weight:var(--font-weight-semibold)}.tracking-wide{--tw-tracking:var(--tracking-wide);letter-spacing:var(--tracking-wide)}.text-green-700{color:var(--color-green-700)}.text-primary{color:var(--color-primary)}.text-primary-dark{color:var(--color-primary-dark)}.text-purple-700{color:var(--color-purple-700)}.text-red-600{color:var(--color-red-600)}.text-slate-300{color:var(--color-slate-300)}.text-slate-400{color:var(--color-slate-400)}.text-slate-500{color:var(--color-slate-500)}.text-slate-600{color:var(--color-slate-600)}.text-slate-700{color:var(--color-slate-700)}.text-slate-800{color:var(--color-slate-800)}.text-white{color:var(--color-white)}.uppercase{text-transform:uppercase}.italic{font-style:italic}.no-underline{text-decoration-line:none}.shadow-xl{--tw-shadow:0 20px 25px -5px var(--tw-shadow-color,#0000001a),0 8px 10px -6px 
var(--tw-shadow-color,#0000001a);box-shadow:var(--tw-inset-shadow),var(--tw-inset-ring-shadow),var(--tw-ring-offset-shadow),var(--tw-ring-shadow),var(--tw-shadow)}.transition{transition-property:color,background-color,border-color,outline-color,text-decoration-color,fill,stroke,--tw-gradient-from,--tw-gradient-via,--tw-gradient-to,opacity,box-shadow,transform,translate,scale,rotate,filter,-webkit-backdrop-filter,backdrop-filter,display,content-visibility,overlay,pointer-events;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.transition-colors{transition-property:color,background-color,border-color,outline-color,text-decoration-color,fill,stroke,--tw-gradient-from,--tw-gradient-via,--tw-gradient-to;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.transition-opacity{transition-property:opacity;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.transition-transform{transition-property:transform,translate,scale,rotate;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.duration-300{--tw-duration:.3s;transition-duration:.3s}.ease-in-out{--tw-ease:var(--ease-in-out);transition-timing-function:var(--ease-in-out)}.last\:border-b-0:last-child{border-bottom-style:var(--tw-border-style);border-bottom-width:0}@media 
(hover:hover){.hover\:border-primary:hover{border-color:var(--color-primary)}.hover\:bg-slate-50:hover{background-color:var(--color-slate-50)}.hover\:bg-slate-100:hover{background-color:var(--color-slate-100)}.hover\:text-primary:hover{color:var(--color-primary)}.hover\:text-slate-600:hover{color:var(--color-slate-600)}.hover\:text-slate-900:hover{color:var(--color-slate-900)}.hover\:underline:hover{text-decoration-line:underline}.hover\:opacity-80:hover{opacity:.8}}.focus\:border-transparent:focus{border-color:#0000}.focus\:ring-2:focus{--tw-ring-shadow:var(--tw-ring-inset,)0 0 0 calc(2px + var(--tw-ring-offset-width))var(--tw-ring-color,currentcolor);box-shadow:var(--tw-inset-shadow),var(--tw-inset-ring-shadow),var(--tw-ring-offset-shadow),var(--tw-ring-shadow),var(--tw-shadow)}.focus\:ring-primary:focus{--tw-ring-color:var(--color-primary)}.focus\:outline-none:focus{--tw-outline-style:none;outline-style:none}@media (min-width:40rem){.sm\:inline{display:inline}}@media (min-width:48rem){.md\:ml-56{margin-left:calc(var(--spacing)*56)}.md\:block{display:block}.md\:hidden{display:none}.md\:w-64{width:calc(var(--spacing)*64)}.md\:p-8{padding:calc(var(--spacing)*8)}.md\:px-6{padding-inline:calc(var(--spacing)*6)}}@media (min-width:64rem){.lg\:mr-64{margin-right:calc(var(--spacing)*64)}.lg\:block{display:block}.lg\:p-12{padding:calc(var(--spacing)*12)}}}@property --tw-translate-x{syntax:"*";inherits:false;initial-value:0}@property --tw-translate-y{syntax:"*";inherits:false;initial-value:0}@property --tw-translate-z{syntax:"*";inherits:false;initial-value:0}@property --tw-rotate-x{syntax:"*";inherits:false}@property --tw-rotate-y{syntax:"*";inherits:false}@property --tw-rotate-z{syntax:"*";inherits:false}@property --tw-skew-x{syntax:"*";inherits:false}@property --tw-skew-y{syntax:"*";inherits:false}@property --tw-space-y-reverse{syntax:"*";inherits:false;initial-value:0}@property --tw-border-style{syntax:"*";inherits:false;initial-value:solid}@property 
--tw-leading{syntax:"*";inherits:false}@property --tw-font-weight{syntax:"*";inherits:false}@property --tw-tracking{syntax:"*";inherits:false}@property --tw-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-shadow-color{syntax:"*";inherits:false}@property --tw-shadow-alpha{syntax:"";inherits:false;initial-value:100%}@property --tw-inset-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-inset-shadow-color{syntax:"*";inherits:false}@property --tw-inset-shadow-alpha{syntax:"";inherits:false;initial-value:100%}@property --tw-ring-color{syntax:"*";inherits:false}@property --tw-ring-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-inset-ring-color{syntax:"*";inherits:false}@property --tw-inset-ring-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-ring-inset{syntax:"*";inherits:false}@property --tw-ring-offset-width{syntax:"";inherits:false;initial-value:0}@property --tw-ring-offset-color{syntax:"*";inherits:false;initial-value:#fff}@property --tw-ring-offset-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-duration{syntax:"*";inherits:false}@property --tw-ease{syntax:"*";inherits:false} \ No newline at end of file +@layer properties{@supports (((-webkit-hyphens:none)) and (not (margin-trim:inline))) or ((-moz-orient:inline) and (not (color:rgb(from red r g b)))){*,:before,:after,::backdrop{--tw-translate-x:0;--tw-translate-y:0;--tw-translate-z:0;--tw-rotate-x:initial;--tw-rotate-y:initial;--tw-rotate-z:initial;--tw-skew-x:initial;--tw-skew-y:initial;--tw-space-y-reverse:0;--tw-border-style:solid;--tw-leading:initial;--tw-font-weight:initial;--tw-tracking:initial;--tw-shadow:0 0 #0000;--tw-shadow-color:initial;--tw-shadow-alpha:100%;--tw-inset-shadow:0 0 #0000;--tw-inset-shadow-color:initial;--tw-inset-shadow-alpha:100%;--tw-ring-color:initial;--tw-ring-shadow:0 0 #0000;--tw-inset-ring-color:initial;--tw-inset-ring-shadow:0 0 
#0000;--tw-ring-inset:initial;--tw-ring-offset-width:0px;--tw-ring-offset-color:#fff;--tw-ring-offset-shadow:0 0 #0000;--tw-duration:initial;--tw-ease:initial}}}@layer theme{:root,:host{--font-sans:ui-sans-serif,system-ui,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol","Noto Color Emoji";--font-mono:ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;--color-red-100:oklch(93.6% .032 17.717);--color-red-600:oklch(57.7% .245 27.325);--color-yellow-100:oklch(97.3% .071 103.193);--color-yellow-800:oklch(47.6% .114 61.907);--color-green-100:oklch(96.2% .044 156.743);--color-green-700:oklch(52.7% .154 150.069);--color-purple-100:oklch(94.6% .033 307.174);--color-purple-700:oklch(49.6% .265 301.924);--color-slate-50:oklch(98.4% .003 247.858);--color-slate-100:oklch(96.8% .007 247.896);--color-slate-200:oklch(92.9% .013 255.508);--color-slate-300:oklch(86.9% .022 252.894);--color-slate-400:oklch(70.4% .04 256.788);--color-slate-500:oklch(55.4% .046 257.417);--color-slate-600:oklch(44.6% .043 257.281);--color-slate-700:oklch(37.2% .044 257.287);--color-slate-800:oklch(27.9% .041 260.031);--color-slate-900:oklch(20.8% .042 
265.755);--color-black:#000;--color-white:#fff;--spacing:.25rem;--text-xs:.75rem;--text-xs--line-height:calc(1/.75);--text-sm:.875rem;--text-sm--line-height:calc(1.25/.875);--text-lg:1.125rem;--text-lg--line-height:calc(1.75/1.125);--text-xl:1.25rem;--text-xl--line-height:calc(1.75/1.25);--text-2xl:1.5rem;--text-2xl--line-height:calc(2/1.5);--text-3xl:1.875rem;--text-3xl--line-height:calc(2.25/1.875);--font-weight-normal:400;--font-weight-medium:500;--font-weight-semibold:600;--font-weight-bold:700;--tracking-wide:.025em;--leading-relaxed:1.625;--radius-md:.375rem;--radius-lg:.5rem;--ease-in-out:cubic-bezier(.4,0,.2,1);--default-transition-duration:.15s;--default-transition-timing-function:cubic-bezier(.4,0,.2,1);--default-font-family:var(--font-sans);--default-mono-font-family:var(--font-mono);--color-primary:#0dc09d;--color-primary-light:#e6f9f5;--color-primary-dark:#0a9a7d}}@layer base{*,:after,:before,::backdrop{box-sizing:border-box;border:0 solid;margin:0;padding:0}::file-selector-button{box-sizing:border-box;border:0 solid;margin:0;padding:0}html,:host{-webkit-text-size-adjust:100%;tab-size:4;line-height:1.5;font-family:var(--default-font-family,ui-sans-serif,system-ui,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol","Noto Color Emoji");font-feature-settings:var(--default-font-feature-settings,normal);font-variation-settings:var(--default-font-variation-settings,normal);-webkit-tap-highlight-color:transparent}hr{height:0;color:inherit;border-top-width:1px}abbr:where([title]){-webkit-text-decoration:underline dotted;text-decoration:underline dotted}h1,h2,h3,h4,h5,h6{font-size:inherit;font-weight:inherit}a{color:inherit;-webkit-text-decoration:inherit;-webkit-text-decoration:inherit;-webkit-text-decoration:inherit;text-decoration:inherit}b,strong{font-weight:bolder}code,kbd,samp,pre{font-family:var(--default-mono-font-family,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier 
New",monospace);font-feature-settings:var(--default-mono-font-feature-settings,normal);font-variation-settings:var(--default-mono-font-variation-settings,normal);font-size:1em}small{font-size:80%}sub,sup{vertical-align:baseline;font-size:75%;line-height:0;position:relative}sub{bottom:-.25em}sup{top:-.5em}table{text-indent:0;border-color:inherit;border-collapse:collapse}:-moz-focusring{outline:auto}progress{vertical-align:baseline}summary{display:list-item}ol,ul,menu{list-style:none}img,svg,video,canvas,audio,iframe,embed,object{vertical-align:middle;display:block}img,video{max-width:100%;height:auto}button,input,select,optgroup,textarea{font:inherit;font-feature-settings:inherit;font-variation-settings:inherit;letter-spacing:inherit;color:inherit;opacity:1;background-color:#0000;border-radius:0}::file-selector-button{font:inherit;font-feature-settings:inherit;font-variation-settings:inherit;letter-spacing:inherit;color:inherit;opacity:1;background-color:#0000;border-radius:0}:where(select:is([multiple],[size])) optgroup{font-weight:bolder}:where(select:is([multiple],[size])) optgroup option{padding-inline-start:20px}::file-selector-button{margin-inline-end:4px}::placeholder{opacity:1}@supports (not ((-webkit-appearance:-apple-pay-button))) or (contain-intrinsic-size:1px){::placeholder{color:currentColor}@supports (color:color-mix(in lab, red, red)){::placeholder{color:color-mix(in oklab,currentcolor 
50%,transparent)}}}textarea{resize:vertical}::-webkit-search-decoration{-webkit-appearance:none}::-webkit-date-and-time-value{min-height:1lh;text-align:inherit}::-webkit-datetime-edit{display:inline-flex}::-webkit-datetime-edit-fields-wrapper{padding:0}::-webkit-datetime-edit{padding-block:0}::-webkit-datetime-edit-year-field{padding-block:0}::-webkit-datetime-edit-month-field{padding-block:0}::-webkit-datetime-edit-day-field{padding-block:0}::-webkit-datetime-edit-hour-field{padding-block:0}::-webkit-datetime-edit-minute-field{padding-block:0}::-webkit-datetime-edit-second-field{padding-block:0}::-webkit-datetime-edit-millisecond-field{padding-block:0}::-webkit-datetime-edit-meridiem-field{padding-block:0}::-webkit-calendar-picker-indicator{line-height:1}:-moz-ui-invalid{box-shadow:none}button,input:where([type=button],[type=reset],[type=submit]){appearance:button}::file-selector-button{appearance:button}::-webkit-inner-spin-button{height:auto}::-webkit-outer-spin-button{height:auto}[hidden]:where(:not([hidden=until-found])){display:none!important}}@layer components;@layer utilities{.collapse{visibility:collapse}.visible{visibility:visible}.absolute{position:absolute}.fixed{position:fixed}.relative{position:relative}.sticky{position:sticky}.inset-0{inset:calc(var(--spacing)*0)}.top-0{top:calc(var(--spacing)*0)}.top-1\/2{top:50%}.top-14{top:calc(var(--spacing)*14)}.right-0{right:calc(var(--spacing)*0)}.right-2{right:calc(var(--spacing)*2)}.left-0{left:calc(var(--spacing)*0)}.z-10{z-index:10}.z-40{z-index:40}.z-50{z-index:50}.container{width:100%}@media (min-width:40rem){.container{max-width:40rem}}@media (min-width:48rem){.container{max-width:48rem}}@media (min-width:64rem){.container{max-width:64rem}}@media (min-width:80rem){.container{max-width:80rem}}@media 
(min-width:96rem){.container{max-width:96rem}}.mx-auto{margin-inline:auto}.my-2{margin-block:calc(var(--spacing)*2)}.mt-1{margin-top:calc(var(--spacing)*1)}.mt-2{margin-top:calc(var(--spacing)*2)}.mt-4{margin-top:calc(var(--spacing)*4)}.mt-6{margin-top:calc(var(--spacing)*6)}.mt-8{margin-top:calc(var(--spacing)*8)}.mr-2{margin-right:calc(var(--spacing)*2)}.mr-3{margin-right:calc(var(--spacing)*3)}.mr-4{margin-right:calc(var(--spacing)*4)}.mb-1{margin-bottom:calc(var(--spacing)*1)}.mb-2{margin-bottom:calc(var(--spacing)*2)}.mb-3{margin-bottom:calc(var(--spacing)*3)}.mb-4{margin-bottom:calc(var(--spacing)*4)}.mb-6{margin-bottom:calc(var(--spacing)*6)}.mb-8{margin-bottom:calc(var(--spacing)*8)}.ml-2{margin-left:calc(var(--spacing)*2)}.ml-auto{margin-left:auto}.block{display:block}.contents{display:contents}.flex{display:flex}.hidden{display:none}.inline-block{display:inline-block}.inline-flex{display:inline-flex}.table{display:table}.h-4{height:calc(var(--spacing)*4)}.h-5{height:calc(var(--spacing)*5)}.h-6{height:calc(var(--spacing)*6)}.h-12{height:calc(var(--spacing)*12)}.h-14{height:calc(var(--spacing)*14)}.h-\[calc\(100vh-3\.5rem\)\]{height:calc(100vh - 
3.5rem)}.h-full{height:100%}.min-h-screen{min-height:100vh}.w-4{width:calc(var(--spacing)*4)}.w-5{width:calc(var(--spacing)*5)}.w-6{width:calc(var(--spacing)*6)}.w-12{width:calc(var(--spacing)*12)}.w-48{width:calc(var(--spacing)*48)}.w-56{width:calc(var(--spacing)*56)}.w-64{width:calc(var(--spacing)*64)}.w-72{width:calc(var(--spacing)*72)}.w-full{width:100%}.max-w-none{max-width:none}.flex-1{flex:1}.flex-shrink{flex-shrink:1}.flex-shrink-0{flex-shrink:0}.border-collapse{border-collapse:collapse}.-translate-x-full{--tw-translate-x:-100%;translate:var(--tw-translate-x)var(--tw-translate-y)}.-translate-y-1\/2{--tw-translate-y:calc(calc(1/2*100%)*-1);translate:var(--tw-translate-x)var(--tw-translate-y)}.transform{transform:var(--tw-rotate-x,)var(--tw-rotate-y,)var(--tw-rotate-z,)var(--tw-skew-x,)var(--tw-skew-y,)}.cursor-pointer{cursor:pointer}.list-inside{list-style-position:inside}.list-disc{list-style-type:disc}.flex-col{flex-direction:column}.flex-wrap{flex-wrap:wrap}.items-center{align-items:center}.justify-between{justify-content:space-between}.gap-1{gap:calc(var(--spacing)*1)}.gap-1\.5{gap:calc(var(--spacing)*1.5)}.gap-2{gap:calc(var(--spacing)*2)}.gap-3{gap:calc(var(--spacing)*3)}.gap-4{gap:calc(var(--spacing)*4)}:where(.space-y-0\.5>:not(:last-child)){--tw-space-y-reverse:0;margin-block-start:calc(calc(var(--spacing)*.5)*var(--tw-space-y-reverse));margin-block-end:calc(calc(var(--spacing)*.5)*calc(1 - var(--tw-space-y-reverse)))}:where(.space-y-1>:not(:last-child)){--tw-space-y-reverse:0;margin-block-start:calc(calc(var(--spacing)*1)*var(--tw-space-y-reverse));margin-block-end:calc(calc(var(--spacing)*1)*calc(1 - var(--tw-space-y-reverse)))}:where(.space-y-3>:not(:last-child)){--tw-space-y-reverse:0;margin-block-start:calc(calc(var(--spacing)*3)*var(--tw-space-y-reverse));margin-block-end:calc(calc(var(--spacing)*3)*calc(1 - 
var(--tw-space-y-reverse)))}.truncate{text-overflow:ellipsis;white-space:nowrap;overflow:hidden}.overflow-x-auto{overflow-x:auto}.overflow-y-auto{overflow-y:auto}.rounded{border-radius:.25rem}.rounded-full{border-radius:3.40282e38px}.rounded-lg{border-radius:var(--radius-lg)}.rounded-md{border-radius:var(--radius-md)}.border{border-style:var(--tw-border-style);border-width:1px}.border-t{border-top-style:var(--tw-border-style);border-top-width:1px}.border-r{border-right-style:var(--tw-border-style);border-right-width:1px}.border-b{border-bottom-style:var(--tw-border-style);border-bottom-width:1px}.border-l{border-left-style:var(--tw-border-style);border-left-width:1px}.border-l-4{border-left-style:var(--tw-border-style);border-left-width:4px}.border-primary{border-color:var(--color-primary)}.border-slate-200{border-color:var(--color-slate-200)}.border-slate-300{border-color:var(--color-slate-300)}.bg-black\/50{background-color:#00000080}@supports (color:color-mix(in lab, red, red)){.bg-black\/50{background-color:color-mix(in 
oklab,var(--color-black)50%,transparent)}}.bg-green-100{background-color:var(--color-green-100)}.bg-primary{background-color:var(--color-primary)}.bg-primary-light{background-color:var(--color-primary-light)}.bg-purple-100{background-color:var(--color-purple-100)}.bg-red-100{background-color:var(--color-red-100)}.bg-slate-50{background-color:var(--color-slate-50)}.bg-slate-100{background-color:var(--color-slate-100)}.bg-slate-200{background-color:var(--color-slate-200)}.bg-white{background-color:var(--color-white)}.bg-yellow-100{background-color:var(--color-yellow-100)}.p-1\.5{padding:calc(var(--spacing)*1.5)}.p-3{padding:calc(var(--spacing)*3)}.p-4{padding:calc(var(--spacing)*4)}.p-6{padding:calc(var(--spacing)*6)}.px-1\.5{padding-inline:calc(var(--spacing)*1.5)}.px-2{padding-inline:calc(var(--spacing)*2)}.px-3{padding-inline:calc(var(--spacing)*3)}.px-4{padding-inline:calc(var(--spacing)*4)}.py-0\.5{padding-block:calc(var(--spacing)*.5)}.py-1{padding-block:calc(var(--spacing)*1)}.py-1\.5{padding-block:calc(var(--spacing)*1.5)}.py-2{padding-block:calc(var(--spacing)*2)}.py-8{padding-block:calc(var(--spacing)*8)}.pt-3{padding-top:calc(var(--spacing)*3)}.pt-4{padding-top:calc(var(--spacing)*4)}.pt-14{padding-top:calc(var(--spacing)*14)}.pb-2{padding-bottom:calc(var(--spacing)*2)}.pb-6{padding-bottom:calc(var(--spacing)*6)}.text-center{text-align:center}.text-left{text-align:left}.font-mono{font-family:var(--font-mono)}.font-sans{font-family:var(--font-sans)}.text-2xl{font-size:var(--text-2xl);line-height:var(--tw-leading,var(--text-2xl--line-height))}.text-3xl{font-size:var(--text-3xl);line-height:var(--tw-leading,var(--text-3xl--line-height))}.text-lg{font-size:var(--text-lg);line-height:var(--tw-leading,var(--text-lg--line-height))}.text-sm{font-size:var(--text-sm);line-height:var(--tw-leading,var(--text-sm--line-height))}.text-xl{font-size:var(--text-xl);line-height:var(--tw-leading,var(--text-xl--line-height))}.text-xs{font-size:var(--text-xs);line-height:var(--t
w-leading,var(--text-xs--line-height))}.leading-relaxed{--tw-leading:var(--leading-relaxed);line-height:var(--leading-relaxed)}.font-bold{--tw-font-weight:var(--font-weight-bold);font-weight:var(--font-weight-bold)}.font-medium{--tw-font-weight:var(--font-weight-medium);font-weight:var(--font-weight-medium)}.font-normal{--tw-font-weight:var(--font-weight-normal);font-weight:var(--font-weight-normal)}.font-semibold{--tw-font-weight:var(--font-weight-semibold);font-weight:var(--font-weight-semibold)}.tracking-wide{--tw-tracking:var(--tracking-wide);letter-spacing:var(--tracking-wide)}.text-green-700{color:var(--color-green-700)}.text-primary{color:var(--color-primary)}.text-primary-dark{color:var(--color-primary-dark)}.text-purple-700{color:var(--color-purple-700)}.text-red-600{color:var(--color-red-600)}.text-slate-300{color:var(--color-slate-300)}.text-slate-400{color:var(--color-slate-400)}.text-slate-500{color:var(--color-slate-500)}.text-slate-600{color:var(--color-slate-600)}.text-slate-700{color:var(--color-slate-700)}.text-slate-800{color:var(--color-slate-800)}.text-white{color:var(--color-white)}.text-yellow-800{color:var(--color-yellow-800)}.uppercase{text-transform:uppercase}.italic{font-style:italic}.no-underline{text-decoration-line:none}.shadow-xl{--tw-shadow:0 20px 25px -5px var(--tw-shadow-color,#0000001a),0 8px 10px -6px 
var(--tw-shadow-color,#0000001a);box-shadow:var(--tw-inset-shadow),var(--tw-inset-ring-shadow),var(--tw-ring-offset-shadow),var(--tw-ring-shadow),var(--tw-shadow)}.transition{transition-property:color,background-color,border-color,outline-color,text-decoration-color,fill,stroke,--tw-gradient-from,--tw-gradient-via,--tw-gradient-to,opacity,box-shadow,transform,translate,scale,rotate,filter,-webkit-backdrop-filter,backdrop-filter,display,content-visibility,overlay,pointer-events;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.transition-colors{transition-property:color,background-color,border-color,outline-color,text-decoration-color,fill,stroke,--tw-gradient-from,--tw-gradient-via,--tw-gradient-to;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.transition-opacity{transition-property:opacity;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.transition-transform{transition-property:transform,translate,scale,rotate;transition-timing-function:var(--tw-ease,var(--default-transition-timing-function));transition-duration:var(--tw-duration,var(--default-transition-duration))}.duration-300{--tw-duration:.3s;transition-duration:.3s}.ease-in-out{--tw-ease:var(--ease-in-out);transition-timing-function:var(--ease-in-out)}.last\:border-b-0:last-child{border-bottom-style:var(--tw-border-style);border-bottom-width:0}@media 
(hover:hover){.hover\:border-primary:hover{border-color:var(--color-primary)}.hover\:bg-slate-50:hover{background-color:var(--color-slate-50)}.hover\:bg-slate-100:hover{background-color:var(--color-slate-100)}.hover\:text-primary:hover{color:var(--color-primary)}.hover\:text-slate-600:hover{color:var(--color-slate-600)}.hover\:text-slate-900:hover{color:var(--color-slate-900)}.hover\:underline:hover{text-decoration-line:underline}.hover\:opacity-80:hover{opacity:.8}}.focus\:border-transparent:focus{border-color:#0000}.focus\:ring-2:focus{--tw-ring-shadow:var(--tw-ring-inset,)0 0 0 calc(2px + var(--tw-ring-offset-width))var(--tw-ring-color,currentcolor);box-shadow:var(--tw-inset-shadow),var(--tw-inset-ring-shadow),var(--tw-ring-offset-shadow),var(--tw-ring-shadow),var(--tw-shadow)}.focus\:ring-primary:focus{--tw-ring-color:var(--color-primary)}.focus\:outline-none:focus{--tw-outline-style:none;outline-style:none}@media (min-width:40rem){.sm\:inline{display:inline}}@media (min-width:48rem){.md\:ml-56{margin-left:calc(var(--spacing)*56)}.md\:block{display:block}.md\:hidden{display:none}.md\:w-64{width:calc(var(--spacing)*64)}.md\:p-8{padding:calc(var(--spacing)*8)}.md\:px-6{padding-inline:calc(var(--spacing)*6)}}@media (min-width:64rem){.lg\:mr-64{margin-right:calc(var(--spacing)*64)}.lg\:block{display:block}.lg\:p-12{padding:calc(var(--spacing)*12)}}}@property --tw-translate-x{syntax:"*";inherits:false;initial-value:0}@property --tw-translate-y{syntax:"*";inherits:false;initial-value:0}@property --tw-translate-z{syntax:"*";inherits:false;initial-value:0}@property --tw-rotate-x{syntax:"*";inherits:false}@property --tw-rotate-y{syntax:"*";inherits:false}@property --tw-rotate-z{syntax:"*";inherits:false}@property --tw-skew-x{syntax:"*";inherits:false}@property --tw-skew-y{syntax:"*";inherits:false}@property --tw-space-y-reverse{syntax:"*";inherits:false;initial-value:0}@property --tw-border-style{syntax:"*";inherits:false;initial-value:solid}@property 
--tw-leading{syntax:"*";inherits:false}@property --tw-font-weight{syntax:"*";inherits:false}@property --tw-tracking{syntax:"*";inherits:false}@property --tw-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-shadow-color{syntax:"*";inherits:false}@property --tw-shadow-alpha{syntax:"";inherits:false;initial-value:100%}@property --tw-inset-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-inset-shadow-color{syntax:"*";inherits:false}@property --tw-inset-shadow-alpha{syntax:"";inherits:false;initial-value:100%}@property --tw-ring-color{syntax:"*";inherits:false}@property --tw-ring-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-inset-ring-color{syntax:"*";inherits:false}@property --tw-inset-ring-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-ring-inset{syntax:"*";inherits:false}@property --tw-ring-offset-width{syntax:"";inherits:false;initial-value:0}@property --tw-ring-offset-color{syntax:"*";inherits:false;initial-value:#fff}@property --tw-ring-offset-shadow{syntax:"*";inherits:false;initial-value:0 0 #0000}@property --tw-duration{syntax:"*";inherits:false}@property --tw-ease{syntax:"*";inherits:false} \ No newline at end of file diff --git a/tests/test_extractor.py b/tests/test_extractor.py index 3339e66..535b1d8 100644 --- a/tests/test_extractor.py +++ b/tests/test_extractor.py @@ -160,3 +160,113 @@ def test_parse_empty_name(self, tmp_path: Path): symbol_type, name = extractor._parse_symbol_name("") assert symbol_type == "unknown" assert name == "" + + +class TestGroovydocParsing: + """Tests for Groovydoc parsing from source files.""" + + def test_parse_groovydoc_at_param_return(self): + """Parse standard @param and @return tags.""" + from nf_docs.extractor import _parse_groovydoc_comment + + comment = """ + * Align reads to reference genome. 
+ * + * @param meta Map containing sample information + * @param bam Input BAM file + * @return txt Tuple of meta and output text file + * @return bam Tuple of meta and output BAM file + """ + docstring, params = _parse_groovydoc_comment(comment) + assert docstring == "Align reads to reference genome." + assert params["meta"] == "Map containing sample information" + assert params["bam"] == "Input BAM file" + assert params["_return_txt"] == "Tuple of meta and output text file" + assert params["_return_bam"] == "Tuple of meta and output BAM file" + + def test_parse_groovydoc_bullet_format(self): + """Parse Inputs:/Outputs: bullet-list format.""" + from nf_docs.extractor import _parse_groovydoc_comment + + comment = """ + * Detect structural variants. + * + * Inputs: + * - - meta: Map of sample info + * - bam: Input BAM file + * Outputs: + * - - meta: Map of sample info + * - txt: SvPileup breakpoint output + """ + docstring, params = _parse_groovydoc_comment(comment) + assert docstring == "Detect structural variants." + assert params["meta"] == "Map of sample info" + assert params["bam"] == "Input BAM file" + assert params["_return_txt"] == "SvPileup breakpoint output" + + def test_parse_groovydoc_from_source_with_intervening_code(self): + """Groovydoc with code between */ and process declaration.""" + from nf_docs.extractor import _parse_groovydoc_from_source + + source = """\ +/** + * Detect SVs from BAM. 
+ * + * @param meta Sample metadata + * @param bam Input BAM + * @return txt Output text file + */ +nextflow.preview.types = true +process SV_PILEUP { + input: + (meta, bam): Tuple + + output: + txt + bam +} +""" + docstring, params = _parse_groovydoc_from_source(source, "SV_PILEUP") + assert "Detect SVs from BAM" in docstring + assert params["meta"] == "Sample metadata" + assert params["bam"] == "Input BAM" + assert params["_return_txt"] == "Output text file" + + def test_parse_groovydoc_from_source_not_found(self): + """Returns empty when process not found in source.""" + from nf_docs.extractor import _parse_groovydoc_from_source + + docstring, params = _parse_groovydoc_from_source( + "process OTHER { script: '' }\n", "MISSING" + ) + assert docstring == "" + assert params == {} + + def test_find_param_description_simple(self): + """Match a simple input name to param docs.""" + from nf_docs.extractor import _find_param_description + + param_docs = {"reads": "FASTQ input files", "genome": "Reference genome"} + assert _find_param_description("reads", param_docs) == "FASTQ input files" + + def test_find_param_description_tuple(self): + """Match tuple component names to param docs.""" + from nf_docs.extractor import _find_param_description + + param_docs = { + "meta": "Sample metadata map", + "bam": "Input BAM file", + } + desc = _find_param_description("val(meta), path(bam)", param_docs) + assert "meta" in desc + assert "Sample metadata map" in desc + assert "bam" in desc + assert "Input BAM file" in desc + + def test_find_param_description_no_match(self): + """Returns empty when no param docs match.""" + from nf_docs.extractor import _find_param_description + + assert _find_param_description("unknown", {"meta": "desc"}) == "" + assert _find_param_description("val(x)", {"meta": "desc"}) == "" + assert _find_param_description("reads", {}) == "" diff --git a/tests/test_nf_parser.py b/tests/test_nf_parser.py new file mode 100644 index 0000000..f02d47e --- /dev/null +++ 
b/tests/test_nf_parser.py @@ -0,0 +1,644 @@ +"""Tests for the Nextflow code parser.""" + +from nf_docs.nf_parser import ( + ParsedInput, + ParsedOutput, + _parse_named_output, + _parse_single_input, + _parse_single_output, + _typed_to_qualifier, + enrich_outputs_from_source, + parse_process_hover, + parse_workflow_hover, +) + + +class TestParseSingleInput: + """Tests for _parse_single_input.""" + + def test_empty_line(self): + assert _parse_single_input("") is None + assert _parse_single_input(" ") is None + + def test_simple_val(self): + result = _parse_single_input("val(x)") + assert result == ParsedInput(name="x", type="val") + + def test_simple_path(self): + result = _parse_single_input("path(reads)") + assert result == ParsedInput(name="reads", type="path") + + def test_simple_env(self): + result = _parse_single_input("env(MY_VAR)") + assert result == ParsedInput(name="MY_VAR", type="env") + + def test_simple_file(self): + result = _parse_single_input("file(data)") + assert result == ParsedInput(name="data", type="file") + + def test_stdin(self): + result = _parse_single_input("stdin") + assert result == ParsedInput(name="stdin", type="stdin") + + def test_path_with_pattern(self): + result = _parse_single_input("path '*.txt'") + assert result == ParsedInput(name="*.txt", type="path") + + def test_tuple_traditional(self): + result = _parse_single_input("tuple val(meta), path(reads)") + assert result is not None + assert result.type == "tuple" + assert result.name == "val(meta), path(reads)" + + def test_tuple_multiple_paths(self): + result = _parse_single_input("tuple val(meta), path(reads), path(index)") + assert result is not None + assert result.type == "tuple" + assert result.name == "val(meta), path(reads), path(index)" + + # Typed input syntax tests + + def test_typed_tuple_two_elements(self): + """Typed tuple: (meta, bam): Tuple<Map, Path>""" + result = _parse_single_input("(meta, bam): Tuple<Map, Path>") + assert result is not None + assert result.type == "tuple" + assert 
result.name == "val(meta), path(bam)" + assert result.qualifier == "Tuple" + + def test_typed_tuple_three_elements(self): + """Typed tuple: (meta, bam, txt): Tuple<Map, Path, Path>""" + result = _parse_single_input("(meta, bam, txt): Tuple<Map, Path, Path>") + assert result is not None + assert result.type == "tuple" + assert result.name == "val(meta), path(bam), path(txt)" + assert result.qualifier == "Tuple" + + def test_typed_tuple_all_vals(self): + """Typed tuple with all value types.""" + result = _parse_single_input("(name, count): Tuple<String, Integer>") + assert result is not None + assert result.type == "tuple" + assert result.name == "val(name), val(count)" + assert result.qualifier == "Tuple" + + def test_typed_simple_integer(self): + """Simple typed input: x: Integer""" + result = _parse_single_input("x: Integer") + assert result is not None + assert result.type == "val" + assert result.name == "x" + assert result.qualifier == "Integer" + + def test_typed_simple_path(self): + """Simple typed input: bam: Path""" + result = _parse_single_input("bam: Path") + assert result is not None + assert result.type == "path" + assert result.name == "bam" + assert result.qualifier == "Path" + + def test_typed_simple_string(self): + """Simple typed input: name: String""" + result = _parse_single_input("name: String") + assert result is not None + assert result.type == "val" + assert result.name == "name" + assert result.qualifier == "String" + + def test_typed_simple_map(self): + """Simple typed input: meta: Map""" + result = _parse_single_input("meta: Map") + assert result is not None + assert result.type == "val" + assert result.name == "meta" + assert result.qualifier == "Map" + + def test_typed_simple_file(self): + """Simple typed input: data: File""" + result = _parse_single_input("data: File") + assert result is not None + assert result.type == "path" + assert result.name == "data" + assert result.qualifier == "File" + + def test_typed_tuple_with_question_mark_type(self): + """Typed tuple with ? 
(unresolved) type from LSP: (meta, bam): Tuple<?, ?>""" + result = _parse_single_input("(meta, bam): Tuple<?, ?>") + assert result is not None + assert result.type == "tuple" + assert result.name == "val(meta), path(bam)" + assert result.qualifier == "Tuple" + + def test_typed_tuple_three_with_question_mark(self): + """Typed 3-tuple with ? type: (meta, bam, txt): Tuple<?, ?, ?>""" + result = _parse_single_input("(meta, bam, txt): Tuple<?, ?, ?>") + assert result is not None + assert result.type == "tuple" + assert result.name == "val(meta), path(bam), path(txt)" + assert result.qualifier == "Tuple" + + def test_each_qualifier_not_mismatched_as_typed(self): + """Traditional 'each' qualifier must not be parsed as typed input.""" + # 'each val(x)' uses a traditional qualifier, not the 'name: Type' typed form + result = _parse_single_input("each val(x)") + # 'each val(x)' is not currently handled - should return None rather than misparsing + assert result is None + + def test_typed_tuple_mismatched_counts(self): + """Typed tuple with more names than types falls back to val() for extras.""" + result = _parse_single_input("(a, b, c): Tuple<Map, Path>") + assert result is not None + assert result.type == "tuple" + # 'a' -> val(a), 'b' -> path(b), 'c' -> val(c) (fallback) + assert result.name == "val(a), path(b), val(c)" + assert result.qualifier == "Tuple" + + +class TestTypedToQualifier: + """Tests for _typed_to_qualifier.""" + + def test_path_type(self): + assert _typed_to_qualifier("Path") == "path" + + def test_file_type(self): + assert _typed_to_qualifier("File") == "path" + + def test_map_type(self): + assert _typed_to_qualifier("Map") == "val" + + def test_integer_type(self): + assert _typed_to_qualifier("Integer") == "val" + + def test_string_type(self): + assert _typed_to_qualifier("String") == "val" + + def test_boolean_type(self): + assert _typed_to_qualifier("Boolean") == "val" + + +class TestParseSingleOutput: + """Tests for _parse_single_output.""" + + def test_empty_line(self): + assert 
_parse_single_output("") is None + assert _parse_single_output(" ") is None + + def test_simple_path_with_emit(self): + result = _parse_single_output('path "versions.yml", emit: versions') + assert result is not None + assert result.type == "path" + assert result.name == "versions.yml" + assert result.emit == "versions" + + def test_tuple_with_emit(self): + result = _parse_single_output('tuple val(meta), path("*.html"), emit: html') + assert result is not None + assert result.type == "tuple" + assert result.name == 'val(meta), path("*.html")' + assert result.emit == "html" + + def test_stdout(self): + result = _parse_single_output("stdout") + assert result is not None + assert result.type == "stdout" + + def test_optional_output(self): + result = _parse_single_output('path "*.log", emit: log, optional: true') + assert result is not None + assert result.optional is True + + # Named assignment output tests (typed syntax) + + def test_named_tuple_output(self): + """Named assignment: txt = tuple(meta, file("*_svpileup.txt"))""" + result = _parse_single_output('txt = tuple(meta, file("*_svpileup.txt"))') + assert result is not None + assert result.type == "tuple" + assert result.emit == "txt" + assert 'file("*_svpileup.txt")' in result.name + + def test_named_tuple_with_two_files(self): + """Named assignment: bam = tuple(meta, file("*_sorted.bam"))""" + result = _parse_single_output('bam = tuple(meta, file("*_sorted.bam"))') + assert result is not None + assert result.type == "tuple" + assert result.emit == "bam" + + def test_named_simple_file(self): + """Named assignment: report = file("*.html")""" + result = _parse_single_output('report = file("*.html")') + assert result is not None + assert result.type == "file" + assert result.name == "*.html" + assert result.emit == "report" + + def test_named_simple_path(self): + """Named assignment: versions = path("versions.yml")""" + result = _parse_single_output('versions = path("versions.yml")') + assert result is not None + 
assert result.type == "path" + assert result.name == "versions.yml" + assert result.emit == "versions" + + def test_named_val(self): + """Named assignment: count = val(total)""" + result = _parse_single_output("count = val(total)") + assert result is not None + assert result.type == "val" + assert result.name == "total" + assert result.emit == "count" + + def test_bare_emit_name(self): + """Bare identifier output from typed syntax LSP hover (e.g. just 'txt').""" + result = _parse_single_output("txt") + assert result is not None + assert result.name == "txt" + assert result.emit == "txt" + assert result.type == "" + + def test_bare_emit_name_bam(self): + """Bare identifier output: bam.""" + result = _parse_single_output("bam") + assert result is not None + assert result.name == "bam" + assert result.emit == "bam" + + def test_bare_emit_name_bedpe(self): + """Bare identifier output: bedpe.""" + result = _parse_single_output("bedpe") + assert result is not None + assert result.name == "bedpe" + assert result.emit == "bedpe" + + def test_topic_publish_skipped(self): + """Topic publish lines should be skipped.""" + assert _parse_single_output(">> 'sample_outputs'") is None + assert _parse_single_output('>> "my_topic"') is None + + +class TestParseNamedOutput: + """Tests for _parse_named_output.""" + + def test_tuple_with_file(self): + result = _parse_named_output("txt", 'tuple(meta, file("*_svpileup.txt"))') + assert result.type == "tuple" + assert result.emit == "txt" + assert "file" in result.name + + def test_tuple_with_multiple_components(self): + result = _parse_named_output("out", 'tuple(meta, file("*.bam"), file("*.bai"))') + assert result.type == "tuple" + assert result.emit == "out" + + def test_tuple_plain_names(self): + result = _parse_named_output("result", "tuple(meta, data)") + assert result.type == "tuple" + assert result.emit == "result" + assert result.name == "val(meta), val(data)" + + def test_simple_file(self): + result = 
_parse_named_output("report", 'file("report.html")') + assert result.type == "file" + assert result.emit == "report" + assert result.name == "report.html" + + def test_simple_path(self): + result = _parse_named_output("versions", 'path("versions.yml")') + assert result.type == "path" + assert result.emit == "versions" + assert result.name == "versions.yml" + + def test_fallback(self): + result = _parse_named_output("x", "some_expression") + assert result.type == "val" + assert result.emit == "x" + assert result.name == "some_expression" + + +class TestParseProcessHover: + """Tests for parse_process_hover.""" + + def test_traditional_process(self): + """Parse a traditional DSL2 process definition.""" + hover = """```nextflow +process FASTQC { + input: + tuple val(meta), path(reads) + + output: + tuple val(meta), path("*.html"), emit: html + path "versions.yml", emit: versions +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert result.name == "FASTQC" + assert len(result.inputs) == 1 + assert result.inputs[0].type == "tuple" + assert len(result.outputs) == 2 + assert result.outputs[0].emit == "html" + assert result.outputs[1].emit == "versions" + + def test_typed_process(self): + """Parse a typed process definition (Nextflow typed syntax).""" + hover = """```nextflow +process SV_PILEUP { + input: + (meta, bam): Tuple + + output: + txt = tuple(meta, file("*_svpileup.txt")) + bam = tuple(meta, file("*_svpileup.bam")) +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert result.name == "SV_PILEUP" + assert len(result.inputs) == 1 + assert result.inputs[0].type == "tuple" + assert result.inputs[0].name == "val(meta), path(bam)" + assert result.inputs[0].qualifier == "Tuple" + assert len(result.outputs) == 2 + assert result.outputs[0].emit == "txt" + assert result.outputs[0].type == "tuple" + assert result.outputs[1].emit == "bam" + assert result.outputs[1].type == "tuple" + + def 
test_typed_process_three_inputs(self): + """Parse typed process with 3-element tuple input.""" + hover = """```nextflow +process AGGREGATE { + input: + (meta, bam, txt): Tuple + + output: + txt = tuple(meta, file("*_aggregate.txt")) +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert result.name == "AGGREGATE" + assert len(result.inputs) == 1 + assert result.inputs[0].name == "val(meta), path(bam), path(txt)" + assert len(result.outputs) == 1 + assert result.outputs[0].emit == "txt" + + def test_typed_process_with_topic_section(self): + """Topic section should not interfere with output parsing.""" + hover = """```nextflow +process MY_PROC { + input: + (meta, bam): Tuple + + output: + txt = tuple(meta, file("*.txt")) + + topic: + >> 'sample_outputs' +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert len(result.inputs) == 1 + assert len(result.outputs) == 1 + assert result.outputs[0].emit == "txt" + + def test_mixed_typed_and_traditional_outputs(self): + """Process with both named assignment and traditional output syntax.""" + hover = """```nextflow +process MIXED { + input: + val(x) + + output: + txt = file("output.txt") + path "versions.yml", emit: versions +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert len(result.outputs) == 2 + assert result.outputs[0].emit == "txt" + assert result.outputs[0].type == "file" + assert result.outputs[1].emit == "versions" + + def test_typed_simple_inputs(self): + """Process with simple typed inputs.""" + hover = """```nextflow +process SIMPLE_TYPED { + input: + x: Integer + bam: Path + + output: + result = val(x) +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert len(result.inputs) == 2 + assert result.inputs[0].name == "x" + assert result.inputs[0].type == "val" + assert result.inputs[0].qualifier == "Integer" + assert result.inputs[1].name == "bam" + assert result.inputs[1].type == "path" + 
assert result.inputs[1].qualifier == "Path" + + def test_no_code_block(self): + """Handle hover text without code fences.""" + result = parse_process_hover("just some text") + assert result is None + + def test_empty_process(self): + """Process with no inputs or outputs.""" + hover = """```nextflow +process EMPTY { + script: + echo "hello" +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert result.name == "EMPTY" + assert len(result.inputs) == 0 + assert len(result.outputs) == 0 + + def test_topic_in_output_block_skipped(self): + """Topic publish lines within output block should be skipped.""" + hover = """```nextflow +process WITH_TOPIC { + output: + txt = file("out.txt") + >> 'my_topic' +} +```""" + result = parse_process_hover(hover) + assert result is not None + # Only the file output, not the topic line + assert len(result.outputs) == 1 + assert result.outputs[0].emit == "txt" + + def test_typed_process_with_bare_outputs(self): + """Real LSP hover for a typed process with bare emit names as outputs.""" + hover = """```nextflow +process SV_PILEUP { + input: + (meta, bam): Tuple + + output: + txt + bam +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert result.name == "SV_PILEUP" + # Input: typed tuple with ? 
type + assert len(result.inputs) == 1 + assert result.inputs[0].type == "tuple" + assert result.inputs[0].name == "val(meta), path(bam)" + assert result.inputs[0].qualifier == "Tuple" + # Outputs: bare emit names + assert len(result.outputs) == 2 + assert result.outputs[0].name == "txt" + assert result.outputs[0].emit == "txt" + assert result.outputs[1].name == "bam" + assert result.outputs[1].emit == "bam" + + def test_typed_process_three_inputs_bare_output(self): + """Real LSP hover for typed process with 3-element tuple and bare output.""" + hover = """```nextflow +process AGGREGATE_SV_PILEUP { + input: + (meta, bam, txt): Tuple + + output: + txt +} +```""" + result = parse_process_hover(hover) + assert result is not None + assert len(result.inputs) == 1 + assert result.inputs[0].name == "val(meta), path(bam), path(txt)" + assert len(result.outputs) == 1 + assert result.outputs[0].emit == "txt" + + +class TestParseWorkflowHover: + """Tests for parse_workflow_hover.""" + + def test_workflow_with_takes_and_emits(self): + hover = """```nextflow +workflow MAIN { + take: + reads + genome + + emit: + bam = aligned + vcf = variants +} +```""" + result = parse_workflow_hover(hover) + assert result is not None + assert result.name == "MAIN" + assert result.takes == ["reads", "genome"] + assert result.emits == ["bam", "vcf"] + + def test_empty_workflow(self): + hover = """```nextflow +workflow EMPTY { +} +```""" + result = parse_workflow_hover(hover) + assert result is not None + assert result.name == "EMPTY" + assert result.takes == [] + assert result.emits == [] + + +class TestEnrichOutputsFromSource: + """Tests for enriching bare outputs with data from .nf source files.""" + + SV_PILEUP_SOURCE = """\ +process SV_PILEUP { + container "community.wave.seqera.io/library/fgsv:0.2.1" + + input: + (meta, bam): Tuple + + output: + txt = tuple(meta, file("*_svpileup.txt")) + bam = tuple(meta, file("*_svpileup.bam")) + + topic: + tuple(meta, file("*_svpileup.txt")) >> 
'sample_outputs' + + script: + def prefix = "${meta.id}" + \""" + fgsv SvPileup --input ${bam} --output ${prefix}_svpileup + \""" +} +""" + + def test_bare_outputs_enriched(self): + """Bare emit names are replaced with full declarations from source.""" + bare_outputs = [ + ParsedOutput(name="txt", type="", emit="txt"), + ParsedOutput(name="bam", type="", emit="bam"), + ] + result = enrich_outputs_from_source(bare_outputs, self.SV_PILEUP_SOURCE, "SV_PILEUP") + + assert len(result) == 2 + assert result[0].emit == "txt" + assert result[0].type == "tuple" + assert "file(" in result[0].name + assert result[1].emit == "bam" + assert result[1].type == "tuple" + assert "file(" in result[1].name + + def test_already_qualified_outputs_not_changed(self): + """Outputs that already have a type are left untouched.""" + qualified_outputs = [ + ParsedOutput(name="val(meta), path(*.html)", type="tuple", emit="html"), + ] + result = enrich_outputs_from_source(qualified_outputs, self.SV_PILEUP_SOURCE, "SV_PILEUP") + + assert len(result) == 1 + assert result[0] is qualified_outputs[0] # same object, unchanged + + def test_process_not_found_returns_originals(self): + """If the process name isn't in the source, return outputs unchanged.""" + bare_outputs = [ParsedOutput(name="txt", type="", emit="txt")] + result = enrich_outputs_from_source( + bare_outputs, "process OTHER_PROC {\n output:\n x = val(1)\n}\n", "SV_PILEUP" + ) + assert result is bare_outputs + + def test_partial_match_enriches_only_matching(self): + """Only bare outputs whose emit matches a source declaration are enriched.""" + outputs = [ + ParsedOutput(name="txt", type="", emit="txt"), + ParsedOutput(name="unknown", type="", emit="unknown"), + ] + result = enrich_outputs_from_source(outputs, self.SV_PILEUP_SOURCE, "SV_PILEUP") + + assert result[0].type == "tuple" # enriched + assert result[0].emit == "txt" + assert result[1].type == "" # not found in source, unchanged + assert result[1].emit == "unknown" + + def 
test_simple_named_output(self): + """Named simple output: versions = path("versions.yml").""" + source = ( + 'process SIMPLE {\n output:\n versions = path("versions.yml")\n\n script:\n ""\n}\n' + ) + + bare_outputs = [ParsedOutput(name="versions", type="", emit="versions")] + result = enrich_outputs_from_source(bare_outputs, source, "SIMPLE") + + assert result[0].emit == "versions" + assert result[0].type == "path" + assert result[0].name == "versions.yml"