Description
Currently, as far as I can tell, relative paths in the imports field of a PEP schema are resolved relative to the current working directory, because peppy calls os.path.abspath on them (https://github.com/pepkit/peppy/blob/60523c51cc7243680797f681a9d49a1ecec3c6ca/peppy/utils.py#L154). However, as a developer using PEPkit, I would expect relative paths in the imports field to be resolved relative to the location of the PEP schema that imports the upstream schemas (the "importing schema").
Using the current working directory for relative path resolution is a problem when the (pipeline) developer writing the PEP schema has no control over where the working directory is in relation to the importing schema, as is the case for Snakemake PEP schemas. Here, the working directory is chosen by the user running the Snakemake pipeline (by specifying the output dir), while the pipeline developer might like to specify a common base schema for all their workflows (e.g. via a git submodule located relative to the pipeline's schema). In this case, Eido cannot correctly resolve the path to the imported base schema in the pipeline's PEP schema, except in the special case that the output directory specified by the pipeline user happens to be the same as the location of the pipeline's PEP schema, which is likely not the case most of the time. Therefore, relative paths currently cannot be used in the imports field of a PEP schema without also restricting the working directory at the time the schema is read.
As a solution, I would propose that when a schema imports another schema and the imports field contains a relative path, that path is resolved against the directory containing the importing schema, provided the importing schema is identifiable by a path. Otherwise, the current working directory may still be used, if applicable. All of this should be possible to implement in the read_schema function.
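The proposed rule could be sketched roughly as follows. This is only an illustration of the idea, not the actual peppy or eido code: resolve_import and its importing_schema parameter are hypothetical names, and how read_schema would pass the importing schema's path along is left open.

```python
import os
from urllib.parse import urlparse


def resolve_import(import_path, importing_schema=None):
    """Hypothetical helper (not part of peppy or eido).

    Resolve one entry of a schema's imports field:
    - URLs and absolute paths are returned unchanged,
    - relative paths are resolved against the directory of the importing
      schema when that schema is known by a local path,
    - otherwise the current working directory is used as a fallback.
    """
    if urlparse(import_path).scheme in ("http", "https"):
        return import_path
    if os.path.isabs(import_path):
        return import_path

    schema_is_local = (
        importing_schema is not None
        and urlparse(importing_schema).scheme not in ("http", "https")
    )
    if schema_is_local:
        base_dir = os.path.dirname(os.path.abspath(importing_schema))
    else:
        base_dir = os.getcwd()

    # normpath collapses the "../.." segments textually.
    return os.path.normpath(os.path.join(base_dir, import_path))
```

With a helper like this, the result no longer depends on where the user runs the pipeline, as long as the importing schema's own path is known.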
Example
Project structure:
- project_root
  - workflow
    - Snakefile
    - schemas
      - pep_validation.yaml
  - common
    - schemas
      - common_pep_validation.yaml
Contents of workflow/schemas/pep_validation.yaml:
[...]
imports:
- "../../common/schemas/common_pep_validation.yaml"
[...]
Contents of workflow/Snakefile:
[...]
pepschema: "schemas/pep_validation.yaml"
[...]
This will lead to an error that Eido cannot find [working dir]/../../common/schemas/common_pep_validation.yaml whenever the working directory is not nested two levels below project_root (i.e., at the same depth as workflow/schemas). In this example, the working directory is set via snakemake --directory working_dir, i.e., it is the Snakemake pipeline's output directory.
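The dependence on the working directory can be reproduced without Snakemake or Eido. The following sketch rebuilds the layout above in a temporary directory and checks whether os.path.abspath finds the imported schema from two different working directories. The found_from helper is a hypothetical name, and workflow/test_output is an assumed output directory, taken from the attached demo.

```python
import os
import tempfile

IMPORT = "../../common/schemas/common_pep_validation.yaml"


def found_from(working_dir, import_path=IMPORT):
    """Return True if import_path exists when os.path.abspath is
    called on it while working_dir is the current working directory."""
    previous = os.getcwd()
    os.chdir(working_dir)
    try:
        return os.path.exists(os.path.abspath(import_path))
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as root:
    # Recreate the example project structure.
    os.makedirs(os.path.join(root, "workflow", "schemas"))
    os.makedirs(os.path.join(root, "workflow", "test_output"))
    os.makedirs(os.path.join(root, "common", "schemas"))
    common = os.path.join(root, "common", "schemas", "common_pep_validation.yaml")
    open(common, "w").close()

    # Working dir = project root: the import points outside the project.
    from_root = found_from(root)
    # Working dir nested two levels below the root: the import happens to resolve.
    from_output = found_from(os.path.join(root, "workflow", "test_output"))

print("found from project root:", from_root)
print("found from workflow/test_output:", from_output)
```

Only the second working directory makes the import resolve, even though the schema files themselves never moved.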
For interactive use, I have fleshed out the example and attached it:
import_relative_path_demo.tar.gz
Set up the env via conda with the following command:
conda env create --prefix ./test_env --file environment.yaml
and run
conda run --prefix test_env/ snakemake -n --config pep_config=./pep/config.yaml
It should complain about not finding the common config file, while
conda run -p test_env/ snakemake -n --config pep_config=../../pep/config.yaml --directory workflow/test_output/
should succeed on account of the output dir being nested in the same way as the pipeline-specific schema.