About WorkflowHub

The practice of performing computational processes using workflows has taken hold in the life sciences. Like data, workflows should be FAIR, citable, have managed metadata profiles and be openly available for review and analytics.

WorkflowHub is a new FAIR workflow registry sponsored by the European RI Cluster EOSC-Life and the European Research Infrastructure ELIXIR. It is agnostic to the workflow management system used: workflows may remain in their native repositories in their native forms.

As workflows are multi-component objects, including example and test data, they are packaged, registered, downloaded and exchanged as workflow-centric Research Objects using the RO-Crate specification, making the Hub an implementation of FAIR Digital Object principles.
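To make the packaging idea concrete, here is a minimal sketch of the kind of `ro-crate-metadata.json` file that describes a workflow-centric RO-Crate. It assumes the RO-Crate 1.1 JSON-LD layout (a metadata descriptor, a root dataset, and a main workflow entity). The function name `build_workflow_crate`, the file names, and the workflow title are hypothetical, and the metadata WorkflowHub itself generates may contain considerably more detail.

```python
import json


def build_workflow_crate(workflow_path, name, extra_files=()):
    """Return a dict shaped like a minimal ro-crate-metadata.json.

    Illustrative only: the paths and names passed in are hypothetical,
    and a real crate would usually carry authors, licence, etc.
    """
    # The workflow file itself, typed as a computational workflow.
    workflow = {
        "@id": workflow_path,
        "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
        "name": name,
    }
    # Other components of the multi-component object (test data, docs, ...).
    parts = [workflow] + [{"@id": p, "@type": "File"} for p in extra_files]

    # The root dataset points at the workflow as its main entity.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "mainEntity": {"@id": workflow_path},
        "hasPart": [{"@id": p["@id"]} for p in parts],
    }
    # The metadata descriptor ties the file to the RO-Crate specification.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root] + parts,
    }


# Hypothetical example: one CWL workflow plus one test input file.
crate = build_workflow_crate(
    "workflow.cwl", "Example workflow", ["test/input.txt"]
)
print(json.dumps(crate, indent=2))
```

The point of the sketch is that the workflow and its companion files travel together under a single root dataset, which is what lets the whole package be registered and exchanged as one object.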

A schema.org-based Bioschemas profile describes the metadata about a workflow, and use of the Common Workflow Language is encouraged because it provides a canonical description of the workflow itself. Popular workflow management systems such as Galaxy, Nextflow, and Snakemake are working with the Hub to seamlessly and automatically support object packaging, registration and exchange.

WorkflowHub provides features such as community spaces, collections, versioning and snapshots, and contributor credit. In addition to its own APIs, WorkflowHub supports community registry standards and services such as GA4GH TRS and ELIXIR-AAI authentication, and current work integrates with the LifeMonitor workflow testing service.
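As an illustration of what GA4GH TRS support means for a client, the sketch below builds request URLs following the path layout of the GA4GH Tool Registry Service v2 API (`/tools/{id}` and `/tools/{id}/versions/{version_id}/{type}/descriptor`). The base URL, the helper names, and the example identifiers are assumptions made for illustration; check the registry's own API documentation for the actual endpoint and identifier format.

```python
from urllib.parse import quote

# Assumed base URL for the registry's TRS endpoint (not confirmed here).
TRS_BASE = "https://workflowhub.eu/ga4gh/trs/v2"


def trs_tool_url(tool_id, base=TRS_BASE):
    """URL of a single tool (workflow) record in a TRS v2 registry."""
    # Percent-encode the identifier so characters like '/' or '#' are safe.
    return f"{base}/tools/{quote(str(tool_id), safe='')}"


def trs_descriptor_url(tool_id, version_id, descriptor_type, base=TRS_BASE):
    """URL of the primary descriptor for one version of a tool.

    descriptor_type is a TRS descriptor type such as "CWL" or "NFL".
    """
    version = quote(str(version_id), safe="")
    return (
        f"{trs_tool_url(tool_id, base)}"
        f"/versions/{version}/{descriptor_type}/descriptor"
    )


# Hypothetical identifiers: workflow "42", version "1".
print(trs_tool_url("42"))
print(trs_descriptor_url("42", "1", "CWL"))
```

Because the path layout is defined by the TRS standard rather than by a single registry, the same helpers could be pointed at another TRS-compliant registry by passing a different `base`.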

The WorkflowHub Club open community works together to continuously co-develop the Hub. Released in beta in September 2020, the Hub now holds nearly 100 workflows, including 36 curated COVID-19 workflows. It is a listed resource of the European COVID-19 Data Portal.

Development

Created as part of the EOSC-Life WP2 Tools Collaboratory, WorkflowHub is under development.

See a complete list on the acknowledgement page.

Aims of the project include:

  • An evolution of myExperiment that is workflow system agnostic and supports a repository of workflows in native and standardised form (e.g. CWL), together with the virtual aggregation of established tool and workflow registries to support discovery over a fragmented ecosystem. The federated registry would support a common API to simplify access for tool developers.
  • Standardised workflow identifiers and metadata descriptions needed for workflow discovery, reuse, preservation, interoperability and monitoring, with metadata harvesting using standard protocols. Workflows are usually multi-component (requiring links to test data, example runs, explanatory documentation, etc.) and are used in collections for scientific use cases. We plan to use the Research Object specification for packaging workflows, which has already been combined with CWL and is part of the BioCompute Object specification.
  • Workflow snapshot preservation, publishing, citation and monitoring, and credit claiming, making workflows part of the scholarly communication landscape by partnering with platforms like DataCite and EOSC’s OpenAIRE and their Research Community Dashboards, which link publications with workflows, associated datasets, software, etc.
  • A workflow registry based on the SEEK platform, using Common Workflow Language and Research Objects to glue in federated workflow and tool descriptions across the research infrastructures.

Mission Statement

WorkflowHub has a sustainability plan that ensures the availability of its contributions and metadata up to and beyond 2026. If and when it reaches its end of service, the published contributions and metadata will be archived as RO-Crates and made available through a public repository, such as Zenodo, Figshare or another appropriate resource at that time.