Discussion: add HTTP API for `looper`

This issue is coauthored by @zz1874.

`looper` is a CLI tool that often runs on the front node of an HPC cluster, so that jobs can be submitted to Slurm / SGE / other job schedulers.
@nsheff expressed a desire for an HTTP API that wraps `looper`. That would allow him and other users to run `looper` on the front node and send HTTP requests to the API from a different machine through a reverse SSH tunnel.
The advantages would be:
- use of `looper` functionality from any machine without manually copying code to the front node,
- potential for a graphical user interface (GUI) that builds upon that API.

An earlier attempt at this was `caravel` (https://github.com/pepkit/caravel). @nsheff tells us that there were issues, possibly due to the synchronous nature of the Flask framework. `caravel` seems to be a Python 2.7 code base that relies on `setuptools`' `use_2to3` to convert itself to Python 3 code on the fly during installation. By now, this makes `caravel` hard to run, for reasons such as: `setuptools` no longer supports `use_2to3`, the Docker image can no longer be built, Debian index URLs are out of date, Python 3.6-specific typing imports are used, ...

After browsing the `looper` and `caravel` code, we identified the following possibilities:
1. Revive `caravel`, i.e. bring it up to date with recent Python versions and make it compatible with recent `looper` versions,
2. Write a new HTTP API from scratch, which leaves us at least three possibilities:
   1. Figure out a way of automatically creating both the CLI and the HTTP API from a single definition of commands / options.
      This would likely be the most sustainable idea, as it avoids having to keep the CLI and the HTTP API in sync when commands / options are added / removed in the future. But it would possibly be a larger undertaking, with the risk of being only partially finished in the limited time we can work on it. It would also possibly make a nice separate, reusable library!
   2. Implement only the most important top-level commands and their options as HTTP API endpoints, but design them so that the approach is easily transferable to other commands, and document the development process. That way, a subset of the `looper` commands / options could likely be made available via the HTTP API in the little development time we have. But this also means an increased maintenance burden: if a new CLI command / option is added, the HTTP API and its documentation have to be adapted accordingly.
   3. Implement only top-level commands and allow setting of flags / options only via a project configuration file that is `POST`ed to the API. This would be the easiest and quickest solution, but it limits the use cases of the API. A similarly easy and inflexible approach would be `POST`ing a string of command-line arguments that is then parsed by `looper`'s existing `argparse` argument parser.
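
To make option 2.1 more concrete, here is a minimal sketch of how one table of commands / options could drive both an `argparse` CLI and the validation of a JSON request body. Everything in it is an assumption for illustration: the `Option` class, the `COMMANDS` table, the option names, and the helper functions are hypothetical and do not reflect `looper`'s actual code or its real set of options. It uses only the standard library and leaves out the HTTP server itself.

```python
import argparse
import shlex
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Option:
    """One option, defined once and used for both the CLI and the HTTP API."""
    name: str            # "dry_run" becomes CLI flag "--dry-run" and JSON key "dry_run"
    kind: type           # expected Python type of the value
    default: Any = None
    help: str = ""


# Hypothetical command table; names are illustrative, not looper's real options.
COMMANDS = {
    "run": [
        Option("config", str, help="path to the project configuration file"),
        Option("limit", int, help="maximum number of samples to process"),
        Option("dry_run", bool, default=False, help="do not submit any jobs"),
    ],
    "check": [Option("config", str, help="path to the project configuration file")],
}


def build_cli(commands: dict) -> argparse.ArgumentParser:
    """Derive an argparse parser with one subcommand per entry in the table."""
    parser = argparse.ArgumentParser(prog="looper-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for command, options in commands.items():
        sub = subparsers.add_parser(command)
        for opt in options:
            flag = "--" + opt.name.replace("_", "-")
            if opt.kind is bool:
                sub.add_argument(flag, dest=opt.name, action="store_true",
                                 default=opt.default, help=opt.help)
            else:
                sub.add_argument(flag, dest=opt.name, type=opt.kind,
                                 default=opt.default, help=opt.help)
    return parser


def validate_payload(commands: dict, command: str, payload: dict) -> dict:
    """Check a decoded JSON request body against the same table."""
    if command not in commands:
        raise ValueError(f"unknown command: {command}")
    known = {opt.name: opt for opt in commands[command]}
    unknown = sorted(set(payload) - set(known))
    if unknown:
        raise ValueError(f"unknown options for {command}: {unknown}")
    result = {"command": command}
    for name, opt in known.items():
        value = payload.get(name, opt.default)
        # Strict type check, so that e.g. True is not accepted as an int.
        if value is not None and type(value) is not opt.kind:
            raise TypeError(f"{name} must be of type {opt.kind.__name__}")
        result[name] = value
    return result


# The same request, once as CLI arguments and once as a JSON-like payload.
parser = build_cli(COMMANDS)
from_cli = vars(parser.parse_args(["run", "--config", "project.yaml", "--limit", "2"]))
from_http = validate_payload(COMMANDS, "run", {"config": "project.yaml", "limit": 2})

# Variant of option 2.3: a POSTed command-line string, split and parsed as-is.
from_string = vars(parser.parse_args(shlex.split("run --config project.yaml --dry-run")))
```

In this sketch, `from_cli` and `from_http` end up as the same dictionary, which is the property that would keep the two interfaces in sync. Note that `argparse` exits the process on invalid input, so the string-based variant would need extra error handling before it could run inside a server.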

Important questions that would need to be answered:
- [x] Which version of `looper` should we develop against? `looper` is currently at v1.5.1, but there is a PR open for v1.6.0, and in fact we could only get the `hello_looper` example working with the future v1.6.0 of `looper`. A similar question holds for `pipestat`, if required for development of the HTTP API. The answer is: v1.6.0 for `looper` and v0.6.0 for `pipestat`, as both new versions have now been released.
- [x] What were the exact issues you faced with `caravel`? Knowing them would help us make a more informed decision whether to possibly revive `caravel` or to redevelop from scratch, avoiding mistakes made in `caravel`. Answer: https://github.com/pepkit/looper/issues/433#issuecomment-1877218543
- [x] If we were to decide to implement only a subset of the top-level `looper` commands: which commands have the highest priority and should thus be implemented first as HTTP API calls? Answer: `looper run`, `looper runp`, `looper check`, `looper report` (https://github.com/pepkit/looper/issues/433#issuecomment-1877218543)

And finally, of course:
- [x] Which of options 1, 2.1, 2.2, or 2.3 should we pursue? We should discuss this question together with @nsheff and add the answer in a comment. Answer: in a call with @nsheff, we decided to go with 2.1. Details in a comment below.
