Laurens Borst edited this page Jun 3, 2026 · 7 revisions

Welcome to the analysis-service wiki!

Summary

The analysis service is a microservice within the ELSI project. ELSI is a website (really a Software as a Service, or SaaS) that takes over much of the work we have been doing in the LAAC team:

  • Create reproducible datasets with the help of DataLad and ChildProject
  • Run data pipelines including (1) ML models (2) ChildProject-native data conversions (3) ChildProject-native data transformations

ELSI was made in collaboration with AtolCD. Naturally, they made the frontend and backend of the web application, and other kinds of microservices that use it. Part of their work has also been to manage the datasets, which includes the use of ChildProject and DataLad for version control, data provenance tracking, and data transformations.

The core subdomain of this ecosystem, and the part implemented by us, is the analysis service. It is the service that actually runs the long-running jobs, which at the time of writing include:

  • The voice type classifier (VTC) ML model
  • The updated voice type classifier 2 (VTC2.0) ML model
  • The automatic linguistic unit count estimator (ALICE) ML model
  • The rule-based ChildProject acoustics pipeline
  • The Wav2Vec-based, fine-tuned speech maturity ("W2V2") ML model

Integration of the Analysis Service with ELSI

Because ELSI's services and the analysis service run on the same machine, they share a filesystem. This is crucial: the analysis service drops its outputs into a dedicated folder at $ECHOLALIA_FOLDER/outputs/, and ELSI retrieves them from there and puts them into the corresponding dataset at $DATASETS_FOLDER/{dataset_uid}/outputs/{task_uid}. (A traditional network filesystem or mount point, or object storage such as an S3 bucket, could of course also be used to sync these files.)
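
To make the folder layout concrete, here is a minimal sketch of the hand-over between the two folders. It is not ELSI's actual code: the function names are hypothetical, and it assumes that each task's outputs sit in a per-task subfolder of the staging area (outputs/{task_uid}), which this page does not state.

```python
import shutil
from pathlib import Path


def task_output_dir(datasets_root: Path, dataset_uid: str, task_uid: str) -> Path:
    """Build {datasets_root}/{dataset_uid}/outputs/{task_uid}."""
    return Path(datasets_root) / dataset_uid / "outputs" / task_uid


def collect_outputs(
    staging_root: Path, datasets_root: Path, dataset_uid: str, task_uid: str
) -> Path:
    """Copy one task's outputs from the staging folder into its dataset.

    ASSUMPTION: outputs are staged under {staging_root}/outputs/{task_uid}.
    """
    source = Path(staging_root) / "outputs" / task_uid
    target = task_output_dir(datasets_root, dataset_uid, task_uid)
    # Make sure {dataset_uid}/outputs/ exists before copying into it.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Here staging_root would be the value of $ECHOLALIA_FOLDER and datasets_root the value of $DATASETS_FOLDER. Passing them in explicitly, rather than reading the environment inside the function, keeps the sketch easy to test.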

Finally, ELSI manages the task domain as well. A task in this context represents a model pass or other data transformation that needs to be run by the analysis service. The analysis service can update task objects in this domain through a RESTful web API.
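
As an illustration of such an update, the sketch below builds (but does not send) a POST request using only the standard library. The route /tasks/{task_uid} and the JSON body {"status": ...} are assumptions made for the example; the real ELSI endpoint and payload are not documented on this page.

```python
import json
import urllib.request


def build_task_update(base_url: str, task_uid: str, status: str) -> urllib.request.Request:
    """Build a POST request that reports a new status for one task.

    ASSUMPTION: both the route and the payload shape are hypothetical.
    """
    payload = json.dumps({"status": status}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tasks/{task_uid}",  # hypothetical route
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send the request:
#   urllib.request.urlopen(build_task_update(...), timeout=10)
```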

A typical event storm would be this:

  1. The analysis service daemon requests all tasks. It finds a pending task for VTC and sends a RunTask command over the Redis VTC queue. It simultaneously updates the task through a POST request to the ELSI endpoint.
  2. A VTC worker subscribed to the VTC queue pops the task and starts running.
  3. The VTC worker completes the task, putting its outputs in the $DATASETS_DIR/{dataset_uid}/outputs/{task_uid}/ folder. Once finished, it sends a CompleteTask command over a Redis queue.
  4. The daemon, listening in on the completion channel, pops and handles the command, sending a status update to ELSI via a POST request.
  5. ELSI receives and handles this request, knowing the outputs will be at $DATASETS_DIR/{dataset_uid}/outputs/{task_uid}/.
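
The steps above can be walked through in a small in-memory simulation. This is only a sketch of the message flow: queue.Queue stands in for the Redis queues, FakeElsi stands in for the ELSI web API, and the status strings other than "pending" as well as the FailTask command name are hypothetical.

```python
import queue
from dataclasses import dataclass


@dataclass
class Task:
    uid: str
    model: str
    status: str = "pending"


class FakeElsi:
    """In-memory stand-in for the ELSI task API (not the real service)."""

    def __init__(self, tasks):
        self.tasks = {task.uid: task for task in tasks}

    def list_tasks(self):
        return list(self.tasks.values())

    def update_status(self, task_uid, status):
        # Stands in for the POST request to ELSI.
        self.tasks[task_uid].status = status


def daemon_dispatch(elsi, model_queues):
    """Step 1: enqueue a RunTask command for every pending task."""
    for task in elsi.list_tasks():
        if task.status == "pending" and task.model in model_queues:
            model_queues[task.model].put({"command": "RunTask", "task_uid": task.uid})
            elsi.update_status(task.uid, "running")


def worker_step(model_queue, completion_queue, run_model):
    """Steps 2 and 3: pop one command, run the model, report the outcome."""
    command = model_queue.get_nowait()
    try:
        run_model(command["task_uid"])
        outcome = "CompleteTask"
    except Exception:
        outcome = "FailTask"  # hypothetical name for the failure command
    completion_queue.put({"command": outcome, "task_uid": command["task_uid"]})


def daemon_handle_completions(elsi, completion_queue):
    """Step 4: drain the completion queue and forward statuses to ELSI."""
    while True:
        try:
            command = completion_queue.get_nowait()
        except queue.Empty:
            break
        status = "completed" if command["command"] == "CompleteTask" else "failed"
        elsi.update_status(command["task_uid"], status)
```

Calling daemon_dispatch, worker_step and daemon_handle_completions in that order takes a task from "pending" through "running" to "completed" (or "failed" if run_model raises). In the real system these run in separate processes and communicate through Redis.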

The Stack

  • Analysis Daemon: a Python-based daemon that runs continuously, periodically requesting tasks from ELSI via the web API. At the same time, it listens to the Redis queue on another thread and forwards task completion and failure statuses to ELSI via the web API.
  • Redis: an easy-to-use, in-memory data store that can serve as a key-value store, a pub/sub broker, and a message queue.
  • Analysis-service-core: a Python package found here that has core utilities useful for quickly integrating your model into this ecosystem.
  • Worker: any worker, such as the VTC-worker, which runs in its own Kubernetes pod. Workers are long-running Python processes that continuously listen on the Redis queue. Their codebases are quite standardised, relying heavily on the framework laid out by the core package mentioned earlier.
  • Docker: used to containerize applications and services, and to deploy and test a multi-container application (i.e., the analysis service).
  • Docker Swarm: a lightweight container orchestrator that handles resource allocation and pod failure.
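
The daemon in the list above does two things concurrently: it polls for tasks on a timer and listens for completions on another thread. A minimal sketch of that structure follows. It is not the daemon's actual code: queue.Queue again stands in for the Redis queue, and the function names, the callables passed in, and the None stop sentinel are all hypothetical.

```python
import queue
import threading


def listen_for_completions(completion_queue, forward):
    """Listener thread: block on the queue and forward each message.

    A None message is used here as a stop sentinel (an assumption of
    this sketch, not something the real daemon is known to do).
    """
    while True:
        message = completion_queue.get()  # blocks until a message arrives
        if message is None:
            break
        forward(message)


def poll_tasks(fetch_pending, dispatch, stop_event, interval_s=5.0):
    """Polling loop: fetch pending tasks, dispatch them, then sleep."""
    while not stop_event.is_set():
        for task in fetch_pending():
            dispatch(task)
        # wait() returns early if stop_event is set, so shutdown is prompt.
        stop_event.wait(interval_s)
```

In this sketch the listener would be started with threading.Thread(target=listen_for_completions, args=(completion_queue, forward)), while poll_tasks runs on the main thread until stop_event is set.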

And for development...

  • Echolalia (ELSI) mock server: a Flask-based web server that mocks the real ELSI web server. It lets you use a mock database in E2E tests.
  • Tests: a separate test service for running E2E tests of the whole ecosystem. At present they only run simple dummy models (like a word count model) to see that the event storm, outlined earlier, is working correctly. Model-specific E2E tests can be found in the model folders and are also used in continuous integration.
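
To give an idea of how small such a dummy model can be, here is a sketch of a word count model. It is purely illustrative and not the one used in the test service; the function name and the word_count.txt output file are hypothetical.

```python
from pathlib import Path


def word_count_model(input_path: Path, output_dir: Path) -> Path:
    """Count whitespace-separated words in a text file and write the total.

    The output file name is a hypothetical choice for this sketch.
    """
    text = Path(input_path).read_text(encoding="utf-8")
    count = len(text.split())
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    result = output_dir / "word_count.txt"
    result.write_text(f"{count}\n", encoding="utf-8")
    return result
```

A model like this finishes almost instantly and has a deterministic output, so an E2E test can focus on whether the commands and status updates flow correctly rather than on model behaviour.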

We also have a lightweight, standard observability stack:

  • Promtail as the log collector, which ships logs to Loki.
  • Loki as the log storage and querying engine.
  • Grafana, which provides an interactive dashboard and queries Loki for data.
  • Netdata for system and container metrics, such as CPU and GPU usage.
