Experiment with improving the parallelization of the layout optimizer  #12

@LTLA

Description

It may be possible to improve the parallelization of optimize_layout without changing the results relative to the single-threaded case. The idea is to precompute a "graph" of dependencies between observations and spin off threads whenever we encounter a run of observations that do not depend on each other.

  1. Iterate across observations within each epoch as usual. At each observation, find its neighbors and hit the RNG to draw its negative samples. The combined set of neighbors and negative samples defines that observation's "dependencies".
  2. Check whether any of these dependencies overlap with observations already queued on other threads.
    • If they overlap with exactly one thread's queue, add the observation to that thread's task queue.
    • If they overlap with multiple threads' queues, close the task queues for all threads and put this observation on hold.
    • If there is no overlap, add the observation to the task queue of the thread with the shortest queue.
  3. Once all task queues are closed, execute the queued work by spinning up one thread per non-empty task queue.
  4. Restart this process with the on-hold observation, or finish if no observation was held.
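The queue-assignment step above could be sketched as follows. This is a minimal illustration, not the actual implementation: the `Scheduler` type, its members, and the integer observation/dependency IDs are all hypothetical.

```cpp
#include <vector>
#include <unordered_map>

// Hypothetical sketch of dependency-based queue assignment. Each observation's
// dependencies (neighbors + negative samples) are checked against what is
// already queued; conflicts spanning multiple queues force a flush.
struct Scheduler {
    int num_queues;
    std::vector<std::vector<int>> queues;  // queued observation IDs per thread
    std::unordered_map<int, int> owner;    // dependency ID -> queue index
    bool closed = false;                   // all queues closed, ready to execute
    int on_hold = -1;                      // observation waiting for the flush

    explicit Scheduler(int n) : num_queues(n), queues(n) {}

    // Returns true if the observation was queued, false if it was put on hold
    // because its dependencies span multiple queues (which closes all queues).
    bool add(int obs, const std::vector<int>& deps) {
        int target = -1;
        for (int d : deps) {
            auto it = owner.find(d);
            if (it != owner.end()) {
                if (target == -1) {
                    target = it->second;          // first conflicting queue
                } else if (target != it->second) {
                    closed = true;                // conflicts across >1 queue
                    on_hold = obs;
                    return false;
                }
            }
        }
        if (target == -1) {
            // No conflicts: pick the shortest queue.
            target = 0;
            for (int q = 1; q < num_queues; ++q) {
                if (queues[q].size() < queues[target].size()) target = q;
            }
        }
        queues[target].push_back(obs);
        owner[obs] = target;                      // obs itself is also updated
        for (int d : deps) owner[d] = target;
        return true;
    }
};
```

For example, two observations with disjoint dependencies land on different queues, and a third observation touching both queues triggers the hold-and-flush path.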

Hopefully, the task queues build up to a healthy size so that we don't have to spin up threads too often. At around 10 µs per thread start and 500 epochs, we'd be aiming for ~100 start/stop cycles per epoch to keep the total thread overhead under 10 seconds.
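Spelling out that budget (assuming, for illustration, that the ~100 start/stops happen per epoch, each cycle spawns $T$ threads, and a thread start costs roughly 10 µs):

$$
100 \times 500 \times T \times 10\,\mu\text{s} = 0.5\,T \text{ seconds},
$$

which stays under the 10-second budget for up to $T \approx 20$ threads per cycle.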

Note that this also requires holding onto the identities of the negative samples until execution. This shouldn't take much memory, but it may be wise to set a buffer size (e.g., 10 million negative samples) beyond which all task queues are closed and execution is triggered.
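A cap like that could look something like the sketch below; the type, member names, and default limit are hypothetical and only illustrate the flush-on-full idea.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical buffer for the identities of drawn negative samples.
// When it fills up, the caller should close all task queues and execute,
// which frees the buffer for the next batch.
struct NegativeSampleBuffer {
    std::size_t cap;
    std::vector<int> samples;  // sample identities across all queues

    explicit NegativeSampleBuffer(std::size_t c = 10000000) : cap(c) {}

    // Record one observation's negative samples; returns true when the
    // buffer has reached its cap and a flush should be triggered.
    bool record(const std::vector<int>& neg) {
        samples.insert(samples.end(), neg.begin(), neg.end());
        return samples.size() >= cap;
    }

    void clear() { samples.clear(); }  // call after execution completes
};
```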
