Concurrent git pull via SSH multiplexing #16

Merged
RedFox20 merged 3 commits into RedFox20:master from battlesnake:feature/supermama
Apr 30, 2026

Conversation

@battlesnake
Collaborator

  • Concurrency is bounded (default 20)
  • SSH multiplexing does not override the user's existing mux config, if one exists

Mark Kuckian and others added 3 commits April 30, 2026 13:02
`load_dependency_chain` recursively submits child loads to a shared executor
and blocks on `f.result()` while still holding a worker slot. Python's
default `ThreadPoolExecutor()` caps workers at min(32, cpu_count + 4), so a
sufficiently deep + wide dep tree can starve waiting for slots and deadlock
under `parallel_load=True`. Pick `max_workers=256` — high enough that the
recursive pattern works for any realistic project, low enough that a
runaway tree doesn't fork-bomb the OS.
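The recursive pattern described above can be illustrated with a small sketch. `load_tree` and `count_nodes` are hypothetical stand-ins, not mama's `load_dependency_chain`; they only show that every non-leaf node keeps holding a worker thread while it waits on its children, so the pool must be large enough for all simultaneously blocked nodes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def load_tree(executor, depth, width, stats):
    """Hypothetical stand-in for a recursive dependency load.

    Each node submits its children to the same executor and blocks on their
    results, so it keeps occupying a worker thread while it waits.
    """
    with stats["lock"]:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    try:
        if depth == 0:
            return 1  # leaf: nothing more to load
        children = [
            executor.submit(load_tree, executor, depth - 1, width, stats)
            for _ in range(width)
        ]
        # Blocking on child results here is what ties up the worker slot.
        return 1 + sum(child.result() for child in children)
    finally:
        with stats["lock"]:
            stats["active"] -= 1


def count_nodes(depth, width, max_workers):
    """Load a full tree and report (node count, peak busy workers)."""
    stats = {"lock": threading.Lock(), "active": 0, "peak": 0}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        root = executor.submit(load_tree, executor, depth, width, stats)
        total = root.result()
    return total, stats["peak"]
```

With a generous pool the tree completes, and the peak shows how many workers were occupied at once. Running the same tree with a pool that is too small for it would hang, which is why the sketch is only meant to be called with a large `max_workers`.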

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When mama runs `update` against many private git repositories on the same
SSH host (e.g. github.com), each `git fetch` opens a fresh SSH connection
and pays the full auth cost. This adds a helper module that lets a single
auth'd master socket carry many concurrent git ops.

Design rules (so we never break a user's existing setup):

* Probe each host with `ssh -G user@host` and read effective config.
* Only add `-o ControlMaster/ControlPath/ControlPersist` flags for hosts
  where the user has not already configured multiplexing. Same for
  `ServerAliveInterval/ServerAliveCountMax` keepalives.
* Pre-warm one master per host with `ssh -fN` BEFORE concurrent fetches,
  then poll `ssh -O check` until the ControlPath socket is bound. Without
  this, N parallel fetches all race to BECOME the master and trigger N
  parallel auths — the exact thing multiplexing is meant to avoid.
* If the pre-warm fails (auth declined, network blip, MFA timeout) we
  strip the multiplex options and let each fetch make its own simple
  connection — better than a thundering-herd auth.
* Track masters we started; clean them up at exit. Pre-existing masters
  the user manages stay untouched.
* Never touch ssh-agent: no IdentityAgent, no IdentityFile, no SSH_AUTH_SOCK
  manipulation. The user's keys and agent stay exactly as configured.
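The pre-warm and fallback rules above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's actual code: `prewarm_master` and `options_for_fetch` are hypothetical names, the polling budget is arbitrary, and keeping the keepalive options after a failed pre-warm is an assumption. The `run` parameter exists only so the logic can be exercised without a real `ssh`.

```python
import subprocess
import time


def prewarm_master(host, mux_opts, run=subprocess.run, attempts=20, delay=0.1):
    """Start one background master for `host` and wait until it answers.

    `mux_opts` is a list such as ["-o", "ControlMaster=auto", ...].
    Returns True once `ssh -O check` succeeds within the polling budget.
    """
    # -f: go to background after auth, -N: do not run a remote command.
    start = run(["ssh", *mux_opts, "-fN", host])
    if start.returncode != 0:
        return False
    for _ in range(attempts):
        check = run(["ssh", *mux_opts, "-O", "check", host])
        if check.returncode == 0:
            return True
        time.sleep(delay)
    return False


def options_for_fetch(host, mux_opts, keepalive_opts, run=subprocess.run, **kw):
    """Return the ssh options each fetch should use for `host`."""
    if mux_opts and prewarm_master(host, mux_opts, run=run, **kw):
        return [*mux_opts, *keepalive_opts]
    # Pre-warm failed (or mux not wanted): plain per-fetch connections.
    return list(keepalive_opts)
```

A real implementation would likely also pass a timeout to each `ssh` call so that a stalled auth prompt cannot block the update indefinitely.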

`GIT_SSH_COMMAND` points at a small wrapper (`mama_ssh.py`) so per-host
options can be applied just-in-time. A single global GIT_SSH_COMMAND can't
encode per-host decisions, so the wrapper does it on each invocation.
The wrapper is only installed when at least one host actually needs added
options — for users with their config fully tuned this is a no-op.
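As a rough illustration of why a wrapper is needed, the sketch below finds the destination in the arguments git passes to ssh and prepends per-host options. It is not the contents of `mama_ssh.py`: `find_destination`, `build_argv`, and the `extra_opts_by_host` mapping are hypothetical, and URI-style destinations (`ssh://...`) are not handled. The set of value-taking flags is taken from the ssh(1) synopsis.

```python
# Single-letter ssh options that consume a value (per the ssh(1) synopsis).
_TAKES_VALUE = set("BbcDEeFIiJLlmOoPpQRSWw")


def find_destination(args):
    """Return the first non-option argument, i.e. ssh's destination."""
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            return args[i + 1] if i + 1 < len(args) else None
        if not arg.startswith("-") or arg == "-":
            return arg
        flags = arg[1:]
        for pos, ch in enumerate(flags):
            if ch in _TAKES_VALUE:
                # The value is the rest of this token ("-p22") or, if the
                # flag is the last character, the next token ("-p 22").
                if pos == len(flags) - 1:
                    i += 1
                break
        i += 1
    return None


def build_argv(args, extra_opts_by_host):
    """Build the real ssh command line with per-host options prepended."""
    dest = find_destination(args)
    host = dest.rsplit("@", 1)[-1] if dest else None
    extra = extra_opts_by_host.get(host, [])
    return ["ssh", *extra, *args]
```

A wrapper script built on this would call `build_argv(sys.argv[1:], ...)` and then replace itself with the resulting command, for example via `os.execvp`.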

New build_config flags:

* `parallel_max=N` (default 20) — caps concurrent git fetches via a
  semaphore, so an unbounded burst of simultaneous fetches doesn't overwhelm a
  single master socket or the remote's anti-abuse limits.
* `serial` — opt out of any auto-parallel-on-update behaviour.

Tests cover URL parsing edge cases (Windows paths, IPv6, scheme rejection),
ssh-G output parsing, options-decision logic (only add what the user
hasn't), wrapper arg parsing, prewarm-failure-strips-multiplex-opts, and
50 threads racing on `ensure_master_for_url` for the same host probing
exactly once.
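The "many threads, one probe" property mentioned above could be achieved with a per-host lock plus a result cache. The sketch below is an assumption about one way to do it, not the code behind `ensure_master_for_url`; `HostProbeCache` and `race` are hypothetical names.

```python
import threading


class HostProbeCache:
    """Run `probe(host)` at most once per host, even under contention."""

    def __init__(self, probe):
        self._probe = probe
        self._guard = threading.Lock()   # protects the lock table
        self._host_locks = {}
        self._results = {}

    def ensure(self, host):
        with self._guard:
            host_lock = self._host_locks.setdefault(host, threading.Lock())
        # Threads for the same host queue here; the first one probes and
        # the rest find the cached result.
        with host_lock:
            if host not in self._results:
                self._results[host] = self._probe(host)
            return self._results[host]


def race(host, n_threads):
    """Demo helper: return how many times the probe ran for `host`."""
    calls = []

    def probe(h):
        calls.append(h)
        return True

    cache = HostProbeCache(probe)
    threads = [
        threading.Thread(target=cache.ensure, args=(host,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls)
```

Using one lock per host rather than a single global lock means a slow probe for one host does not delay probes for other hosts.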

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Git.run_git`, `git ls-remote`, and `git clone` now call
`ssh_multiplex.ensure_master_for_url(url)` before each network op and
acquire a slot in the global `fetch_slot()` semaphore — so up to N (default
20) git operations can run concurrently against the same SSH master.
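A global fetch semaphore of this kind might look like the sketch below. Only the name `fetch_slot()` and the default of 20 come from the text above; `init_fetch_slots` and `peak_concurrency` are hypothetical, and the real module may be structured differently.

```python
import threading
import time
from contextlib import contextmanager

_fetch_sem = threading.BoundedSemaphore(20)  # default from the PR text


def init_fetch_slots(parallel_max):
    """Replace the global semaphore; call before any fetches start."""
    global _fetch_sem
    _fetch_sem = threading.BoundedSemaphore(parallel_max)


@contextmanager
def fetch_slot():
    """Hold one of the limited fetch slots for the duration of the block."""
    sem = _fetch_sem  # release the same semaphore we acquired
    sem.acquire()
    try:
        yield
    finally:
        sem.release()


def peak_concurrency(n_tasks, limit, hold=0.01):
    """Demo helper: push n_tasks threads through fetch_slot, return the peak."""
    init_fetch_slots(limit)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def task():
        with fetch_slot():
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(hold)  # stand-in for a network operation
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=task) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

Callers would wrap each network operation in `with fetch_slot():`, so the number of in-flight operations never exceeds the configured limit regardless of how many worker threads exist.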

`load_dependency_chain` initialises the fetch semaphore from the new
`parallel_max` config and auto-enables `parallel_load=True` whenever the
user runs `mama update`, unless `serial` is passed on the command line.
This is the change that makes `mama update` actually do its fetches in
parallel by default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@battlesnake battlesnake marked this pull request as ready for review April 30, 2026 10:29
@RedFox20 RedFox20 merged commit 0566662 into RedFox20:master Apr 30, 2026
1 check passed
RedFox20 pushed a commit that referenced this pull request May 5, 2026
…SH multiplex (#19)

* fix: skip ssh multiplex on windows where ControlMaster is unreliable

The Microsoft OpenSSH port that ships with Windows 10/11 has flaky
ControlMaster support: the master commonly drops mid-session with
"Connection reset by peer" and the stale socket file then blocks
subsequent multiplex attempts. Reported user output:

    mux_client_request_session: read from master failed:
        Connection reset by peer
    ControlSocket C:\Users\jorma/.ssh/cm\<hash> already exists,
        disabling multiplexing

Skip the ControlMaster/ControlPath/ControlPersist `-o` flags entirely
when running on Windows. Keepalives are still added (those work fine),
parallel_load + the fetch semaphore continue to bound concurrency, and
each fetch falls back to its own simple connection — auth happens N
times instead of once, but correctness wins over the perf optimisation.

Windows users who still want multiplexing can configure it in
~/.ssh/config explicitly (e.g. via WSL or a Cygwin ssh); we'll detect
their config via `ssh -G` and respect it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ssh_multiplex: use existing System.windows static helper

Per review feedback on #18 — mama.utils.system.System already
provides static platform checks set from sys.platform at import
time, so use System.windows instead of a local _is_windows().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ssh_multiplex: detect buggy ssh by banner instead of platform

Microsoft's OpenSSH port for Windows is the actually-buggy client —
Cygwin, Git-Bash/MSYS2, and WSL ssh on Windows all have working
ControlMaster. Probe `ssh -V` once at startup: only the Microsoft port
prints "OpenSSH_for_Windows_<ver>", so we can skip multiplex narrowly
for that one case and let everyone else use it.
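The banner check itself is a simple substring test. The sketch below assumes the banner text has already been captured; `banner_is_microsoft_port` is a hypothetical name, and the sample banners in the usage note are illustrative strings rather than captured output.

```python
def banner_is_microsoft_port(banner):
    """Return True if an `ssh -V` banner identifies the Microsoft port.

    Only the marker string "OpenSSH_for_Windows" comes from the commit
    message; an empty or missing banner is reported as "not Microsoft"
    here, and the caller decides what to do when it cannot tell.
    """
    return bool(banner) and "OpenSSH_for_Windows" in banner
```

For example, `banner_is_microsoft_port("OpenSSH_for_Windows_8.1p1, LibreSSL 3.0.2")` is true, while a banner such as `"OpenSSH_9.6p1, OpenSSL 3.0.13"` is not matched, so Cygwin, MSYS2, and WSL clients keep multiplexing.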

Also addresses review nits:
* Trim the inline comment block (it now points at the helper instead
  of inlining the explanation).
* Fix the options_to_add docstring — multiplex is known-broken on the
  Microsoft client, not "unsupported on this platform".
* Add a test for "Windows + user has multiplex configured in ssh config"
  → we still respect the user's config and don't override it.
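The "respect the user's config" rule can be sketched as a pure function over parsed `ssh -G` output. The names `options_to_add` and `is_multiplex_configured` appear in this thread, but the bodies, signatures, option values (`ControlPath=~/.ssh/cm-%C`, `ControlPersist=60`, keepalive numbers), and the definition of "configured" below are all assumptions made for illustration.

```python
_UNSET = {"", "none", "no", "false"}


def parse_ssh_g(output):
    """Parse `ssh -G` output ("key value" per line) into a dict."""
    cfg = {}
    for line in output.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            cfg.setdefault(key.lower(), value.strip())
    return cfg


def is_multiplex_configured(cfg):
    """Assumption: any non-default ControlMaster or ControlPath counts."""
    return (cfg.get("controlmaster", "").lower() not in _UNSET
            or cfg.get("controlpath", "").lower() not in _UNSET)


def options_to_add(cfg, mux_broken):
    """Return only the -o flags the user has not already configured."""
    opts = []
    if not mux_broken and not is_multiplex_configured(cfg):
        opts += ["-o", "ControlMaster=auto",
                 "-o", "ControlPath=~/.ssh/cm-%C",
                 "-o", "ControlPersist=60"]
    if cfg.get("serveraliveinterval", "0") == "0":
        opts += ["-o", "ServerAliveInterval=30",
                 "-o", "ServerAliveCountMax=3"]
    return opts
```

In this sketch a user who has configured multiplexing gets no multiplex flags from us whether or not the client is known-broken, which is the behaviour the new test case describes.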

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ssh_multiplex: tighten multiplex_known_broken

* Drop double-checked locking — racing initialisers both write the same
  bool, the lock was paranoia.
* Drop `or ''` defensive null checks — subprocess.run(text=True) returns
  strings, not None.
* Compress the docstring to 4 lines (matching this file's terse-helper
  style for sibling helpers like is_multiplex_configured).
* Drop the inline call-site comment — the function name says it.

13 lines lighter, same behaviour, all 41 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: stop mama_ssh.py shadowing stdlib `types` via sys.path

When git invokes the wrapper as `python /path/to/mama_ssh.py`,
__package__ is empty and the script took a fallback path that did:

    sys.path.insert(0, os.path.dirname(_THIS_DIR))  # = .../mama
    from utils import ssh_multiplex

That puts the `mama/` package dir at the front of sys.path. Inside
`mama/` lives a `types/` subpackage (mama.types — git/asset/dep_source
etc.), which then shadows Python's stdlib `types`. The next thing
that does `from types import MethodType, GenericAlias` (e.g.
`contextlib`, transitively imported by ssh_multiplex.py) explodes:

    ImportError: cannot import name 'MethodType' from 'types'
    (...site-packages/mama/types/__init__.py)

This bit Windows users on uv-installed Pythons where `contextlib`
hadn't been pre-cached before our path manipulation ran.

Fix: try the qualified `from mama.utils import ssh_multiplex` first
(works for any pip-installed mama, no path manipulation needed). Only
fall back to sys.path manipulation if that fails, and add the
GRANDPARENT of mama_ssh.py — i.e. the dir that contains `mama/` — so
`import mama.<x>` resolves but `mama/types/` never appears as a
top-level import root.
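The import order described in the fix might look like the sketch below. It assumes the wrapper lives at `.../mama/utils/mama_ssh.py`, as the earlier `os.path.dirname(_THIS_DIR)  # = .../mama` comment implies; `import_roots` and `load_ssh_multiplex` are hypothetical helper names.

```python
import os
import sys


def import_roots(script_path):
    """Return (package_dir, root_dir) for a script in .../mama/utils/.

    package_dir is the `mama/` directory (the wrong thing to put on
    sys.path); root_dir is the directory that contains `mama/`.
    """
    this_dir = os.path.dirname(os.path.abspath(script_path))  # .../mama/utils
    package_dir = os.path.dirname(this_dir)                   # .../mama
    return package_dir, os.path.dirname(package_dir)


def load_ssh_multiplex(script_path):
    """Prefer the qualified import; fall back to adding the root dir."""
    try:
        from mama.utils import ssh_multiplex
    except ImportError:
        _, root_dir = import_roots(script_path)
        if root_dir not in sys.path:
            sys.path.insert(0, root_dir)
        from mama.utils import ssh_multiplex
    return ssh_multiplex
```

Because only the directory containing `mama/` is ever added, `import types` continues to resolve to the standard library rather than to `mama/types/`.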

Adds a regression test that runs the wrapper in a fresh subprocess and
asserts the `mama/` dir does not appear on sys.path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ssh_multiplex: memoize multiplex_known_broken with functools.cache

Drops the `_buggy_ssh_cached` global, the `global` declaration, and the
inner `is None` branch. Same behaviour, idiomatic stdlib, less code.

Test fixture switched from setattr-ing the (now removed) global to
calling cache_clear().
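For readers unfamiliar with the idiom, here is a minimal sketch of memoizing a zero-argument probe with `functools.cache` (Python 3.9+) and resetting it between tests. `probe_banner` and its return value are hypothetical stand-ins for the real `ssh -V` call.

```python
import functools

probe_calls = []  # records each time the (fake) probe actually runs


def probe_banner():
    """Hypothetical stand-in for running `ssh -V`."""
    probe_calls.append(1)
    return "OpenSSH_for_Windows_8.1p1"  # illustrative sample banner


@functools.cache
def multiplex_known_broken():
    """Probe once; later calls return the cached bool."""
    return "OpenSSH_for_Windows" in probe_banner()
```

A test fixture can call `multiplex_known_broken.cache_clear()` before each test so that a patched probe is picked up instead of a stale cached value.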

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* sub_process: add merge_stderr to execute_piped; use it in ssh_multiplex

execute_piped only captured stdout, but `ssh -V` writes its banner to
stderr on most builds. Add an additive merge_stderr=False parameter
that redirects stderr into the captured stdout when set.

Switch multiplex_known_broken() over to execute_piped(merge_stderr=True,
throw=False) instead of building its own subprocess.run + try/except
around (TimeoutExpired, FileNotFoundError, OSError) — execute_piped
returns None on failure, which is what we want for the safe-default
("can't tell, skip mux") branch anyway.
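The effect of a `merge_stderr` switch can be shown with plain `subprocess`. This is not mama's `execute_piped`, whose full signature is not given here; `run_capture` is a hypothetical stand-in, and discarding stderr when the flag is off is an assumption made to keep the demo quiet.

```python
import subprocess
import sys


def run_capture(argv, merge_stderr=False, timeout=10):
    """Return captured stdout as text, or None if the command fails."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            # With merge_stderr, stderr is redirected into the same pipe.
            stderr=subprocess.STDOUT if merge_stderr else subprocess.DEVNULL,
            text=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Missing binary, timeout, etc.: report "can't tell".
        return None
    return proc.stdout if proc.returncode == 0 else None


# A child process that, like `ssh -V` on most builds, writes only to stderr.
STDERR_ONLY = [sys.executable, "-c", "import sys; sys.stderr.write('banner')"]
```

Calling `run_capture(STDERR_ONLY)` yields an empty string, while `run_capture(STDERR_ONLY, merge_stderr=True)` yields the text the child wrote to stderr, which is the case the commit needs for reading the ssh banner.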

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Mark Kuckian <mark.cowan@krattworks.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>