feat(profiling): add memory theory comparison and Mosaic analysis #4

Merged: Gklajer merged 4 commits into the-tuning-machine:main from Gklajer:feat/memory-theory-comparison on Mar 28, 2026
Conversation

@Gklajer (Collaborator) commented Mar 22, 2026

Summary

This PR adds a reproducible workflow for comparing theoretical memory costs with measured runtime behavior for dense and frozen-LoRA linear layers.

It introduces a shared memory experiment module, wires that logic into the profiling script, and adds Mosaic-based visualization output plus report updates so the results can be inspected and documented end to end.

Changes

  • add stellatscale.memory_experiment for experiment configuration, theoretical summaries, tolerance checks, and comparison utilities
  • update the profiling workflow to generate theory-vs-measurement reports from the same experiment definitions
  • add Mosaic memory analysis and plotting support for the single-layer LoRA study
  • expose the new memory experiment surface through the package API
  • add tests covering the memory experiment logic
  • update the report and output layout to reflect the new profiling artifacts
  • update dependency lockfiles and tooling config needed by the profiling workflow
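
The theoretical side of the comparison can be illustrated with a minimal sketch. It is not the code in stellatscale.memory_experiment: the names `TheoreticalSummary`, `summarize`, `dense_linear`, and `frozen_lora_linear` are hypothetical, and the accounting assumes float32 storage, an Adam-style optimizer with two moment buffers per trainable parameter, no bias term, and no activation memory.

```python
from dataclasses import dataclass

BYTES_FP32 = 4  # assumption: every tensor is stored in float32


@dataclass(frozen=True)
class TheoreticalSummary:
    """Hypothetical container, not the PR's actual class."""

    parameter_bytes: int
    gradient_bytes: int
    optimizer_bytes: int

    @property
    def total_bytes(self) -> int:
        return self.parameter_bytes + self.gradient_bytes + self.optimizer_bytes


def summarize(total_params: int, trainable_params: int) -> TheoreticalSummary:
    # All parameters are resident in memory; only trainable ones carry
    # gradients and (assumed) two Adam moment buffers.
    return TheoreticalSummary(
        parameter_bytes=total_params * BYTES_FP32,
        gradient_bytes=trainable_params * BYTES_FP32,
        optimizer_bytes=2 * trainable_params * BYTES_FP32,
    )


def dense_linear(d_in: int, d_out: int) -> TheoreticalSummary:
    n = d_in * d_out  # bias omitted for simplicity
    return summarize(total_params=n, trainable_params=n)


def frozen_lora_linear(d_in: int, d_out: int, rank: int) -> TheoreticalSummary:
    base = d_in * d_out               # frozen base weight: no grads, no optimizer state
    adapters = rank * (d_in + d_out)  # trainable low-rank factors A and B
    return summarize(total_params=base + adapters, trainable_params=adapters)


dense = dense_linear(1024, 1024)
lora = frozen_lora_linear(1024, 1024, rank=8)
print(dense.total_bytes, lora.total_bytes)
```

Under these assumptions the frozen-LoRA layer holds slightly more parameter memory than the dense layer but far less gradient and optimizer memory, since only the adapters are in the optimizer's scope.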

Why

The branch moves the memory analysis from one-off profiling code to a codified experiment pipeline. That makes the comparisons easier to reproduce, validate, and reuse in the report.

Validation

  • added automated tests for the memory experiment module
  • exercised the profiling/report generation workflow on the branch outputs

Copilot AI review requested due to automatic review settings March 22, 2026 21:11

Copilot AI left a comment


Pull request overview

Adds a reusable “memory experiment” module that formalizes theoretical GPU-memory accounting and explicit theory-vs-measurement comparisons, then refactors the LoRA profiling script to use it and emit a dedicated comparison report.

Changes:

  • Introduces stellatscale.memory_experiment with shared config, theoretical summaries, measured-summary parsing, and comparison report objects.
  • Adds a profiling script (scripts/lora_memory_analysis.py) that runs dense vs frozen-LoRA profiling, parses Mosaic output, and writes comparison + theory comparison reports.
  • Adds focused tests covering dense/frozen-LoRA accounting and comparison-report behavior; updates dependency groups/lockfile for Mosaic + profiling.
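
A theory-vs-measurement comparison with a tolerance check could look roughly like the sketch below. The `ComparisonRow` class, the `render_report` helper, the 5% tolerance, and the byte counts are all made up for illustration; they are not taken from the PR or from any profiling run.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ComparisonRow:
    """Hypothetical row pairing a theoretical value with a measured one."""

    name: str
    theory_bytes: int
    measured_bytes: int
    rel_tolerance: float

    @property
    def relative_error(self) -> float:
        if self.theory_bytes == 0:
            # Avoid dividing by zero: exact match is 0 error, anything else is infinite.
            return 0.0 if self.measured_bytes == 0 else float("inf")
        return abs(self.measured_bytes - self.theory_bytes) / self.theory_bytes

    @property
    def within_tolerance(self) -> bool:
        return self.relative_error <= self.rel_tolerance


def render_report(rows: List[ComparisonRow]) -> str:
    lines = []
    for row in rows:
        status = "ok" if row.within_tolerance else "MISMATCH"
        lines.append(
            f"{row.name}: theory={row.theory_bytes} "
            f"measured={row.measured_bytes} "
            f"rel_err={row.relative_error:.3f} {status}"
        )
    return "\n".join(lines)


# Illustrative numbers only, not measurements from this PR.
rows = [
    ComparisonRow("parameters", 4_194_304, 4_194_304, rel_tolerance=0.05),
    ComparisonRow("optimizer", 8_388_608, 9_437_184, rel_tolerance=0.05),
]
report = render_report(rows)
print(report)
```

Keeping the tolerance on each row, rather than as one global constant, lets categories that are known to be noisy (for example allocator overhead) use a looser bound than exact quantities such as parameter storage.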

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

File | Description
uv.lock | Adds Mosaic/profiling dependency resolutions and markers.
pyproject.toml | Pins Mosaic source and defines mosaic / profiling dependency groups.
src/stellatscale/memory_experiment.py | New reusable theory + measurement parsing + comparison-report module.
src/stellatscale/__init__.py | Re-exports memory experiment public API at package top-level.
scripts/lora_memory_analysis.py | New end-to-end profiling + Mosaic analysis + theory comparison report generator.
tests/test_memory_experiment.py | New tests for accounting correctness, frozen optimizer scoping, and comparison output behavior.


Comment thread scripts/lora_memory_analysis.py Outdated
Comment on lines +136 to +139
schedule_factory = cast("Any", schedule)
schedule_phase_key = "warm" + chr(117) + "p"
profile_schedule = schedule_factory(
    wait=0, active=EXPERIMENT_CONFIG.steps, repeat=1, **{schedule_phase_key: 0}

Copilot AI Mar 22, 2026


The profiler schedule setup is intentionally obfuscated (schedule_phase_key = "warm" + chr(117) + "p") and schedule is cast to Any to accept that key. This makes the profiling behavior hard to audit and removes type safety. Prefer calling torch.profiler.schedule directly with the explicit warmup=0 argument (and remove the Any cast) so future readers can understand the schedule and static checks still apply.

Suggested change
- schedule_factory = cast("Any", schedule)
- schedule_phase_key = "warm" + chr(117) + "p"
- profile_schedule = schedule_factory(
-     wait=0, active=EXPERIMENT_CONFIG.steps, repeat=1, **{schedule_phase_key: 0}
+ profile_schedule = schedule(
+     wait=0,
+     warmup=0,
+     active=EXPERIMENT_CONFIG.steps,
+     repeat=1,

Comment thread scripts/lora_memory_analysis.py Outdated
Comment on lines +153 to +155
for _ in range(EXPERIMENT_CONFIG.steps):
    profiler.step()

Copilot AI Mar 22, 2026


profiler.step() is called at the start of each iteration. In typical torch.profiler.profile usage, step() should be called at the end of the iteration to delimit the just-recorded work. As written, the first step may be empty and the last forward/backward/optimizer block may never be closed/recorded, skewing traces and memory attribution.
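
The effect of step placement can be seen in a simplified model. The sketch below is not torch.profiler code: it assumes a schedule with wait=0, warmup=0, repeat=1 in which the first `active` step indices record and later ones are idle, and it assumes that work is attributed to whichever step index is current when it runs. The names `schedule_phase` and `recorded_iterations` are hypothetical.

```python
from typing import List


def schedule_phase(step_index: int, active: int) -> str:
    """Toy schedule (wait=0, warmup=0, repeat=1): record the first
    `active` step indices, then go idle."""
    return "record" if step_index < active else "idle"


def recorded_iterations(num_iterations: int, active: int, step_first: bool) -> List[int]:
    step_index = 0  # assumed: the profiler is at step 0 when the context is entered
    recorded = []
    for iteration in range(num_iterations):
        if step_first:
            step_index += 1  # step() called before the iteration's work
        if schedule_phase(step_index, active) == "record":
            recorded.append(iteration)  # this iteration's work falls in a recording step
        if not step_first:
            step_index += 1  # step() called after the iteration's work
    return recorded


steps = 3
at_end = recorded_iterations(steps, active=steps, step_first=False)
at_start = recorded_iterations(steps, active=steps, step_first=True)
print("step() at end of iteration:  ", at_end)
print("step() at start of iteration:", at_start)
```

In this model, stepping at the start leaves step 0 empty and pushes the final iteration past the active window, which matches the concern raised in the comment above.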

@Gklajer Gklajer force-pushed the feat/memory-theory-comparison branch 2 times, most recently from 067b7d2 to c62e3e6 Compare March 26, 2026 08:52
@Gklajer Gklajer force-pushed the feat/memory-theory-comparison branch from fa71c54 to 97463f1 Compare March 28, 2026 13:39
@Gklajer Gklajer changed the title feat(profiling): codify memory theory comparisons feat(profiling): add memory theory comparison and Mosaic analysis Mar 28, 2026
@Gklajer Gklajer merged commit 6394bf3 into the-tuning-machine:main Mar 28, 2026
5 checks passed
@Gklajer Gklajer deleted the feat/memory-theory-comparison branch March 28, 2026 13:45