
Use GPUArrays caching allocator in train! #2665

Merged
CarloLucibello merged 1 commit into master from cl/caching-allocator on Mar 26, 2026

Conversation

@CarloLucibello
Member

Summary

  • Wraps each train! iteration in GPUArrays.@cached so temporary GPU allocations (gradients, intermediate activations) are pooled and reused across steps rather than freed and reallocated each time
  • Creates one AllocCache per train! call and calls unsafe_free! after the loop for deterministic cleanup
  • Adds GPUArrays as a direct dependency with compat "11.2" (the version that introduced AllocCache)
  • On CPU this is a no-op — CPU arrays don't go through GPUArrays' allocation path
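
The loop structure described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the diff merged in this PR: `train_cached!` is a hypothetical helper name, the loss and data shapes are made up, and it assumes GPUArrays 11.2 or later (for `AllocCache`, `@cached`, and `unsafe_free!`) alongside Flux's explicit-state API.

```julia
using Flux
import GPUArrays  # assumes GPUArrays >= 11.2, which provides AllocCache

# Hypothetical helper for illustration; not the function changed in this PR.
function train_cached!(loss, model, data, opt_state)
    cache = GPUArrays.AllocCache()  # one cache for the whole training call
    for (i, d) in enumerate(data)
        d_splat = d isa Tuple ? d : (d,)
        # GPU arrays allocated inside this block are served from `cache`
        # and become reusable by the next iteration when the block exits.
        GPUArrays.@cached cache begin
            l, gs = Flux.withgradient(m -> loss(m, d_splat...), model)
            isfinite(l) || throw(DomainError(l, "Loss is $l on data item $i, stopping training"))
            Flux.update!(opt_state, model, gs[1])  # in-place update, nothing escapes the block
        end
    end
    GPUArrays.unsafe_free!(cache)  # release pooled memory now rather than waiting for the GC
    return model
end

# Usage on CPU, where the cache is never consulted:
model = Dense(2 => 1)
opt_state = Flux.setup(Adam(), model)
data = [(rand(Float32, 2, 8), rand(Float32, 1, 8))]
train_cached!((m, x, y) -> Flux.mse(m(x), y), model, data, opt_state)
```

Because memory handed out inside `@cached` may be reused on the next iteration, the sketch keeps every temporary inside the block and relies on `Flux.update!` mutating the model and optimiser state in place.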

Closes #2636

Test plan

  • Existing train! tests pass on CPU
  • GPU training shows stable memory usage across iterations (no GC spikes)
  • DomainError on non-finite loss still works correctly
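
The last item could be checked with a small CPU test along these lines. It is a sketch rather than the test shipped with the PR: `nan_loss` is a hypothetical loss that always returns NaN, and the model and data are arbitrary.

```julia
using Flux, Test

model = Dense(2 => 1)
opt_state = Flux.setup(Descent(0.1), model)
data = [(rand(Float32, 2, 4), rand(Float32, 1, 4))]

# Hypothetical loss that is never finite, to trigger the guard in train!.
nan_loss(m, x, y) = Flux.mse(m(x), y) * NaN32

@test_throws DomainError Flux.train!(nan_loss, model, data, opt_state)
```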

🤖 Generated with Claude Code

Use GPUArrays caching allocator in train!

Wraps each training iteration in `GPUArrays.@cached` so temporary GPU
allocations (gradients, activations) are pooled and reused across steps
instead of being freed and reallocated each iteration, reducing GC pressure.

Closes #2636

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cleanup

cleanup
@CarloLucibello merged commit cec0db7 into master on Mar 26, 2026
4 of 9 checks passed

Development

Successfully merging this pull request may close these issues.

Use Caching Allocator in train!