LEMA (Layer-wise Efficient Memory Abstraction): A hardware-aware framework for fine-tuning LLMs in VRAM-constrained environments using asynchronous binary pre-fetching and triple-tier memory orchestration.
machine-learning cuda pytorch memory-management lora system-architecture fine-tuning llm safetensors vram-optimization low-resource-computing lema
-
Updated
Feb 12, 2026 - Python