[Fix] Intercept and align Inductor output strides across piecewise sub-graphs #29
Merged
cennn merged 1 commit on Apr 27, 2026
…graphs

Inductor may change output memory layout (e.g. mm padding, kernel fusion) during standalone_compile. When FakeTensor strides from sub-graph N flow into sub-graph N+1's compilation, mismatched strides cause assert_size_stride failures at runtime.

- Add _intercept_inductor_output_strides to capture strides Inductor reports via set_tracing_context_output_strides before the TracingContext is destroyed.
- Add _restride_outputs to update FakeTensor strides using as_strided (zero-copy view) so downstream sub-graphs compile with correct layouts.
- Add test_stride_mismatch.py for non-contiguous view across piecewise boundary regression.
- Add test_unbacked_symbol_guard.py for GuardOnDataDependentSymNode regression with mark_unbacked + view(-1).
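The failure mode described in the commit message can be illustrated with a small pure-Python model. This is only a sketch: the helper names (`contiguous_strides`, `padded_strides`, `check_size_stride`) are hypothetical, the padding rule is a stand-in rather than Inductor's actual heuristic, and the check is a simplified analogue of a size/stride assertion, not the real assert_size_stride.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a dense tensor of this shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def padded_strides(shape, align):
    """Row-major strides after rounding the innermost dimension up to a
    multiple of `align`. Illustrative stand-in for a layout change such as
    mm padding; not Inductor's real padding logic."""
    padded_shape = list(shape)
    padded_shape[-1] = -(-shape[-1] // align) * align  # ceiling to a multiple
    return contiguous_strides(padded_shape)


def check_size_stride(actual_shape, actual_strides, expected_shape, expected_strides):
    """Simplified size/stride check: raise if the layouts disagree."""
    if tuple(actual_shape) != tuple(expected_shape):
        raise AssertionError(f"size mismatch: {actual_shape} vs {expected_shape}")
    if tuple(actual_strides) != tuple(expected_strides):
        raise AssertionError(f"stride mismatch: {actual_strides} vs {expected_strides}")


shape = (4, 6)
# Layout sub-graph N+1 was compiled against (taken from the stale FakeTensor).
assumed = contiguous_strides(shape)
# Layout sub-graph N actually produces once its output rows are padded.
produced = padded_strides(shape, align=8)

try:
    check_size_stride(shape, produced, shape, assumed)
    mismatch_detected = False
except AssertionError as exc:
    mismatch_detected = True
    print("runtime check failed:", exc)
```

In this model the shapes agree but the strides do not, which is the situation the downstream sub-graph only discovers at runtime unless its compile-time strides are corrected first.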
🗂️ PR Category
Summary
Inductor may silently change output strides during piecewise sub-graph compilation (e.g. mm padding). These strides are lost when the per-subgraph TracingContext is destroyed, causing downstream assert_size_stride failures at runtime.

Fix: intercept Inductor's reported output strides before context teardown, then align FakeTensor strides via as_strided (zero-copy) for correct downstream compilation.

Changes
- piecewise_compiler.py: _intercept_inductor_output_strides() captures strides from set_tracing_context_output_strides before TracingContext teardown.
- magi_backend.py: _restride_outputs() applies captured strides to FakeTensors; skips symbolic dimensions to preserve dynamic-shape compatibility.
- test_stride_mismatch.py: regression test for non-contiguous view across piecewise boundary.
- test_unbacked_symbol_guard.py: regression test for GuardOnDataDependentSymNode with mark_unbacked + view(-1).

Test plan
- test_stride_mismatch, test_unbacked_symbol_guard, test_piecewise_deferred_assert_scope