
feat[venom]: loop invariant #4819

Draft
harkal wants to merge 9 commits into vyperlang:master from harkal:feat/venom/loop_invariant

Conversation

harkal (Collaborator) commented Jan 26, 2026

What I did

How I did it

How to verify it

Commit message

Commit message for the final, squashed PR. (Optional, but reviewers will appreciate it! Please see our commit message style guide for what we would ideally like to see in a commit message.)

Description for the changelog

Cute Animal Picture


Comment thread tests/unit/compiler/venom/test_loop_invariant_codesize.py Fixed
harkal (Collaborator, Author) commented Jan 26, 2026

@HodanPlodky this is the implementation I told you I would push.
Below is a comparison against #4175, generated by Codex.
I suggest we take the good ideas from each and combine them into one final implementation. What do you think?

LICM Implementation Comparison Report                                                                                                  
                                                                                                                                         
  Overview                                                                                                                               
  ┌────────────────┬─────────────────────────────────────────┬───────────────────────────────────────┐                                   
  │     Aspect     │   feat/venom/loop_invariant (current)   │         loop_invariant (old)          │                                   
  ├────────────────┼─────────────────────────────────────────┼───────────────────────────────────────┤                                   
  │ File           │ loop_invariant_code_motion.py (169 LOC) │ loop_invariant_hosting.py (138 LOC)   │                                   
  ├────────────────┼─────────────────────────────────────────┼───────────────────────────────────────┤                                   
  │ Class          │ LoopInvariantCodeMotionPass             │ LoopInvariantHoisting                 │                                   
  ├────────────────┼─────────────────────────────────────────┼───────────────────────────────────────┤                                   
  │ Loop Detection │ Self-contained in pass                  │ Separate NaturalLoopDetectionAnalysis │                                   
  ├────────────────┼─────────────────────────────────────────┼───────────────────────────────────────┤                                   
  │ Status         │ Integrated in O2/O3/Os                  │ Not integrated in pipeline            │                                   
  └────────────────┴─────────────────────────────────────────┴───────────────────────────────────────┘                                   
  ---                                                                                                                                    
  Algorithm Comparison                                                                                                                   
                                                                                                                                         
  Loop Detection                                                                                                                         
  ┌─────────────────────────────────────────────┬─────────────────────────────────────────────┐                                          
  │               Current Branch                │                 Old Branch                  │                                          
  ├─────────────────────────────────────────────┼─────────────────────────────────────────────┤                                          
  │ Uses dominator tree for back-edge detection │ Uses DFS stack for back-edge detection      │                                          
  ├─────────────────────────────────────────────┼─────────────────────────────────────────────┤                                          
  │ dom.dominates(succ, bb) → back edge         │ succ in stack → back edge                   │                                          
  ├─────────────────────────────────────────────┼─────────────────────────────────────────────┤                                          
  │ Computes preheader explicitly               │ Uses cfg_in(header).first() as hoist target │                                          
  ├─────────────────────────────────────────────┼─────────────────────────────────────────────┤                                          
  │ Tracks multiple latches per loop            │ Implicit single latch assumption            │                                          
  └─────────────────────────────────────────────┴─────────────────────────────────────────────┘                                          
  Verdict: Current branch is more correct. Dominator-based detection is the canonical approach. Old branch's first() heuristic is        
  fragile.                                                                                                                               
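
  To make the dominator-based rule concrete, here is a minimal sketch on a toy CFG. It is not the code from either branch: the CFG is a plain dict of block names, and `compute_dominators` and `find_loops` are hypothetical helpers written only for this illustration. An edge `bb -> succ` counts as a back edge when `succ` dominates `bb`, and every such `bb` is recorded as a latch of the header `succ`.

```python
from collections import defaultdict


def compute_dominators(cfg, entry):
    """Iterative dataflow: dom[b] = {b} | intersection of dom[p] over preds p."""
    nodes = set(cfg)
    preds = defaultdict(set)
    for bb, succs in cfg.items():
        for succ in succs:
            preds[succ].add(bb)

    dom = {bb: set(nodes) for bb in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for bb in nodes - {entry}:
            incoming = [dom[p] for p in preds[bb]]
            new = {bb} | (set.intersection(*incoming) if incoming else set())
            if new != dom[bb]:
                dom[bb] = new
                changed = True
    return dom


def find_loops(cfg, entry):
    """Map each loop header to the set of its latches (back-edge sources)."""
    dom = compute_dominators(cfg, entry)
    latches = defaultdict(set)
    for bb, succs in cfg.items():
        for succ in succs:
            if succ in dom[bb]:  # succ dominates bb, so bb -> succ is a back edge
                latches[succ].add(bb)
    return dict(latches)


# Toy CFG (hypothetical): one loop whose header is reached from two latches.
cfg = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["then", "else"],
    "then": ["header"],
    "else": ["header"],
    "exit": [],
}
loops = find_loops(cfg, "entry")
```

  On this CFG, `loops` maps `header` to both `then` and `else`, which is the multiple-latch case from the table above.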
                                                                                                                                         
  Invariant Detection                                                                                                                    
  ┌───────────────────────────────────────────────────────────┬───────────────────────────────────────────────────────────┐              
  │                      Current Branch                       │                        Old Branch                         │              
  ├───────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────┤              
  │ Requires read_effects == EMPTY AND write_effects == EMPTY │ Only checks (read_effects & loop_write_effects) != EMPTY  │              
  ├───────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────┤              
  │ Uses domination check for safety                          │ No domination check                                       │              
  ├───────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────┤              
  │ Fixed-point iteration to cascade dependencies             │ Per-instruction scan with separate _assign_dependencies() │              
  └───────────────────────────────────────────────────────────┴───────────────────────────────────────────────────────────┘              
  Verdict: Current branch is safer. Old branch's looser effect check could hoist memory reads unsafely.                                  
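
  The fixed-point idea can be sketched as follows, under a deliberately simplified instruction model. `Inst` and `find_invariants` are hypothetical names for this illustration, not the Venom classes. An instruction is marked invariant only if it has no read or write effects, is not a phi, and each operand is either defined outside the loop or already marked invariant; the scan repeats until nothing changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    output: str
    opcode: str
    operands: tuple
    reads: frozenset = frozenset()   # effects read (empty means none)
    writes: frozenset = frozenset()  # effects written (empty means none)


def find_invariants(loop_insts):
    """Return the outputs of loop instructions that are loop-invariant."""
    defined_in_loop = {inst.output for inst in loop_insts}
    invariant = set()
    changed = True
    while changed:  # repeat so that dependent instructions cascade
        changed = False
        for inst in loop_insts:
            if inst.output in invariant:
                continue
            if inst.opcode == "phi" or inst.reads or inst.writes:
                continue  # strict rule: any effect disqualifies
            if all(op not in defined_in_loop or op in invariant
                   for op in inst.operands):
                invariant.add(inst.output)
                changed = True
    return invariant


# %a and %b are assumed to be defined before the loop.
loop_insts = [
    Inst("%i", "phi", ("%i0", "%i_next")),
    Inst("%d", "mul", ("%c", "%a")),   # becomes invariant on the second sweep
    Inst("%c", "add", ("%a", "%b")),   # invariant on the first sweep
    Inst("%m", "mload", ("%a",), reads=frozenset({"memory"})),
    Inst("%i_next", "add", ("%i", "%c")),
]
invariants = find_invariants(loop_insts)
```

  Here `%d` is listed before `%c` on purpose: the first sweep only finds `%c`, and the second sweep picks up `%d`. The `mload` is rejected by the strict effect rule even though nothing in this toy loop writes memory.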
                                                                                                                                         
  Safety Checks                                                                                                                          
  ┌───────────────────────────────┬─────────────────────────────┬───────────────────────────────────────┐                                
  │             Check             │           Current           │                  Old                  │                                
  ├───────────────────────────────┼─────────────────────────────┼───────────────────────────────────────┤                                
  │ Dominates all latches         │ ✅ Yes                      │ ❌ No                                 │                                
  ├───────────────────────────────┼─────────────────────────────┼───────────────────────────────────────┤                                
  │ Pure computation (no effects) │ ✅ Strict (both read/write) │ ⚠️ Partial (only read vs loop writes) │                                
  ├───────────────────────────────┼─────────────────────────────┼───────────────────────────────────────┤                                
  │ Handles conditional paths     │ ✅ Via domination           │ ❌ May hoist incorrectly              │                                
  ├───────────────────────────────┼─────────────────────────────┼───────────────────────────────────────┤                                
  │ Handles nested loops          │ ✅ Inner-first ordering     │ ⚠️ Dict iteration order               │                                
  └───────────────────────────────┴─────────────────────────────┴───────────────────────────────────────┘                                
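
  A small sketch of the "dominates all latches" check, assuming dominator sets are already available. The `dom` dict below is hand-written for a toy loop and `dominates_all_latches` is a hypothetical helper. A block inside a conditional arm does not dominate the latch, so hoisting from it would run code that the original loop might skip on every iteration.

```python
def dominates_all_latches(dom, block, latches):
    """True if `block` is a dominator of every latch in `latches`."""
    return all(block in dom[latch] for latch in latches)


# Toy loop (hypothetical): header branches to `then` or `else`,
# both arms join at `latch`, which jumps back to header.
# dom[b] is the set of blocks that dominate b.
dom = {
    "header": {"entry", "header"},
    "then": {"entry", "header", "then"},
    "else": {"entry", "header", "else"},
    "latch": {"entry", "header", "latch"},
}
latches = {"latch"}
hoistable_from = {bb: dominates_all_latches(dom, bb, latches) for bb in dom}
```

  In this example only `header` and `latch` pass the check; `then` and `else` are rejected because they are conditional paths.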
  ---                                                                                                                                    
  Unique Features                                                                                                                        
                                                                                                                                         
  Current Branch Only                                                                                                                    
                                                                                                                                         
  1. LoopInfo dataclass - Clean encapsulation of loop metadata                                                                           
  2. Preheader validation - Requires exactly one non-loop predecessor                                                                    
  3. Latch domination check - Prevents unsafe hoisting from conditional paths                                                            
  4. Inner-first loop ordering - Enables cascading hoists through nested loops                                                           
  5. DFG invalidation - Correct analysis cache management                                                                                
                                                                                                                                         
  Old Branch Only                                                                                                                        
                                                                                                                                         
  1. Separate NaturalLoopDetectionAnalysis - Reusable analysis module                                                                    
  2. _assign_dependencies() - Special handling for assign chains                                                                         
  3. returndatasize exclusion - Specific volatile instruction handling                                                                   
  4. Parametric tests - @pytest.mark.parametrize for depth/count combinations                                                            
                                                                                                                                         
  ---                                                                                                                                    
  Test Coverage                                                                                                                          
  ┌──────────────────────────────────────┬──────────────────────────────────────────────────┐                                            
  │            Current Branch            │                    Old Branch                    │                                            
  ├──────────────────────────────────────┼──────────────────────────────────────────────────┤                                            
  │ 3 focused unit tests                 │ 4 parametric tests (3×3 = 9 cases each)          │                                            
  ├──────────────────────────────────────┼──────────────────────────────────────────────────┤                                            
  │ Uses PrePostChecker helper           │ Uses assert_ctx_eq                               │                                            
  ├──────────────────────────────────────┼──────────────────────────────────────────────────┤                                            
  │ Tests: simple, variant, branch-local │ Tests: detection, simple, dependent, unhoistable │                                            
  ├──────────────────────────────────────┼──────────────────────────────────────────────────┤                                            
  │ + codesize regression test           │ No codesize test                                 │                                            
  └──────────────────────────────────────┴──────────────────────────────────────────────────┘                                            
  Verdict: Old branch has better test coverage via parametric testing across nested loop depths.                                         
                                                                                                                                         
  ---                                                                                                                                    
  Recommendations                                                                                                                        
                                                                                                                                         
  1. Base Implementation: Current Branch (feat/venom/loop_invariant)                                                                     
                                                                                                                                         
  The current branch is the better foundation because:                                                                                   
  - Correct dominator-based loop detection                                                                                               
  - Strict safety checks (both read/write effects must be empty)                                                                         
  - Proper latch domination check prevents unsafe hoisting                                                                               
  - Already integrated into O2/O3/Os pipeline                                                                                            
  - Cleaner code structure with LoopInfo dataclass                                                                                       
                                                                                                                                         
  2. Features to Merge from Old Branch                                                                                                   
                                                                                                                                         
  1. Separate NaturalLoopDetectionAnalysis                                                                                               
    - Extract loop detection into reusable analysis module                                                                               
    - Other passes (loop unrolling, strength reduction) could benefit                                                                    
    - Use current branch's dominator-based algorithm, not old DFS approach                                                               
  2. Parametric Test Structure                                                                                                           
    - Adopt @pytest.mark.parametrize(depth, count) pattern                                                                               
    - Much better coverage for nested loops                                                                                              
    - Add to current test file                                                                                                           
  3. returndatasize Exclusion (investigate)                                                                                              
    - Old branch explicitly excludes this opcode                                                                                         
    - Current branch relies on is_volatile - verify this covers it                                                                       
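
  As a rough illustration of the depth × count grid, the sketch below enumerates the cases with the standard library only. In a real test file the two tuples would instead be passed to two stacked `@pytest.mark.parametrize` decorators, which generate the same Cartesian product. Everything here is an assumption for illustration: `nested_loop_cfg` builds a toy dict-based CFG, not Venom IR; the values in `DEPTHS` and `COUNTS` are placeholders; and `count` (presumably the number of hoistable instructions per loop in a real test) is not used by the toy builder.

```python
import itertools


def nested_loop_cfg(depth):
    """Toy CFG with `depth` nested loops, as {block: [successors]}."""
    cfg = {"entry": ["h1"]}
    for k in range(1, depth + 1):
        inner = f"h{k + 1}" if k < depth else "body"
        cfg[f"h{k}"] = [inner, f"exit{k}"]
        # Leaving loop k jumps back to the header of the enclosing loop k-1.
        cfg[f"exit{k}"] = [f"h{k - 1}"] if k > 1 else []
    cfg["body"] = [f"h{depth}"]
    return cfg


def count_back_edges(cfg, entry):
    """DFS: an edge to a block that is still on the DFS stack is a back edge."""
    on_stack, visited = set(), set()
    count = 0

    def visit(bb):
        nonlocal count
        visited.add(bb)
        on_stack.add(bb)
        for succ in cfg[bb]:
            if succ in on_stack:
                count += 1
            elif succ not in visited:
                visit(succ)
        on_stack.discard(bb)

    visit(entry)
    return count


DEPTHS = (1, 2, 3)  # placeholder values
COUNTS = (1, 2, 3)  # placeholder values
cases = list(itertools.product(DEPTHS, COUNTS))
results = {(d, c): count_back_edges(nested_loop_cfg(d), "entry") for d, c in cases}
```

  The grid has 3 × 3 = 9 cases, and each generated CFG contains exactly `depth` back edges, one per nested loop.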
                                                                                                                                         
  3. Priority Order for Merging                                                                                                          
                                                                                                                                         
  1. High: Parametric tests from old branch → better coverage                                                                            
  2. Medium: Extract loop detection to separate analysis (architectural cleanup)                                                         
  3. Low: _assign_dependencies handling (current fixed-point handles this implicitly)                                                    
                                                                                                                                         
  ---                                                                                                                                    
  Summary                                                                                                                                
                                                                                                                                         
  Winner: Current branch (feat/venom/loop_invariant)                                                                                     
                                                                                                                                         
  The current implementation is more correct and safer. The old branch has some good ideas (reusable analysis, parametric tests) but has 
  algorithmic issues (no domination check, loose effect checking).                                                                       
                                                                                                                                         
  Merge the test patterns from the old branch into the current implementation, and consider extracting loop detection into a separate    
  analysis module for reuse.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 31, 2026

Codecov Report

❌ Patch coverage is 93.42105% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.99%. Comparing base (a7f7bf1) to head (e254472).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vyper/venom/passes/loop_invariant_code_motion.py | 93.37% | 5 Missing and 5 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4819      +/-   ##
==========================================
+ Coverage   91.97%   91.99%   +0.01%     
==========================================
  Files         187      188       +1     
  Lines       27241    27393     +152     
  Branches     4784     4828      +44     
==========================================
+ Hits        25055    25199     +144     
- Misses       1464     1467       +3     
- Partials      722      727       +5     
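
The percentages in the report can be reproduced from the hit and line counts in the diff. The sketch below makes two assumptions that the report does not state: that the patch covers 152 lines (the `Lines` delta above), and that Codecov truncates rather than rounds to two decimals. Both are consistent with the reported figures.

```python
import math


def coverage_pct(hits, lines):
    """Coverage as a percentage of covered lines."""
    return 100 * hits / lines


def truncate2(value):
    """Truncate (not round) to two decimal places."""
    return math.floor(value * 100) / 100


base = coverage_pct(25055, 27241)      # master
head = coverage_pct(25199, 27393)      # this PR
patch = coverage_pct(152 - 10, 152)    # assumed 152 patch lines, 10 not fully covered

base_reported = truncate2(base)
head_reported = truncate2(head)
patch_reported = f"{patch:.5f}"
```

Under those assumptions the values come out as 91.97, 91.99 and 93.42105, matching the report.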


@github-actions

📊 Bytecode Size Changes (venom)

| Contract | legacy-O2 | legacy-Os | -O2 | -O3 | -Os |
|---|---:|---:|---:|---:|---:|
| curvefi/amm/stableswap/meta_implementation/meta_implementation_v_700.vy | 23610 | 22805 | 21074 (🔴+27) | 20051 (🔴+27) | 19600 (🔴+27) |
| curvefi/legacy/CurveStableSwapMetaNG.vy | 24952 | 23578 | 20899 (🔴+100) | 20280 (🔴+100) | 19581 (🔴+100) |
| curvefi/amm/stableswap/implementation/implementation_v_700.vy | 24962 | 23769 | 20598 (🔴+56) | 19736 (🔴+56) | 19176 (🔴+56) |
| curvefi/legacy/CurveStableSwapNG.vy | 24473 | 23298 | 19973 (🔴+44) | 19138 (🔴+44) | 18607 (🔴+44) |
| curvefi/amm/tricryptoswap/implementation/implementation_v_200.vy | 20590 | 19825 | 18318 (🔴+23) | 17792 (🔴+23) | 17231 (🔴+23) |
| curvefi/legacy/CurveCryptoSwap2.vy | 18947 | 18382 | 17220 (🔴+1573) | 16681 (🔴+1573) | 16294 (🔴+1573) |
| yearnfi/VaultV3.vy | 19972 | 19063 | 17109 (🔴+185) | 15200 (🔴+185) | 14536 (🔴+185) |
| curvefi/amm/stableswap/factory/factory_v_100.vy | 14558 | 13978 | 13522 (🔴+24) | 12126 (🔴+24) | 12068 (🔴+24) |
| curvefi/amm/stableswap/views/views_v_120.vy | 12784 | 12368 | 10574 (🔴+19) | 9941 (🔴+19) | 10073 (🔴+19) |
| curvefi/amm/tricryptoswap/math/math_v_200.vy | 11055 | 10992 | 10506 (🔴+997) | 9146 (🔴+997) | 9382 (🔴+997) |
| curvefi/legacy/CurveCryptoMathOptimized3.vy | 11054 | 10991 | 10505 (🔴+997) | 9146 (🔴+997) | 9382 (🔴+997) |
| curvefi/gauge/child_gauge/implementation/implementation_v_110.vy | 12338 | 11561 | 10482 (🔴+11) | 9988 (🔴+11) | 9391 (🔴+11) |
| curvefi/gauge/child_gauge/implementation/implementation_v_100.vy | 12017 | 11249 | 10188 (🔴+11) | 9693 (🔴+11) | 9107 (🔴+11) |
| curvefi/gauge/child_gauge/implementation/implementation_v_020.vy | 10665 | 9947 | 9224 (🔴+11) | 8730 (🔴+11) | 8206 (🔴+11) |
| curvefi/amm/twocryptoswap/views/views_v_200.vy | 6991 | 6946 | 7135 (🔴+625) | 6860 (🔴+625) | 6919 (🔴+625) |
| curvefi/amm/tricryptoswap/views/views_v_200.vy | 7821 | 7776 | 7064 (🔴+10) | 6780 (🔴+10) | 6848 (🔴+10) |
| curvefi/registries/metaregistry/metaregistry_v_110.vy | 7590 | 6732 | 7055 (🔴+27) | 6198 (🔴+27) | 6098 (🔴+27) |
| curvefi/helpers/stable_swap_meta_zap/stable_swap_meta_zap_v_100.vy | 7302 | 7067 | 6788 (🔴+12) | 6392 (🔴+12) | 6490 (🔴+12) |
| curvefi/amm/twocryptoswap/math/math_v_210.vy | 6666 | 6666 | 6327 (🔴+763) | 5843 (🔴+763) | 5848 (🔴+763) |
| curvefi/registries/metaregistry/registry_handlers/stableswap/handler_v_110.vy | 6633 | 6259 | 6295 (🔴+18) | 5460 (🔴+18) | 5935 (🔴+18) |
| curvefi/amm/tricryptoswap/factory/factory_v_200.vy | 5246 | 5021 | 5872 (🔴+4) | 4903 (🔴+4) | 4976 (🔴+4) |
| curvefi/amm/twocryptoswap/factory/factory_v_200.vy | 5540 | 5252 | 5823 (🔴+4) | 4591 (🔴+4) | 4714 (🔴+4) |
| curvefi/gauge/child_gauge/factory/factory_v_201.vy | 4844 | 4547 | 4286 (🔴+2) | 4064 (🔴+2) | 3827 (🔴+2) |
| curvefi/gauge/child_gauge/factory/factory_v_100.vy | 4183 | 3914 | 3713 (🔴+2) | 3477 (🔴+2) | 3259 (🔴+2) |
| curvefi/helpers/rate_provider/rate_provider_v_101.vy | 3260 | 3260 | 3537 (🔴+658) | 3165 (🔴+658) | 3178 (🔴+658) |
| curvefi/registries/address_provider/address_provider_v_201.vy | 2973 | 2782 | 2785 (🟢-1) | 2611 (🟢-1) | 2463 (🟢-1) |
| curvefi/helpers/rate_provider/rate_provider_v_100.vy | 2847 | 2841 | 2446 (🔴+1) | 2109 (🔴+1) | 2117 (🔴+1) |
| curvefi/helpers/deposit_and_stake_zap/deposit_and_stake_zap_v_100.vy | 2322 | 2316 | 2178 (🔴+2) | 1915 (🔴+2) | 1946 (🔴+2) |
| curvefi/governance/relayer/taiko/relayer_v_001.vy | 2068 | 2064 | 1712 (🔴+14) | 1567 (🔴+14) | 1597 (🔴+14) |
| curvefi/governance/relayer/polygon_cdk/relayer_v_101.vy | 1556 | 1523 | 1504 (🔴+1) | 1382 (🔴+1) | 1380 (🔴+1) |
| curvefi/governance/relayer/arb_orbit/relayer_v_101.vy | 1266 | 1262 | 1209 (🔴+1) | 1113 (🔴+1) | 1141 (🔴+1) |
| curvefi/governance/relayer/op_stack/relayer_v_101.vy | 1186 | 1182 | 1142 (🔴+1) | 1049 (🔴+1) | 1074 (🔴+1) |
| curvefi/governance/relayer/not_rollup/relayer_v_100.vy | 1168 | 1153 | 1131 (🔴+1) | 1044 (🔴+1) | 1053 (🔴+1) |
| curvefi/governance/agent/agent_v_100.vy | 541 | 541 | 496 (🔴+5) | 438 (🔴+5) | 442 (🔴+5) |
| curvefi/governance/agent/agent_v_101.vy | 541 | 541 | 496 (🔴+5) | 438 (🔴+5) | 442 (🔴+5) |

Full bytecode sizes

| Contract | legacy-O2 | legacy-Os | -O2 | -O3 | -Os |
|---|---:|---:|---:|---:|---:|
| curvefi/amm/stableswap/meta_implementation/meta_implementation_v_700.vy | 23610 | 22805 | 21074 | 20051 | 19600 |
| curvefi/legacy/CurveStableSwapMetaNG.vy | 24952 | 23578 | 20899 | 20280 | 19581 |
| curvefi/amm/stableswap/implementation/implementation_v_700.vy | 24962 | 23769 | 20598 | 19736 | 19176 |
| curvefi/legacy/CurveStableSwapNG.vy | 24473 | 23298 | 19973 | 19138 | 18607 |
| curvefi/amm/tricryptoswap/implementation/implementation_v_200.vy | 20590 | 19825 | 18318 | 17792 | 17231 |
| curvefi/legacy/CurveCryptoSwap2.vy | 18947 | 18382 | 17220 | 16681 | 16294 |
| yearnfi/VaultV3.vy | 19972 | 19063 | 17109 | 15200 | 14536 |
| curvefi/amm/twocryptoswap/implementation/implementation_v_210.vy | 18090 | 17350 | 16513 | 15869 | 15382 |
| yearnfi/VaultV2.vy | 16676 | 15763 | 14201 | 13714 | 12879 |
| curvefi/amm/stableswap/factory/factory_v_100.vy | 14558 | 13978 | 13522 | 12126 | 12068 |
| curvefi/amm/stableswap/views/views_v_120.vy | 12784 | 12368 | 10574 | 9941 | 10073 |
| curvefi/amm/tricryptoswap/math/math_v_200.vy | 11055 | 10992 | 10506 | 9146 | 9382 |
| curvefi/legacy/CurveCryptoMathOptimized3.vy | 11054 | 10991 | 10505 | 9146 | 9382 |
| curvefi/gauge/child_gauge/implementation/implementation_v_110.vy | 12338 | 11561 | 10482 | 9988 | 9391 |
| curvefi/gauge/child_gauge/implementation/implementation_v_100.vy | 12017 | 11249 | 10188 | 9693 | 9107 |
| curvefi/gauge/child_gauge/implementation/implementation_v_020.vy | 10665 | 9947 | 9224 | 8730 | 8206 |
| curvefi/amm/twocryptoswap/views/views_v_200.vy | 6991 | 6946 | 7135 | 6860 | 6919 |
| curvefi/helpers/router/router_v_110.vy | 6717 | 6717 | 7124 | 6482 | 6698 |
| curvefi/amm/tricryptoswap/views/views_v_200.vy | 7821 | 7776 | 7064 | 6780 | 6848 |
| curvefi/registries/metaregistry/metaregistry_v_110.vy | 7590 | 6732 | 7055 | 6198 | 6098 |
| curvefi/helpers/stable_swap_meta_zap/stable_swap_meta_zap_v_100.vy | 7302 | 7067 | 6788 | 6392 | 6490 |
| curvefi/amm/twocryptoswap/math/math_v_210.vy | 6666 | 6666 | 6327 | 5843 | 5848 |
| curvefi/registries/metaregistry/registry_handlers/stableswap/handler_v_110.vy | 6633 | 6259 | 6295 | 5460 | 5935 |
| curvefi/amm/tricryptoswap/factory/factory_v_200.vy | 5246 | 5021 | 5872 | 4903 | 4976 |
| curvefi/amm/twocryptoswap/factory/factory_v_200.vy | 5540 | 5252 | 5823 | 4591 | 4714 |
| curvefi/gauge/child_gauge/factory/factory_v_201.vy | 4844 | 4547 | 4286 | 4064 | 3827 |
| curvefi/registries/metaregistry/registry_handlers/tricryptoswap/handler_v_110.vy | 4241 | 3939 | 4083 | 3754 | 3759 |
| curvefi/registries/metaregistry/registry_handlers/twocryptoswap/handler_v_110.vy | 4186 | 3884 | 4014 | 3616 | 3637 |
| yearnfi/VaultFactory.vy | 3765 | 3617 | 3856 | 2431 | 2835 |
| curvefi/gauge/child_gauge/factory/factory_v_100.vy | 4183 | 3914 | 3713 | 3477 | 3259 |
| curvefi/helpers/rate_provider/rate_provider_v_101.vy | 3260 | 3260 | 3537 | 3165 | 3178 |
| curvefi/registries/address_provider/address_provider_v_201.vy | 2973 | 2782 | 2785 | 2611 | 2463 |
| curvefi/amm/stableswap/math/math_v_100.vy | 3067 | 3046 | 2641 | 2416 | 2433 |
| curvefi/helpers/rate_provider/rate_provider_v_100.vy | 2847 | 2841 | 2446 | 2109 | 2117 |
| curvefi/helpers/deposit_and_stake_zap/deposit_and_stake_zap_v_100.vy | 2322 | 2316 | 2178 | 1915 | 1946 |
| curvefi/governance/relayer/taiko/relayer_v_001.vy | 2068 | 2064 | 1712 | 1567 | 1597 |
| curvefi/governance/relayer/polygon_cdk/relayer_v_101.vy | 1556 | 1523 | 1504 | 1382 | 1380 |
| curvefi/governance/relayer/arb_orbit/relayer_v_101.vy | 1266 | 1262 | 1209 | 1113 | 1141 |
| curvefi/governance/relayer/op_stack/relayer_v_101.vy | 1186 | 1182 | 1142 | 1049 | 1074 |
| curvefi/governance/relayer/not_rollup/relayer_v_100.vy | 1168 | 1153 | 1131 | 1044 | 1053 |
| curvefi/governance/vault/vault_v_100.vy | 964 | 941 | 948 | 899 | 894 |
| curvefi/governance/agent/agent_v_100.vy | 541 | 541 | 496 | 438 | 442 |
| curvefi/governance/agent/agent_v_101.vy | 541 | 541 | 496 | 438 | 442 |
| curvefi/governance/relayer/relayer_v_100.vy | 496 | 496 | 468 | 463 | 468 |

frederikgramkortegaard pushed a commit to frederikgramkortegaard/vyper that referenced this pull request Mar 31, 2026
Don't create preheaders in LICM - skip loops without valid preheaders
instead. This avoids CFG modification complexity that caused malformed
phi operands when processing multiple loops with stale cached analyses.

The approach now matches PR vyperlang#4819's simpler design: loops must already
have a valid preheader (single outside predecessor whose only successor
is the loop header) for LICM to apply.