Implement generic dataflow analysis framework with gen-kill pattern and SetLattice#23
Conversation
- Created GenKillAnalysis abstract class for gen-kill pattern - Refactored ReachingDefinitionsAnalysis to use gen-kill framework - Refactored LiveVariablesAnalysis to use gen-kill framework - Implemented ConstantPropagationAnalysis with worklist solver - Added comprehensive tests for constant propagation - All existing tests still pass Co-authored-by: jpksh90 <8288548+jpksh90@users.noreply.github.com>
| */ | ||
| abstract class GenKillAnalysis<T>( | ||
| override val direction: Direction, | ||
| ) : DataflowAnalysis<T>() { |
There was a problem hiding this comment.
Create a lattice SetLattice and parameterize over SetLattice
There was a problem hiding this comment.
Created SetLattice class with difference, union, and meet operations. GenKillAnalysis now parameterized with SetLattice instead of abstract methods. (commit fd61463)
| class ReachingDefinitionsAnalysis : DataflowAnalysis<Set<String>>() { | ||
| override val direction = Direction.FORWARD | ||
|
|
||
| class ReachingDefinitionsAnalysis : GenKillAnalysis<Set<String>>(Direction.FORWARD) { |
There was a problem hiding this comment.
Updated ReachingDefinitionsAnalysis and LiveVariablesAnalysis to use SetLattice parameterization. Removed redundant method implementations. (commit fd61463)
|
@copilot Fix the comments. |
- Created SetLattice class with difference, union, and meet operations - Parameterized GenKillAnalysis with SetLattice<T> instead of abstract methods - Updated ReachingDefinitionsAnalysis to use SetLattice<String> - Updated LiveVariablesAnalysis to use SetLattice<String> - Removed redundant difference/union/meet method implementations - All tests pass successfully Co-authored-by: jpksh90 <8288548+jpksh90@users.noreply.github.com>
|
There was a problem hiding this comment.
Pull request overview
This PR implements a generic dataflow analysis framework with a gen-kill pattern and introduces constant propagation analysis. The framework includes a reusable SetLattice class and GenKillAnalysis base class that simplifies implementing set-based dataflow analyses. The existing reaching definitions and live variables analyses are refactored to use this framework, and a new constant propagation analysis is added.
Key changes:
- Generic gen-kill framework that abstracts the common
OUT = (IN - kill) ∪ genpattern for both forward and backward analyses - SetLattice utility class providing reusable set operations (difference, union, meet)
- Constant propagation analysis with a three-level lattice (Top, Constant, Bottom) supporting arithmetic, comparison, and logical operators
- Refactored reaching definitions and live variables analyses to extend the gen-kill framework
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/main/kotlin/slang/hlir/DataflowAnalysis.kt | Adds SetLattice, GenKillAnalysis framework, refactors ReachingDefinitionsAnalysis and LiveVariablesAnalysis to use gen-kill pattern, implements ConstantPropagationAnalysis with lattice-based value tracking |
| src/test/kotlin/DataflowAnalysisTest.kt | Adds 7 tests for constant propagation covering arithmetic operations, reassignment, unknown variables, and branching scenarios; includes 2 smoke tests for refactored gen-kill framework analyses |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| override fun gen(block: BasicBlock): Set<String> { | ||
| // Gen = use (variables used in the block) | ||
| val use = mutableSetOf<String>() | ||
| val def = mutableSetOf<String>() | ||
|
|
||
| // Process each statement in the block (in reverse for backward analysis) | ||
| // Process in reverse for backward analysis | ||
| for (stmt in block.stmts.reversed()) { | ||
| when (stmt) { | ||
| is Stmt.LetStmt -> { | ||
| def.add(stmt.name) | ||
| use.addAll(getUsedVariables(stmt.expr)) | ||
| } | ||
| is Stmt.AssignStmt -> { | ||
| when (val lhs = stmt.lhs) { | ||
| is Expr.VarExpr -> { | ||
| def.add(lhs.name) | ||
| } | ||
| else -> { | ||
| // For complex assignments, we don't track them yet | ||
| } | ||
| } | ||
| use.addAll(getUsedVariables(stmt.expr)) | ||
| } | ||
| is Stmt.LetStmt -> use.addAll(getUsedVariables(stmt.expr)) | ||
| is Stmt.AssignStmt -> use.addAll(getUsedVariables(stmt.expr)) | ||
| is Stmt.PrintStmt -> { | ||
| for (arg in stmt.args) { | ||
| use.addAll(getUsedVariables(arg)) | ||
| } | ||
| } | ||
| is Stmt.ExprStmt -> { | ||
| use.addAll(getUsedVariables(stmt.expr)) | ||
| } | ||
| is Stmt.ReturnStmt -> { | ||
| use.addAll(getUsedVariables(stmt.expr)) | ||
| } | ||
| is Stmt.IfStmt -> { | ||
| use.addAll(getUsedVariables(stmt.condition)) | ||
| } | ||
| is Stmt.WhileStmt -> { | ||
| use.addAll(getUsedVariables(stmt.condition)) | ||
| } | ||
| else -> { | ||
| // Other statements don't use/define variables | ||
| } | ||
| is Stmt.ExprStmt -> use.addAll(getUsedVariables(stmt.expr)) | ||
| is Stmt.ReturnStmt -> use.addAll(getUsedVariables(stmt.expr)) | ||
| is Stmt.IfStmt -> use.addAll(getUsedVariables(stmt.condition)) | ||
| is Stmt.WhileStmt -> use.addAll(getUsedVariables(stmt.condition)) | ||
| else -> {} | ||
| } | ||
| } | ||
| return use | ||
| } | ||
|
|
||
| // IN = (OUT - def) ∪ use | ||
| return (input - def) + use | ||
| override fun kill(block: BasicBlock): Set<String> { | ||
| // Kill = def (variables defined in the block) | ||
| val def = mutableSetOf<String>() | ||
| for (stmt in block.stmts.reversed()) { | ||
| when (stmt) { | ||
| is Stmt.LetStmt -> def.add(stmt.name) | ||
| is Stmt.AssignStmt -> { | ||
| when (val lhs = stmt.lhs) { | ||
| is Expr.VarExpr -> def.add(lhs.name) | ||
| else -> {} | ||
| } | ||
| } | ||
| else -> {} | ||
| } | ||
| } | ||
| return def | ||
| } |
There was a problem hiding this comment.
The gen() and kill() methods for live variables analysis don't correctly handle intra-block dataflow. For backward analysis, when computing use-def sets within a block, you need to process statements in reverse order and track which variables are used before being defined.
Current implementation: Both methods iterate in reverse and independently collect all uses and defs.
Problem: Consider x = 1; y = x; - In the second statement, x is used, but it was just defined in the first statement. When processing backwards, we should recognize that x is NOT live before the first statement (it's defined before use within the block).
The correct approach is to process statements in reverse order once, maintaining a running set of live variables, adding to USE when a variable is used (and not already in DEF), and adding to DEF when defined.
Suggested fix:
override fun gen(block: BasicBlock): Set<String> {
val use = mutableSetOf<String>()
val def = mutableSetOf<String>()
for (stmt in block.stmts.reversed()) {
// First collect uses (before they might be killed by defs in this stmt)
when (stmt) {
is Stmt.LetStmt -> {
use.addAll(getUsedVariables(stmt.expr) - def)
def.add(stmt.name)
}
is Stmt.AssignStmt -> {
use.addAll(getUsedVariables(stmt.expr) - def)
when (val lhs = stmt.lhs) {
is Expr.VarExpr -> def.add(lhs.name)
else -> {}
}
}
// ... handle other statement types
}
}
return use
}| for (varName in allVars) { | ||
| // Meet of constant values: if all have same constant, use it; otherwise BOTTOM | ||
| val varValues = values.mapNotNull { it[varName] } | ||
| if (varValues.isEmpty()) { | ||
| // Variable not in any predecessor | ||
| continue | ||
| } | ||
|
|
||
| // Start with the first value | ||
| var meetValue: ConstantValue = varValues[0] | ||
| for (i in 1 until varValues.size) { | ||
| meetValue = meetConstantValues(meetValue, varValues[i]) | ||
| } | ||
| result[varName] = meetValue | ||
| } | ||
|
|
||
| return result |
There was a problem hiding this comment.
The meet operation doesn't correctly handle variables that exist in some predecessor flows but not others.
When varValues.size < values.size, it means some predecessors have information about the variable while others don't. The correct interpretation is that missing values should be treated as ⊤ (no information yet, not defined on that path).
However, the current code doesn't distinguish between:
- Variable not defined on a path (should be ⊤)
- Variable is defined but we have no information (also ⊤, but different semantically)
For constant propagation, if a variable is defined on one path but not on another, after the merge the variable should either be absent from the map (if treating undefined variables as ⊤), or be explicitly set based on meeting the defined value with ⊤.
Current behavior: If x=10 on one path and x is not in the map on another path, the meet ignores the missing value and returns x=10, which is incorrect.
Suggested fix: Check if varValues.size < values.size, and if so, the variable is not consistently defined across all paths, so treat it appropriately (likely set to ⊤ or exclude from result depending on semantics).
| * OUT = (IN - kill) ∪ gen (for forward analysis) | ||
| * IN = (OUT - kill) ∪ gen (for backward analysis) | ||
| * |
There was a problem hiding this comment.
The documentation comment is misleading about how the gen-kill pattern handles different analysis directions.
The comment states:
OUT = (IN - kill) ∪ gen (for forward analysis)
IN = (OUT - kill) ∪ gen (for backward analysis)
But the transfer() function only implements one formula at line 284. The framework works because the input parameter semantically represents different things based on direction:
- For forward analysis:
inputis IN, and we compute OUT - For backward analysis:
inputis OUT, and we compute IN
The documentation should clarify that the same formula applies in both cases, with input and the return value having different semantic meanings based on the analysis direction. The direction-specific behavior is handled by the worklist solver, not the transfer function.
Suggested documentation update:
/**
* Generic gen-kill framework for dataflow analysis
* This framework implements the standard gen-kill pattern where:
* output = (input - kill) ∪ gen
*
* For forward analysis: input=IN, output=OUT
* For backward analysis: input=OUT, output=IN
*
* The worklist solver handles the direction-specific logic.
*/| * OUT = (IN - kill) ∪ gen (for forward analysis) | |
| * IN = (OUT - kill) ∪ gen (for backward analysis) | |
| * | |
| * output = (input - kill) ∪ gen | |
| * | |
| * For forward analysis: input=IN, output=OUT | |
| * For backward analysis: input=OUT, output=IN | |
| * | |
| * The worklist solver handles the direction-specific logic. |



Adds a generic gen-kill framework for dataflow analysis and implements constant propagation. The existing reaching definitions and live variables analyses are refactored to use the framework.
Changes
SetLattice
SetLattice<E>: Reusable lattice class providing set operations for gen-kill analysesdifference(),union(), andmeet()operationsGeneric Gen-Kill Framework
GenKillAnalysis<T>: Abstract class implementing the gen-kill patternOUT = (IN - kill) ∪ genT(notSet<T>)SetLattice<T>as constructor parametergen()andkill()methodsRefactored Analyses
ReachingDefinitionsAnalysis: Now extendsGenKillAnalysis<String>withSetLattice<String>LiveVariablesAnalysis: Now extendsGenKillAnalysis<String>withSetLattice<String>WorklistSolverand maintain identical semanticsConstant Propagation
ConstantPropagationAnalysis: Tracks constant values using lattice{⊤, ⊥, Constant(Double)}Constant(c1) ⊓ Constant(c2) = c1 == c2 ? c1 : ⊥Example Usage
Testing
Added 7 tests for constant propagation covering arithmetic, branching, and edge cases. All 18 dataflow tests pass.
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.