Skip to content

Commit 96056d9

Browse files
committed
Report distinct variable values observed in TLC coverage statistics.
These statistics show the number of unique values each variable takes during model checking. An unusually high number of values for a particular variable may suggest that the model is not properly constrained, potentially leading to state space explosion during exhaustive analysis. Note I: The data structure used to estimate these counts is probabilistic—specifically, HyperLogLog—which helps minimize memory usage. As a result, the reported counts may have a small margin of error. Additionally, the use of this structure introduces contention among workers, which can negatively affect performance and scalability. However, empirical measurements (see tlaplus#1183 (comment)) have shown that the performance overhead of variable statistics collection on top of action and ordinary coverage is negligible. Note II: The `TLC!TLCGet("spec")` named register equals the same data and serves as a more appropriate and structured input for extracting and parsing these numeric values during subsequent processing stages: ```tla ---- MODULE Spec ---- EXTENDS TLC, Json ... MyStats == PrintT( ToJson( \* Alternatively, see CSV!CSVWrite operator. { [name |-> v.name, count |-> v.coverage.distinct] : v \in TLCGet("spec").variables } ) ) ==== ---- CONFIG Spec ---- ... _PERIODIC MyStats POSTCONDITION MyStats ==== ``` Variable statistics can be enabled independently of the `-coverage someTime` option by setting the Java system property `-Dtlc2.TLCGlobals.coverage=2` when running TLC. To activate both action and variable statistics, use `-Dtlc2.TLCGlobals.coverage=3`. [Feature][TLC] Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
1 parent 96778df commit 96056d9

4 files changed

Lines changed: 49 additions & 19 deletions

File tree

docs/module-coverage-statistics.md

Lines changed: 10 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -33,10 +33,11 @@ Spec ==
3333
=====
3434
```
3535

36-
When TLC runs this spec with coverage reporting enabled, it may produce output like this:
36+
When TLC runs this spec with coverage reporting enabled (and without deadlock checking), it produces output like this:
3737

3838
```
3939
The coverage statistics at 2025-04-02 18:02:30
40+
<x line 4, col 11 to line 4, col 11 of module Foobar>: 10
4041
<Init line 6, col 1 to line 6, col 4 of module Foobar>: 1:1
4142
line 7, col 5 to line 7, col 12 of module Foobar: 1
4243
<Inc line 9, col 1 to line 9, col 3 of module Foobar>: 10:10
@@ -54,11 +55,14 @@ End of statistics.
5455
```
5556

5657
## How to Interpret This Output
57-
Each block of coverage statistics corresponds to either a definition (like `Init`, `Inc`, or `Dec`) or an expression inside the definition.
58+
Each block of coverage statistics corresponds to either a variable declaration, a definition (like `Init`, `Inc`, or `Dec`), or an expression inside the definition.
59+
60+
### Variable Declaration:
61+
The line `<x line 4, col 11 to line 4, col 11 of module Foobar>:10` indicates that TLC found 10 distinct values for the (declared) variable `x`.
5862

5963
### State-level expressions (`Init`):
6064

61-
The line `<Init line 6, col 1 to line 6, col 4 of module Foobar>: 1:1` shows that TLC evaluated the `Init` predicate once, and it produced one initial state
65+
The line `<Init line 6, col 1 to line 6, col 4 of module Foobar>: 1:1` shows that TLC evaluated the `Init` predicate once, and it produced one initial state.
6266

6367
### Action-level expressions (`Inc` and `Dec`):
6468

@@ -74,7 +78,7 @@ In addition to tracking how many times an expression is evaluated, TLC also repo
7478

7579
This is especially relevant for expressions that manipulate sets, functions, sequences, or other compound structures. When such an allocation occurs, TLC appends a second number to the coverage entry in the format evaluations:cost.
7680

77-
Consider the following specification, where the Next action repeatedly adds a new element to the set x:
81+
Consider the following specification, where the `Next` action repeatedly adds a new element to the set `x`:
7882

7983
```tla
8084
------ MODULE Costs ------
@@ -96,6 +100,7 @@ Spec ==
96100
97101
```
98102
The coverage statistics at 2025-04-02 18:28:21
103+
<x line 4, col 11 to line 4, col 11 of module Foobar>:10
99104
Init line 6, col 1 to line 6, col 4 of module Foobar>: 1:1
100105
line 7, col 5 to line 7, col 10 of module Foobar: 1
101106
<Next line 9, col 1 to line 9, col 4 of module Foobar>: 10:10
@@ -111,6 +116,6 @@ End of statistics.
111116
112117
This tells us:
113118
114-
The sub-expression ({Cardinality(x) + 1}) was evaluated 10 times, and TLC incurred an allocation cost of 18 across those 10 evaluations. This cost represents internal overhead, such as memory allocation or structural copying involved in creating the new set value.
119+
The sub-expression `({Cardinality(x) + 1})` was evaluated 10 times, and TLC incurred an allocation cost of 18 across those 10 evaluations. This cost represents internal overhead, such as memory allocation or structural copying involved in creating the new set value.
115120
116121
These costs can highlight performance hotspots in your specification—helpful for optimizing large models where memory usage or computational effort may become significant.

tlatools/org.lamport.tlatools/src/tlc2/output/EC.java

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -293,7 +293,8 @@ public interface EC
293293
public static final int TLC_COVERAGE_PROPERTY = 2774;
294294
public static final int TLC_COVERAGE_CONSTRAINT = 2778;
295295
public static final int TLC_COVERAGE_END_OVERHEAD = 2777;
296-
296+
public static final int TLC_COVERAGE_VAR = 2779;
297+
297298
// config file errors
298299
public static final int TLC_CONFIG_VALUE_NOT_ASSIGNED_TO_CONSTANT_PARAM = 2222;
299300
public static final int TLC_CONFIG_RHS_ID_APPEARED_AFTER_LHS_ID = 2223;

tlatools/org.lamport.tlatools/src/tlc2/output/MP.java

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1169,6 +1169,9 @@ else if (parameters.length == 2) {
11691169
case EC.TLC_COVERAGE_VALUE_COST:
11701170
b.append(" %1%: %2%:%3%");
11711171
break;
1172+
case EC.TLC_COVERAGE_VAR:
1173+
b.append("<%1% %2%>: %3%");
1174+
break;
11721175
case EC.TLC_COVERAGE_INIT:
11731176
b.append("%1%: %2%:%3%");
11741177
break;

tlatools/org.lamport.tlatools/src/tlc2/tool/coverage/CostModelCreator.java

Lines changed: 34 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,7 @@
4949
import tla2sany.semantic.Subst;
5050
import tla2sany.semantic.SubstInNode;
5151
import tla2sany.semantic.SymbolNode;
52+
import tla2sany.st.Location;
5253
import tlc2.output.EC;
5354
import tlc2.output.MP;
5455
import tlc2.tool.Action;
@@ -58,6 +59,7 @@
5859
import tlc2.util.ObjLongTable;
5960
import tlc2.util.Vect;
6061
import tlc2.util.statistics.CountDistinct;
62+
import util.UniqueString;
6163

6264
/**
6365
* <h1>Why a CostModel:</h1> Why a CostModelCreator to traverses the semantic
@@ -428,7 +430,38 @@ public static final void create(final ITool tool) {
428430
}
429431

430432
public static void report(final ITool tool, final long startTime) {
431-
MP.printMessage(EC.TLC_COVERAGE_START);
433+
report(tool);
434+
435+
// Notify users about the performance overhead related to coverage collection
436+
// after N minutes of model checking. The assumption is that a user has little
437+
// interest in coverage for a large (long-running) model anyway. In the future
438+
// it is hopefully possible to switch from profiling to sampling to relax the
439+
// performance overhead of coverage and cost statistics.
440+
final long l = System.currentTimeMillis() - startTime;
441+
if (l > (5L * 60L * 1000L)) {
442+
MP.printMessage(EC.TLC_COVERAGE_END_OVERHEAD);
443+
} else {
444+
MP.printMessage(EC.TLC_COVERAGE_END);
445+
}
446+
}
447+
448+
private static void report(final ITool tool) {
449+
MP.printMessage(EC.TLC_COVERAGE_START);
450+
451+
// VARIABLE and VARIABLES
452+
for (final OpDeclNode odn : tool.getSpecProcessor().getVariablesNodes()) {
453+
final long count = odn.getCountDistinct().count();
454+
// 'count' may be zero if report is evaluated before state-space exploration
455+
// begins. Luckily, Noop#count returns -1 in such cases.
456+
if (count >= 0) {
457+
final UniqueString varName = odn.getName();
458+
final Location location = odn.getLocation();
459+
MP.printMessage(EC.TLC_COVERAGE_VAR,
460+
new String[] { varName.toString(), location.toString(), String.valueOf(count) });
461+
}
462+
}
463+
464+
// INIT (or SPECIFICATION)
432465
final Vect<Action> init = tool.getInitStateSpec();
433466
for (int i = 0; i < init.size(); i++) {
434467
final Action initAction = init.elementAt(i);
@@ -495,17 +528,5 @@ public int compare(Action o1, Action o2) {
495528
impliedActions.cm.report();
496529
}
497530
}
498-
499-
// Notify users about the performance overhead related to coverage collection
500-
// after N minutes of model checking. The assumption is that a user has little
501-
// interest in coverage for a large (long-running) model anyway. In the future
502-
// it is hopefully possible to switch from profiling to sampling to relax the
503-
// performance overhead of coverage and cost statistics.
504-
final long l = System.currentTimeMillis() - startTime;
505-
if (l > (5L * 60L * 1000L)) {
506-
MP.printMessage(EC.TLC_COVERAGE_END_OVERHEAD);
507-
} else {
508-
MP.printMessage(EC.TLC_COVERAGE_END);
509-
}
510531
}
511532
}

0 commit comments

Comments
 (0)