Merged
1,410 changes: 1,395 additions & 15 deletions cherry-diff.patch

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions docs/horizon-aware-planning.md
@@ -59,8 +59,22 @@ This is not a UI redesign.
The horizon subsystem is generic. It accepts injected policy and transition
functions and does not import solver internals.

## Expected-Value Overlay

PR12 adds a separate expected-value wrapper for horizon rollout. Deterministic
`runHorizonRollout` remains the base primitive.

Expected-value rollout samples labeled numeric uncertainty before each
deterministic rollout, then aggregates utility in `utility_usd_cents` through an
explicit utility extractor. Transition functions still receive concrete state,
not distributions.

Expected-value output is labeled as expectation. It must not be described as a
guaranteed future outcome.

## Related docs

- `docs/engine-time-semantics.md`
- `docs/engine-optimality/trace.md`
- `docs/simulation/objective-semantics.md`
- `docs/simulation/uncertainty-modeling.md`
16 changes: 15 additions & 1 deletion docs/simulation/objective-semantics.md
@@ -1,5 +1,5 @@
Status: Active
Last updated: 2026-04-28
Last updated: 2026-04-29

# Objective Semantics

@@ -60,15 +60,29 @@ does not define a true global utility function.

Cherry does not claim long-horizon global optimality. It ranks the currently generated candidate set under a documented, unit-consistent objective.

### Expected-value overlay

Expected-value simulation aggregates sample utility in the same canonical
`utility_usd_cents` unit. It is an overlay on deterministic scoring, not a
replacement for the live objective.

Variance stays in utility-space. Risk-adjusted utility, when used, is computed
from expected utility and variance with an explicit dimensionless risk
coefficient. Expected-value explanations must label outputs as expectations,
not guarantees.

## Future/Target behavior

- Any new score dimension must either convert into `objectiveUtilityCents` or be
explicitly documented as a bounded non-utility heuristic contribution.
- Any change to live objective semantics must update this document, tests, and
engine behavior versioning.
- Add uncertainty or risk metrics only when their unit semantics remain explicit
and bounded.

## Related docs

- `docs/engine-optimality/objective.md`
- `docs/engine-optimality/status.md`
- `docs/engine-optimality/candidate-space.md`
- `docs/simulation/uncertainty-modeling.md`
139 changes: 139 additions & 0 deletions docs/simulation/uncertainty-modeling.md
@@ -0,0 +1,139 @@
Status: Active
Last updated: 2026-04-29

# Uncertainty Modeling

## Current behavior

Cherry's deterministic engine remains primary. PR12 adds an expected-value
overlay for simulation and horizon planning; it does not rewrite transition
functions or present-time recommendation semantics.

Uncertain inputs are numeric only. They must be represented as labeled
`UncertainNumber` values at leaf fields:

```ts
{
  incomeCents: {
    label: 'monthly_income',
    distribution: { kind: 'lognormal', mu: 8, sigma: 0.1 },
  },
}
```

Sampling happens before deterministic rollout. Transition functions receive
realized numeric state, never distributions.
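
The realization step described above can be sketched as a recursive walk that replaces every labeled leaf with a concrete number. This is a minimal sketch, not Cherry's implementation: the `Distribution` subset, `isUncertainNumber`, `sampleDistribution`, and `realizeState` are hypothetical names, and the uniform draw is passed in so that a seeded generator can supply it.

```ts
// Hypothetical shapes for illustration; only a subset of distributions is modeled.
type Distribution =
  | { kind: 'point'; value: number }
  | { kind: 'discrete'; values: number[]; probs: number[] };

type UncertainNumber = { label: string; distribution: Distribution };

// A leaf counts as uncertain only when it carries a `distribution` object.
function isUncertainNumber(v: unknown): v is UncertainNumber {
  if (typeof v !== 'object' || v === null) return false;
  const candidate = v as { label?: unknown; distribution?: unknown };
  return (
    typeof candidate.label === 'string' &&
    typeof candidate.distribution === 'object' &&
    candidate.distribution !== null
  );
}

// Draw one value; `u` is a uniform number in [0, 1).
function sampleDistribution(d: Distribution, u: number): number {
  if (d.kind === 'point') return d.value;
  let cumulative = 0;
  for (let i = 0; i < d.values.length; i++) {
    cumulative += d.probs[i] ?? 0;
    if (u < cumulative) return d.values[i] as number;
  }
  // Guard against rounding in the cumulative sum (assumes validated, non-empty values).
  return d.values[d.values.length - 1] as number;
}

// Recursively replace every UncertainNumber leaf with a concrete number.
function realizeState(state: unknown, nextUniform: () => number): unknown {
  if (isUncertainNumber(state)) {
    return sampleDistribution(state.distribution, nextUniform());
  }
  if (Array.isArray(state)) {
    return state.map((item) => realizeState(item, nextUniform));
  }
  if (typeof state === 'object' && state !== null) {
    const out: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(state)) {
      out[key] = realizeState(value, nextUniform);
    }
    return out;
  }
  return state;
}

const realized = realizeState(
  {
    incomeCents: { label: 'monthly_income', distribution: { kind: 'point', value: 300000 } },
    note: { label: 'plain data, no distribution' },
  },
  () => 0.5
);
```

In this sketch `incomeCents` becomes the number `300000`, while `note` keeps its `label` untouched because it has no `distribution`.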

Only a `distribution` field marks a value as uncertain. A `label` without a
`distribution` is always treated as ordinary domain data.

## Supported distributions

The supported numeric distributions are:

- `point(value)`
- `normal(mu, sigma)`
- `lognormal(mu, sigma)`
- `discrete(values, probs)`

Distribution parameters are validated before use. Discrete probabilities must
sum to 1. Nonnegative engine domains such as cents, income, expense, balances,
limits, cash, liquid amounts, rates, and utilization reject distributions that
can produce negative samples. Use `lognormal`, nonnegative `point`, or
nonnegative `discrete` values for positive-only financial quantities.

This domain validation is a PR12 path-name heuristic for common engine fields,
not a full semantic domain annotation system.

Represent an event that occurs with probability `p` and has value `X` as
`discrete(values=[0,X], probs=[1-p,p])`.
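
The validation rules in this section might look roughly like the following. This is a hedged sketch: `validateDistribution`, `canProduceNegative`, `eventDistribution`, and the `1e-9` tolerance are assumptions made for illustration. The `mean`/`std` field names for `normal` follow the formatter in `lib/engine/explain/uncertainty.ts`.

```ts
// Hypothetical union mirroring the four supported kinds.
type Distribution =
  | { kind: 'point'; value: number }
  | { kind: 'normal'; mean: number; std: number }
  | { kind: 'lognormal'; mu: number; sigma: number }
  | { kind: 'discrete'; values: number[]; probs: number[] };

const PROB_TOLERANCE = 1e-9; // assumed tolerance for floating-point sums

// Return a list of parameter problems; an empty list means the parameters look valid.
function validateDistribution(d: Distribution): string[] {
  const errors: string[] = [];
  switch (d.kind) {
    case 'point':
      if (!Number.isFinite(d.value)) errors.push('point value must be finite');
      break;
    case 'normal':
      if (!Number.isFinite(d.mean)) errors.push('normal mean must be finite');
      if (!Number.isFinite(d.std) || d.std < 0) errors.push('normal std must be finite and nonnegative');
      break;
    case 'lognormal':
      if (!Number.isFinite(d.mu)) errors.push('lognormal mu must be finite');
      if (!Number.isFinite(d.sigma) || d.sigma < 0) errors.push('lognormal sigma must be finite and nonnegative');
      break;
    case 'discrete': {
      if (d.values.length === 0 || d.values.length !== d.probs.length) {
        errors.push('discrete values and probs must be non-empty and equal in length');
      }
      if (d.probs.some((p) => !(p >= 0 && p <= 1))) errors.push('discrete probs must lie in [0, 1]');
      const total = d.probs.reduce((sum, p) => sum + p, 0);
      if (Math.abs(total - 1) > PROB_TOLERANCE) errors.push('discrete probs must sum to 1');
      break;
    }
  }
  return errors;
}

// True when the distribution can yield a negative sample,
// which a nonnegative domain such as cents would reject.
function canProduceNegative(d: Distribution): boolean {
  switch (d.kind) {
    case 'point':
      return d.value < 0;
    case 'normal':
      return d.std > 0 || d.mean < 0; // a non-degenerate normal has unbounded support
    case 'lognormal':
      return false; // support is strictly positive
    case 'discrete':
      return d.values.some((v) => v < 0);
  }
}

// Event with probability p and value x, encoded as a two-point discrete distribution.
function eventDistribution(p: number, x: number): Distribution {
  return { kind: 'discrete', values: [0, x], probs: [1 - p, p] };
}

const bonus = eventDistribution(0.2, 50000);
const bonusErrors = validateDistribution(bonus);
const bonusCanGoNegative = canProduceNegative(bonus);
```

Here a 20% chance of a 50,000-cent bonus validates cleanly and is safe for a nonnegative field, whereas any `normal` with a positive `std` would be flagged by `canProduceNegative`.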

## Expected-value rollout

Expected-value rollout is exposed separately from deterministic rollout.

```txt
runHorizonRollout(...) -> deterministic projection
runExpectedValueHorizonRollout(...) -> expected-value projection
```

The EV wrapper realizes uncertain state once per sample, calls deterministic
rollout, extracts utility through an explicit `utilityOfRollout` callback, and
aggregates sample utility.

Sample count is bounded:

- minimum: `100`
- default: `500`
- maximum: `5000`

Computational cost is:

```txt
O(samples * horizon * transition cost)
```
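
The loop behind that cost can be sketched as follows. This is an illustration under stated assumptions, not the signature of `runExpectedValueHorizonRollout`: `expectedValueSketch`, `resolveSampleCount`, and the `realizeState`/`rollout` parameters are hypothetical, out-of-range sample counts are clamped here (the real engine may reject them instead), and variance is computed as population variance.

```ts
// Bounds quoted from the documentation above.
const MIN_SAMPLES = 100;
const DEFAULT_SAMPLES = 500;
const MAX_SAMPLES = 5000;

// Assumption: out-of-range or non-finite requests fall back to the bounds or the default.
function resolveSampleCount(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) return DEFAULT_SAMPLES;
  return Math.min(MAX_SAMPLES, Math.max(MIN_SAMPLES, Math.floor(requested)));
}

type EvSummary = { samples: number; expectedUtility: number; variance: number };

function expectedValueSketch<S, R>(params: {
  samples?: number;
  realizeState: (sampleIndex: number) => S; // returns concrete state, never distributions
  rollout: (state: S) => R; // stand-in for the deterministic rollout
  utilityOfRollout: (result: R) => number; // explicit utility extractor
}): EvSummary {
  const samples = resolveSampleCount(params.samples);
  const utilities: number[] = [];
  for (let i = 0; i < samples; i++) {
    const concreteState = params.realizeState(i); // realize once per sample
    const result = params.rollout(concreteState); // cost: horizon * transition cost
    utilities.push(params.utilityOfRollout(result));
  }
  const expectedUtility = utilities.reduce((sum, u) => sum + u, 0) / samples;
  const variance =
    utilities.reduce((sum, u) => sum + (u - expectedUtility) ** 2, 0) / samples;
  return { samples, expectedUtility, variance };
}

// Toy usage: income alternates between two values, and the rollout doubles it.
const summary = expectedValueSketch({
  samples: 100,
  realizeState: (i) => (i % 2 === 0 ? 100 : 300),
  rollout: (incomeCents) => ({ finalCashCents: incomeCents * 2 }),
  utilityOfRollout: (r) => r.finalCashCents,
});
```

The single loop over `samples` with one full rollout per iteration is where the `samples * horizon * transition cost` factor comes from.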

## Reproducibility

EV runs require an explicit seed. The engine uses a deterministic seeded RNG;
EV engine paths must not call `Math.random`.

Explanations include the seed and sample count so a simulation can be
reproduced when the same inputs, policy, transition, and utility extractor are
used.
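
To illustrate what a deterministic seeded RNG provides, here is a small sketch using mulberry32, a widely used 32-bit generator. The source does not say which generator Cherry uses, so treat this as a stand-in. A numeric seed is also an assumption, since the shape of `UncertaintySeed` is not shown here.

```ts
// mulberry32: a compact seeded PRNG returning uniform numbers in [0, 1).
// Stand-in for illustration only; not necessarily the engine's generator.
function mulberry32(seed: number): () => number {
  let a = seed | 0; // keep state as a 32-bit integer
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators built from the same seed replay the same sequence.
const firstRun = mulberry32(12345);
const secondRun = mulberry32(12345);
const firstDraws = [firstRun(), firstRun(), firstRun()];
const secondDraws = [secondRun(), secondRun(), secondRun()];
const replayMatches = firstDraws.every((value, i) => value === secondDraws[i]);
```

Because the generator's state depends only on the seed, `replayMatches` is `true`, which is the property that lets an explanation's seed and sample count reproduce a run.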

## Utility and risk units

Expected utility is aggregated in `utility_usd_cents`, the same canonical unit
as `objectiveUtilityCents`.

Variance remains in utility-space. PR12 implements only variance-based risk
adjustment:

```txt
riskAdjustedUtility = expectedUtility - lambda * variance
```

`lambda` is a dimensionless risk-aversion coefficient. It defaults to `0`,
which gives risk-neutral behavior.
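
As a worked instance of the formula, the following sketch applies it to made-up numbers. The function name `riskAdjustedUtility` and the input values are hypothetical.

```ts
// riskAdjustedUtility = expectedUtility - lambda * variance, with lambda defaulting to 0.
function riskAdjustedUtility(
  expectedUtility: number,
  variance: number,
  lambda = 0
): number {
  return expectedUtility - lambda * variance;
}

// Risk-neutral default: the variance term vanishes.
const neutral = riskAdjustedUtility(1000, 400); // 1000 - 0 * 400 = 1000

// Risk-averse: half of the variance is subtracted.
const averse = riskAdjustedUtility(1000, 400, 0.5); // 1000 - 0.5 * 400 = 800
```

With `lambda` left at its default, the risk-adjusted value equals the expected utility, so existing risk-neutral comparisons are unchanged.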

The type surface reserves future risk metric names for semivariance and CVaR,
but those are not implemented in PR12.

## Explanation contract

Expected-value explanations must label scalars as expectations. They include:

- labeled assumptions
- distribution strings
- seed
- sample count
- expected outcome
- variance
- risk inputs
- `uncertaintyLevel`
- `results are expectations, not guarantees`

`uncertaintyLevel` is a relative volatility classification using coefficient
of variation:

```txt
cv = sqrt(variance) / abs(expectedUtility)
```

- `low`: `cv < 0.10`
- `medium`: `0.10 <= cv <= 0.30`
- `high`: `cv > 0.30`
- `unknown`: expected utility is zero or non-finite, or variance is missing,
negative, or non-finite

## Future/Target behavior

- Add downside-aware risk metrics only when the explanation and unit semantics
are equally explicit.
- Do not use EV output in production recommendation surfaces until the model is
bounded, explainable, and runtime-verified.

## Related docs

- `docs/horizon-aware-planning.md`
- `docs/simulation/objective-semantics.md`
- `docs/engine-optimality/objective.md`
113 changes: 113 additions & 0 deletions lib/engine/explain/uncertainty.ts
@@ -0,0 +1,113 @@
import {
  collectUncertaintyAssumptions,
  type UncertaintyAssumption,
} from '../uncertainty/policy.js';
import type {
  NumericDistribution,
  UncertaintyLevel,
  UncertaintySeed,
} from '../uncertainty/types.js';

export type ExpectedValueAssumptionExplanation = {
  label: string;
  path: string;
  distribution: string;
};

export type ExpectedValueUncertaintyExplanation = {
  type: 'expected_value';
  assumptions: readonly ExpectedValueAssumptionExplanation[];
  seed: UncertaintySeed;
  samples: number;
  expectedOutcome: unknown;
  expectedUtility: number;
  variance?: number;
  riskLambda: number;
  riskAdjustedExpectedUtility?: number;
  uncertaintyLevel: UncertaintyLevel;
  confidenceNote: 'results are expectations, not guarantees';
};

export function formatNumericDistribution(d: NumericDistribution): string {
  switch (d.kind) {
    case 'point':
      return `point(value=${d.value})`;
    case 'normal':
      return `normal(mu=${d.mean}, sigma=${d.std})`;
    case 'lognormal':
      return `lognormal(mu=${d.mu}, sigma=${d.sigma})`;
    case 'discrete':
      return `discrete(values=[${d.values.join(',')}], probs=[${d.probs.join(',')}])`;
    default: {
      const exhaustive: never = d;
      throw new Error(`Unsupported distribution: ${JSON.stringify(exhaustive)}`);
    }
  }
}

export function classifyRelativeUncertainty(params: {
  expectedUtility: number;
  variance?: number;
}): UncertaintyLevel {
  if (
    params.variance === undefined ||
    params.variance < 0 ||
    !Number.isFinite(params.variance) ||
    !Number.isFinite(params.expectedUtility) ||
    params.expectedUtility === 0
  ) {
    return 'unknown';
  }

  const cv = Math.sqrt(params.variance) / Math.abs(params.expectedUtility);
  if (cv < 0.1) return 'low';
  if (cv <= 0.3) return 'medium';
  return 'high';
}

function explainAssumption(
  assumption: UncertaintyAssumption
): ExpectedValueAssumptionExplanation {
  return {
    label: assumption.label,
    path: assumption.path,
    distribution: formatNumericDistribution(assumption.distribution),
  };
}

export function buildExpectedValueUncertaintyExplanation(params: {
  state: unknown;
  seed: UncertaintySeed;
  samples: number;
  expectedOutcome: unknown;
  expectedUtility: number;
  variance?: number;
  riskLambda?: number;
  riskAdjustedExpectedUtility?: number;
}): ExpectedValueUncertaintyExplanation {
  const riskLambda = params.riskLambda === undefined ? 0 : params.riskLambda;
  const explanation: ExpectedValueUncertaintyExplanation = {
    type: 'expected_value',
    assumptions: collectUncertaintyAssumptions(params.state).map(explainAssumption),
    seed: params.seed,
    samples: params.samples,
    expectedOutcome: params.expectedOutcome,
    expectedUtility: params.expectedUtility,
    riskLambda,
    uncertaintyLevel: classifyRelativeUncertainty(
      params.variance === undefined
        ? { expectedUtility: params.expectedUtility }
        : { expectedUtility: params.expectedUtility, variance: params.variance }
    ),
    confidenceNote: 'results are expectations, not guarantees',
  };

  if (params.variance !== undefined) {
    explanation.variance = params.variance;
  }
  if (params.riskAdjustedExpectedUtility !== undefined) {
    explanation.riskAdjustedExpectedUtility = params.riskAdjustedExpectedUtility;
  }

  return explanation;
}