diff --git a/CHANGELOG.md b/CHANGELOG.md
index 561751b..618733e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,18 @@ Versioning: [SemVer](https://semver.org/spec/v2.0.0.html).
 
 ### Added
 
+- **`pool.` source on per_entity_per_period facts.** Widens the
+  per-entity value-pool surface to the most common fact grain (one
+  row per entity per period). Two new dispatch handlers —
+  `_fact_scalar_pool` and `_fact_vec_pool` — register against
+  `BuilderKind.PER_ENTITY_PER_PERIOD_FACT_{SCALAR,VECTORIZED}` and
+  draw uniformly from the row's entity's pool list. Pool sources
+  remain rejected on per_period facts (no per-row entity binding),
+  reference dims, and sub-entity dims. Pairs naturally with `cdc:
+  true` on the same fact, so a column like `payment_type:
+  pool.payment_method` now works alongside SCD2 and CDC on a single
+  transactional table.
+
 - **Parent/child fact grain + sibling-fact references.** Three
   composable patterns for multi-fact stars:
   - **Header / detail** — a `per_parent_row` child fact fans out
diff --git a/docs/site/column-types.md b/docs/site/column-types.md
index deceff0..56be247 100644
--- a/docs/site/column-types.md
+++ b/docs/site/column-types.md
@@ -31,7 +31,7 @@ Some types take additional fields (`labels` for `bucket`, `tracks` /
 | `faker.{kind}` | yes | yes | yes | yes |
 | `geo.{field}` | yes (dim only) | — | — | — |
 | `static.{value}` | yes | yes | yes | yes |
-| `pool.{attr}` | yes (per-entity dim only) | yes (variable-grain + per_parent_row) | yes | — |
+| `pool.{attr}` | yes (per-entity dim only) | yes (per_entity_per_period + variable-grain + per_parent_row) | yes | — |
 | `range` | — | yes | yes | — |
 | `segment.count` | yes (per-entity dim only) | — | — | — |
 | `timestamp` | — | — | yes | — |
@@ -235,11 +235,11 @@ dimensions:
 
 Output dtype is `string`.
 
-**Valid on**: per-entity dimension columns, variable-grain fact
-columns, per_parent_row child-fact columns, and event columns. The
-engine reads the row's entity FK and draws from
-`attributes[attr_name]` for that entity's segment.
-Per_entity_per_period and per_period facts, reference dims, and
+**Valid on**: per-entity dimension columns, per_entity_per_period
+fact columns, variable-grain fact columns, per_parent_row child-fact
+columns, and event columns. The engine reads the row's entity FK
+and draws from `attributes[attr_name]` for that entity's segment.
+Per_period facts (the `dim_date`-style grain), reference dims, and
 sub-entity dims are out of scope — pool dispatch requires either a
 per-row entity binding (facts / events) or a 1:1 row-to-entity
 mapping (per_entity dim).
diff --git a/docs/site/feature-reference.md b/docs/site/feature-reference.md
index 9bbdb57..a08c8a5 100644
--- a/docs/site/feature-reference.md
+++ b/docs/site/feature-reference.md
@@ -123,7 +123,7 @@ is no longer byte-identical to a pre-flag run of the same file.
 | Geo bundle provider | `geo.` column types pull country / region / city / postcode / lat-lng from a curated 200-entry, 17-country reference dataset. All fields on the same dim row come from a single bundle, so the city is in the stated country, the postcode looks right for that country, and lat/lng land on the named city. Dim-only; the engine rejects geo on facts/events. See [Geo hierarchy](./user-guide/geo-hierarchy.md). |
 | Faker-backed text + identifiers | PII-shape providers wired into the engine: `name`, `email`, `phone_number`, `company`, `address`, `postcode`, `country`, `city`, `latitude`, `longitude`, `sentence`. Deterministic under the run seed. Useful for masking exercises and regex-validation scenarios; **does not read entity, archetype, or trajectory** (each call is an independent draw). |
 | Range source | `type: range` with `range: [min, max]` on fact / event columns produces a per-row uniform draw between the bounds. Integer bounds → `dtype: int` and inclusive upper bound; float bounds → `dtype: float` and exclusive upper bound (numpy conventions). Use it for `quantity ∈ [1, 5]`, `unit_price ∈ [10.0, 500.0]`, and similar shape constraints that `faker.random_int` / `faker.pyfloat` express less precisely. Deterministic under seed. |
-| Pool source on facts and events | `type: pool.` lifts the per-entity value pool (previously dim-only) onto variable-grain facts, per_parent_row child facts, and event tables. Every row resolves to its entity's segment, then draws uniformly from `attributes[]` — so a `loyal` cohort customer's `channel` always lands in `[app, web]` while a `casual` customer's lands in `[sms, email]`. |
+| Pool source on facts and events | `type: pool.` lifts the per-entity value pool (previously dim-only) onto per_entity_per_period facts, variable-grain facts, per_parent_row child facts, and event tables. Every row resolves to its entity's segment, then draws uniformly from `attributes[]` — so a `loyal` cohort customer's `channel` always lands in `[app, web]` while a `casual` customer's lands in `[sms, email]`. Per_period facts (the `dim_date`-style grain) remain out of scope — those rows have no per-row entity binding. |
 | Narrative text source (trajectory-aware) | Per-archetype lexicons + a sentence template rendered into a `narrative` column on a fact table. Output vocabulary tracks the entity's trajectory position (a high-position `growth` entity produces systematically different text than a low-position `decline` entity); a simple bag-of-words classifier hits ≥0.55 accuracy on archetype prediction. Deterministic under seed; preserves the trajectory-first invariant. **Fact-only** (rejected on dim / event tables at config load). **Performance:** forces the scalar fact builder path (~3-10× slower than vectorized metric-only facts), so keep narrative on tables that genuinely need text. Bundled template `narrative_reviews`. See [Narrative source](./user-guide/narrative-source.md). |
 
 ### 7. Audit + downstream-pipeline outputs
diff --git a/plotsim/tables.py b/plotsim/tables.py
index f7f907b..dc8a21f 100644
--- a/plotsim/tables.py
+++ b/plotsim/tables.py
@@ -1259,6 +1259,50 @@ def _fact_vec_range(parsed: RangeSource, ctx: dict):
     return rng.uniform(parsed.min, parsed.max, size=total_rows)
 
 
+def _fact_vec_pool(parsed: PoolSource, ctx: dict):
+    """Bulk per-row pool draw on a vectorized per_entity_per_period fact.
+
+    Output is entity-major (matches ``entity_pk_repeated`` /
+    ``date_key_tiled`` layout). One bulk ``rng.integers`` draw per
+    entity sized to ``n_periods``, scattered into the contiguous
+    entity block. Per-entity draws keep ordering stable when entities
+    have heterogeneous pool sizes.
+    """
+    del parsed  # PoolSource carries only a marker name; data is on col.value_pool.
+    col = ctx["col"]
+    rng = ctx["rng"]
+    if rng is None:
+        raise ValueError(
+            f"fact column {col.name!r} has source {col.source!r} but no "
+            f"RNG was supplied to the vectorized fact builder; pool "
+            f"draws require the per-table RNG"
+        )
+    if col.value_pool is None:
+        raise ValueError(
+            f"fact column {col.name!r} declares pool source {col.source!r} "
+            f"but Column.value_pool is None; Column._pool_pairing should "
+            f"have rejected this at load"
+        )
+    config = ctx["config"]
+    n_periods = ctx["n_periods"]
+    total_rows = ctx["total_rows"]
+    out = np.empty(total_rows, dtype=object)
+    cursor = 0
+    for entity in config.entities:
+        choices = col.value_pool.get(entity.name)
+        if choices is None:
+            raise ValueError(
+                f"fact column {col.name!r} value_pool has no entry for "
+                f"entity {entity.name!r}; validate_value_pool_coverage "
+                f"should have caught this at load"
+            )
+        indices = rng.integers(0, len(choices), size=n_periods)
+        for k in range(n_periods):
+            out[cursor + k] = _coerce_static(choices[int(indices[k])], col.dtype)
+        cursor += n_periods
+    return out
+
+
 def _fact_vec_text_bucket(parsed: TextBucketSource, ctx: dict):
     # M105: trajectory-position-driven text emission. ``trajectories_2d``
     # is shape (E, P); flatten in the same row-major (entity, period)
@@ -1347,6 +1391,11 @@ def _fact_vec_unsupported(parsed: Any, ctx: dict):
     RangeSource,
     _fact_vec_range,
 )
+COLUMN_DISPATCH.register(
+    BuilderKind.PER_ENTITY_PER_PERIOD_FACT_VECTORIZED,
+    PoolSource,
+    _fact_vec_pool,
+)
 COLUMN_DISPATCH.register_unsupported(
     BuilderKind.PER_ENTITY_PER_PERIOD_FACT_VECTORIZED,
     _fact_vec_unsupported,
@@ -1620,6 +1669,43 @@ def _fact_scalar_range(parsed: RangeSource, ctx: dict):
     return float(rng.uniform(parsed.min, parsed.max))
 
 
+def _fact_scalar_pool(parsed: PoolSource, ctx: dict):
+    """Per-cell pool draw on a scalar per_entity_per_period fact.
+
+    Looks up the per-entity choice list on ``col.value_pool`` keyed by
+    the current row's entity name, then draws one index from the
+    seeded RNG. Same shape as ``_evt_row_pool`` but the entity is
+    already in ctx (no PK reverse-lookup needed on the per-entity
+    dim).
+    """
+    del parsed  # PoolSource carries only a marker name; data is on col.value_pool.
+    col = ctx["col"]
+    rng = ctx["rng"]
+    entity = ctx["entity"]
+    if rng is None or entity is None:
+        raise ValueError(
+            f"fact column {col.name!r} pool source needs both `entity` and "
+            f"`rng` in ctx (got entity={entity!r}, "
+            f"rng={'set' if rng is not None else 'None'}); this is an "
+            f"internal wiring bug, not a config error"
+        )
+    if col.value_pool is None:
+        raise ValueError(
+            f"fact column {col.name!r} declares pool source {col.source!r} "
+            f"but Column.value_pool is None; Column._pool_pairing should "
+            f"have rejected this at load"
+        )
+    choices = col.value_pool.get(entity.name)
+    if choices is None:
+        raise ValueError(
+            f"fact column {col.name!r} value_pool has no entry for entity "
+            f"{entity.name!r}; validate_value_pool_coverage should have "
+            f"caught this at load"
+        )
+    pick = int(rng.integers(0, len(choices)))
+    return _coerce_static(choices[pick], col.dtype)
+
+
 def _fact_scalar_unsupported(parsed: Any, ctx: dict):
     col = ctx["col"]
     raise ValueError(
@@ -1691,6 +1777,11 @@ def _fact_scalar_unsupported(parsed: Any, ctx: dict):
     RangeSource,
     _fact_scalar_range,
 )
+COLUMN_DISPATCH.register(
+    BuilderKind.PER_ENTITY_PER_PERIOD_FACT_SCALAR,
+    PoolSource,
+    _fact_scalar_pool,
+)
 COLUMN_DISPATCH.register_unsupported(
     BuilderKind.PER_ENTITY_PER_PERIOD_FACT_SCALAR,
     _fact_scalar_unsupported,
diff --git a/plotsim/validation.py b/plotsim/validation.py
index a9ed0ef..84845fc 100644
--- a/plotsim/validation.py
+++ b/plotsim/validation.py
@@ -581,9 +581,9 @@ def validate_value_pool_coverage(config: PlotsimConfig) -> list[str]:
     1. ``PoolSource`` columns are only meaningful on tables where
        every row resolves to exactly one entity and where the engine
        has wired the per-row entity → pool lookup: per_entity
-       dims (M114), variable-grain fact tables, per_parent_row
-       child facts, and event tables. Per_entity_per_period facts,
-       per_period facts, reference dims, and sub-entity dims are
+       dims (M114), per_entity_per_period facts, variable-grain
+       fact tables, per_parent_row child facts, and event tables.
+       Per_period facts, reference dims, and sub-entity dims are
        out of scope — either no per-row entity binding or no
        dispatch handler is registered for the grain.
     2. The ``value_pool`` dict's keys must cover every ``Entity.name``
@@ -597,7 +597,10 @@ def validate_value_pool_coverage(config: PlotsimConfig) -> list[str]:
     add variable-grain facts, per_parent_row child facts, and event
     tables so authors can curate per-entity value pools on the fact /
     event rows directly (e.g. ``payment_method`` on ``fct_orders``)
-    without the indirection of a separate dim-row lookup.
+    without the indirection of a separate dim-row lookup. A follow-up
+    widened it again to include per_entity_per_period facts once
+    ``_fact_scalar_pool`` / ``_fact_vec_pool`` landed in the column
+    dispatch registry.
     """
     errors: list[str] = []
     entity_names = {e.name for e in config.entities}
@@ -607,7 +610,8 @@ def validate_value_pool_coverage(config: PlotsimConfig) -> list[str]:
     pool_capable_tables = per_entity_dim_names | {
         t.name
         for t in config.tables
-        if (t.type == "fact" and t.grain in ("variable", "per_parent_row")) or t.type == "event"
+        if (t.type == "fact" and t.grain in ("per_entity_per_period", "variable", "per_parent_row"))
+        or t.type == "event"
     }
 
     for tbl in config.tables:
@@ -619,10 +623,11 @@ def validate_value_pool_coverage(config: PlotsimConfig) -> list[str]:
                 errors.append(
                     f"table {tbl.name!r} column {col.name!r} declares a "
                     f"'pool:' source but the table is not a per_entity "
-                    f"dim, a variable-grain fact, a per_parent_row child "
-                    f"fact, or an event (type={tbl.type!r}, "
-                    f"grain={tbl.grain!r}); pool sources need a per-row "
-                    f"per-entity binding the engine can dispatch against"
+                    f"dim, a per_entity_per_period fact, a variable-grain "
+                    f"fact, a per_parent_row child fact, or an event "
+                    f"(type={tbl.type!r}, grain={tbl.grain!r}); pool "
+                    f"sources need a per-row per-entity binding the "
+                    f"engine can dispatch against"
                 )
                 continue
             if col.value_pool is None:
diff --git a/tests/test_pool_attr.py b/tests/test_pool_attr.py
index e2089c8..5ea3ad8 100644
--- a/tests/test_pool_attr.py
+++ b/tests/test_pool_attr.py
@@ -248,62 +248,73 @@ def test_pool_attr_missing_on_some_segments_raises():
         )
 
 
-def test_pool_attr_on_per_entity_per_period_fact_rejected():
-    """``pool.{attr}`` is now valid on variable-grain facts,
-    per_parent_row child facts, and event tables (0.6-M19 Fix 1), but
-    the per_entity_per_period fact grain stays out of scope — the
-    engine has no per-row pool dispatch handler registered for that
-    grain. The engine validator surfaces the gap at config load.
+def test_pool_attr_on_per_entity_per_period_fact_accepted():
+    """``pool.{attr}`` on a per_entity_per_period fact column now
+    interprets cleanly and wires through ``_fact_scalar_pool`` /
+    ``_fact_vec_pool``. The widening over the M19-Fix-1 baseline
+    (which accepted variable-grain facts, per_parent_row children,
+    and events) covers the most common fact grain — one row per
+    (entity, period).
     """
-    with pytest.raises(ValueError, match="variable-grain fact"):
-        interpret(
-            _input(
-                segments=[
-                    {
-                        "name": "alpha",
-                        "count": 3,
-                        "archetype": "growth",
-                        "attributes": {"industry": ["Tech"]},
-                    },
-                    {
-                        "name": "beta",
-                        "count": 3,
-                        "archetype": "flat",
-                        "attributes": {"industry": ["Healthcare"]},
-                    },
-                ],
-                dimensions=[
-                    {
-                        "name": "dim_date",
-                        "per": "period",
-                        "columns": [
-                            {"name": "date_key", "type": "id"},
-                            {"name": "date", "type": "date"},
-                        ],
-                    },
-                    {
-                        "name": "dim_company",
-                        "per": "unit",
-                        "columns": [
-                            {"name": "company_id", "type": "id"},
-                        ],
-                    },
-                ],
-                facts=[
-                    {
-                        "name": "fct_company",
-                        "metrics": ["engagement"],
-                        "columns": [
-                            {"name": "date_key", "type": "ref.dim_date"},
-                            {"name": "company_id", "type": "ref.dim_company"},
-                            {"name": "engagement", "type": "metric.engagement"},
-                            # Illegal: pool.{attr} on a fact column
-                            {"name": "industry", "type": "pool.industry"},
-                        ],
-                    },
-                ],
-            )
+    cfg = interpret(
+        _input(
+            segments=[
+                {
+                    "name": "alpha",
+                    "count": 3,
+                    "archetype": "growth",
+                    "attributes": {"industry": ["Tech", "Finance"]},
+                },
+                {
+                    "name": "beta",
+                    "count": 3,
+                    "archetype": "flat",
+                    "attributes": {"industry": ["Healthcare"]},
+                },
+            ],
+            dimensions=[
+                {
+                    "name": "dim_date",
+                    "per": "period",
+                    "columns": [
+                        {"name": "date_key", "type": "id"},
+                        {"name": "date", "type": "date"},
+                    ],
+                },
+                {
+                    "name": "dim_company",
+                    "per": "unit",
+                    "columns": [
+                        {"name": "company_id", "type": "id"},
+                    ],
+                },
+            ],
+            facts=[
+                {
+                    "name": "fct_company",
+                    "metrics": ["engagement"],
+                    "columns": [
+                        {"name": "date_key", "type": "ref.dim_date"},
+                        {"name": "company_id", "type": "ref.dim_company"},
+                        {"name": "engagement", "type": "metric.engagement"},
+                        {"name": "industry", "type": "pool.industry"},
+                    ],
+                },
+            ],
         )
+    )
+    fct = next(t for t in cfg.tables if t.name == "fct_company")
+    industry_col = next(c for c in fct.columns if c.name == "industry")
+    assert industry_col.source == "pool:industry"
+    # Builder expands each segment into per-entity rows (alpha_0000, ...).
+    # Every entity in a segment shares that segment's attribute pool.
+    pool = industry_col.value_pool
+    assert pool is not None
+    alpha_keys = sorted(k for k in pool if k.startswith("alpha"))
+    beta_keys = sorted(k for k in pool if k.startswith("beta"))
+    assert len(alpha_keys) == 3 and len(beta_keys) == 3
+    assert all(pool[k] == ["Tech", "Finance"] for k in alpha_keys)
+    assert all(pool[k] == ["Healthcare"] for k in beta_keys)
 
 
 # ── Auto-schema attribute columns ──────────────────────────────────────────
diff --git a/tests/test_pool_on_fact.py b/tests/test_pool_on_fact.py
new file mode 100644
index 0000000..2d82f95
--- /dev/null
+++ b/tests/test_pool_on_fact.py
@@ -0,0 +1,287 @@
+"""PoolSource on per_entity_per_period fact tables.
+
+Widens the M114 / M19-fix-1 pool-source surface to include the most
+common fact grain: one row per (entity, period). Before this change,
+``validate_value_pool_coverage`` rejected the combination at load and
+no dispatch handler existed in ``COLUMN_DISPATCH`` for
+``BuilderKind.PER_ENTITY_PER_PERIOD_FACT_{SCALAR,VECTORIZED}`` × ``PoolSource``.
+
+Coverage:
+
+* End-to-end: a fact column with ``pool:`` source emits a value drawn
+  from the row's entity's own pool — entities never cross-contaminate.
+* Determinism: same ``(config, seed)`` → identical column.
+* Forced-scalar path: when a Faker column is on the same fact, the
+  builder switches to the scalar path; the pool column must still
+  resolve correctly there.
+* Per_period fact (the dim_date-style grain) is still rejected — the
+  widening doesn't accidentally cover grains without an entity binding.
+"""
+
+from __future__ import annotations
+
+import warnings
+
+import pytest
+
+import plotsim
+from plotsim.config import (
+    Archetype,
+    Column,
+    CurveSegment,
+    Domain,
+    Entity,
+    Metric,
+    OutputConfig,
+    PlotsimConfig,
+    SurrogateKeyWarning,
+    Table,
+    TimeWindow,
+)
+from plotsim.tables import generate_tables
+
+
+def _flat_archetype() -> Archetype:
+    return Archetype(
+        name="flat",
+        label="flat",
+        description="constant 0.5 plateau",
+        curve_segments=[
+            CurveSegment(
+                curve="plateau",
+                params={"level": 0.5},
+                start_pct=0.0,
+                end_pct=1.0,
+            ),
+        ],
+    )
+
+
+def _config_with_pool_fact_column(
+    *,
+    seed: int = 42,
+    extra_fact_cols: list[Column] | None = None,
+) -> PlotsimConfig:
+    """Build a minimal config: 2 entities × 4 periods, fact has
+    metric + pool column (and optionally any caller-supplied extras)."""
+    entities = [
+        Entity(name="cohort_a", archetype="flat", size=1),
+        Entity(name="cohort_b", archetype="flat", size=1),
+    ]
+    pool_col = Column(
+        name="payment_type",
+        dtype="string",
+        source="pool:payment_type",
+        value_pool={
+            "cohort_a": ["card", "cash"],
+            "cohort_b": ["online", "wallet"],
+        },
+    )
+    fact_cols = [
+        Column(name="date_key", dtype="id", source="fk:dim_date.date_key"),
+        Column(name="entity_id", dtype="id", source="fk:dim_entity.entity_id"),
+        Column(name="m", dtype="float", source="metric:m"),
+        pool_col,
+    ]
+    if extra_fact_cols:
+        fact_cols.extend(extra_fact_cols)
+    fct = Table(
+        name="fct_m",
+        type="fact",
+        grain="per_entity_per_period",
+        primary_key=["date_key", "entity_id"],
+        foreign_keys=["dim_date.date_key", "dim_entity.entity_id"],
+        columns=fact_cols,
+    )
+    dim_date = Table(
+        name="dim_date",
+        type="dim",
+        grain="per_period",
+        primary_key="date_key",
+        columns=[
+            Column(name="date_key", dtype="id", source="pk"),
+            Column(name="date", dtype="date", source="generated:date_key"),
+        ],
+    )
+    dim_entity = Table(
+        name="dim_entity",
+        type="dim",
+        grain="per_entity",
+        primary_key="entity_id",
+        columns=[Column(name="entity_id", dtype="id", source="pk")],
+    )
+    with warnings.catch_warnings():
+        warnings.simplefilter("ignore", SurrogateKeyWarning)
+        return PlotsimConfig(
+            domain=Domain(
+                name="t",
+                description="t",
+                entity_type="entity",
+                entity_label="Entities",
+            ),
+            time_window=TimeWindow(
+                start="2024-01",
+                end="2024-04",
+                granularity="monthly",
+            ),
+            seed=seed,
+            metrics=[
+                Metric(
+                    name="m",
+                    label="m",
+                    distribution="normal",
+                    params={"mu": 1.0, "sigma": 0.1},
+                    polarity="positive",
+                ),
+            ],
+            archetypes=[_flat_archetype()],
+            entities=entities,
+            tables=[dim_date, dim_entity, fct],
+            output=OutputConfig(format="csv", directory="out/test_pool_fact"),
+        )
+
+
+def _assert_pools_honored(fact_df, dim_entity_df):
+    """Every row's payment_type ∈ that row's entity's declared pool."""
+    expected = {
+        "cohort_a": {"card", "cash"},
+        "cohort_b": {"online", "wallet"},
+    }
+    # dim_entity rows are in config.entities order.
+    pk_to_name = {
+        dim_entity_df.iloc[i]["entity_id"]: ["cohort_a", "cohort_b"][i]
+        for i in range(len(dim_entity_df))
+    }
+    for _, row in fact_df.iterrows():
+        entity_name = pk_to_name[row["entity_id"]]
+        assert row["payment_type"] in expected[entity_name], (
+            f"row entity_id={row['entity_id']} (cohort {entity_name}) drew "
+            f"payment_type={row['payment_type']!r}, not in pool "
+            f"{expected[entity_name]}"
+        )
+
+
+def test_pool_on_per_entity_per_period_fact_loads_and_dispatches():
+    """Build the config (was rejected pre-fix) and generate without error."""
+    cfg = _config_with_pool_fact_column()
+    tables = generate_tables(cfg)
+    fact = tables["fct_m"]
+    assert len(fact) == 8  # 2 entities × 4 periods
+    assert "payment_type" in fact.columns
+
+
+def test_pool_values_match_per_row_entity_pool_vec_path():
+    """Vectorized path: no Faker column on the fact, so ``forces_scalar``
+    is False and the vec dispatcher runs ``_fact_vec_pool``."""
+    cfg = _config_with_pool_fact_column()
+    tables = generate_tables(cfg)
+    _assert_pools_honored(tables["fct_m"], tables["dim_entity"])
+
+
+def test_pool_values_match_per_row_entity_pool_scalar_path():
+    """Forced-scalar path: a Faker column flips ``forces_scalar=True``
+    so every column (including pool) routes through the scalar
+    dispatcher and ``_fact_scalar_pool``."""
+    cfg = _config_with_pool_fact_column(
+        extra_fact_cols=[
+            Column(name="note", dtype="string", source="generated:faker.word"),
+        ],
+    )
+    tables = generate_tables(cfg)
+    _assert_pools_honored(tables["fct_m"], tables["dim_entity"])
+
+
+def test_pool_on_fact_is_deterministic_under_seed():
+    """Same seed → byte-identical column across two independent builds."""
+    cfg_a = _config_with_pool_fact_column(seed=2026)
+    cfg_b = _config_with_pool_fact_column(seed=2026)
+    fa = generate_tables(cfg_a)["fct_m"]["payment_type"].tolist()
+    fb = generate_tables(cfg_b)["fct_m"]["payment_type"].tolist()
+    assert fa == fb
+
+
+def test_pool_on_fact_seed_change_changes_draws():
+    """Sanity: changing seed shifts the draws (rules out a constant-
+    return bug that would also pass the determinism test)."""
+    fa = generate_tables(_config_with_pool_fact_column(seed=1))["fct_m"]["payment_type"].tolist()
+    fb = generate_tables(_config_with_pool_fact_column(seed=2))["fct_m"]["payment_type"].tolist()
+    assert fa != fb
+
+
+def test_pool_still_rejected_on_per_period_fact():
+    """The widening adds per_entity_per_period — per_period (the
+    dim_date-style grain) still has no per-row entity binding, so
+    the validator must keep rejecting it."""
+    entity = Entity(name="e1", archetype="flat", size=1)
+    pool_col = Column(
+        name="payment_type",
+        dtype="string",
+        source="pool:payment_type",
+        value_pool={"e1": ["card"]},
+    )
+    bad_per_period_fact = Table(
+        name="fct_period",
+        type="fact",
+        grain="per_period",
+        primary_key="date_key",
+        foreign_keys=["dim_date.date_key"],
+        columns=[
+            Column(name="date_key", dtype="id", source="fk:dim_date.date_key"),
+            pool_col,
+        ],
+    )
+    dim_date = Table(
+        name="dim_date",
+        type="dim",
+        grain="per_period",
+        primary_key="date_key",
+        columns=[
+            Column(name="date_key", dtype="id", source="pk"),
+            Column(name="date", dtype="date", source="generated:date_key"),
+        ],
+    )
+    dim_entity = Table(
+        name="dim_entity",
+        type="dim",
+        grain="per_entity",
+        primary_key="entity_id",
+        columns=[Column(name="entity_id", dtype="id", source="pk")],
+    )
+    with warnings.catch_warnings():
+        warnings.simplefilter("ignore", SurrogateKeyWarning)
+        with pytest.raises(ValueError, match="per_entity_per_period fact"):
+            PlotsimConfig(
+                domain=Domain(
+                    name="t",
+                    description="t",
+                    entity_type="entity",
+                    entity_label="Entities",
+                ),
+                time_window=TimeWindow(
+                    start="2024-01",
+                    end="2024-04",
+                    granularity="monthly",
+                ),
+                seed=0,
+                metrics=[
+                    Metric(
+                        name="m",
+                        label="m",
+                        distribution="normal",
+                        params={"mu": 1.0, "sigma": 0.1},
+                        polarity="positive",
+                    ),
+                ],
+                archetypes=[_flat_archetype()],
+                entities=[entity],
+                tables=[dim_date, dim_entity, bad_per_period_fact],
+                output=OutputConfig(
+                    format="csv",
+                    directory="out/test_pool_per_period",
+                ),
+            )
+
+
+def test_pool_source_reexport_still_works():
+    """Smoke test the top-level re-export — same as the M114 surface."""
+    assert plotsim.PoolSource is not None
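
Reviewer note: the entity-major draw that `_fact_vec_pool` describes can be illustrated outside the engine. The sketch below is not plotsim code. It assumes a plain dict in place of `Column.value_pool`, swaps numpy's `rng.integers` for the standard-library `random.Random`, and uses a hypothetical helper name, `draw_pool_column`. It only shows the layout invariant the new tests check: each entity owns a contiguous block of `n_periods` rows, and every value in that block comes from that entity's own pool.

```python
import random


def draw_pool_column(entity_names, value_pool, n_periods, seed):
    """Hypothetical stand-in for the per-entity pool draw (not plotsim API).

    Returns an entity-major list: n_periods values for the first entity,
    then n_periods for the second, and so on.
    """
    rng = random.Random(seed)  # assumption: stdlib RNG instead of numpy
    out = []
    for name in entity_names:
        choices = value_pool.get(name)
        if choices is None:
            raise ValueError(f"value_pool has no entry for entity {name!r}")
        # One uniform index draw per (entity, period) row.
        out.extend(choices[rng.randrange(len(choices))] for _ in range(n_periods))
    return out


POOL = {
    "cohort_a": ["card", "cash"],
    "cohort_b": ["online", "wallet"],
}
column = draw_pool_column(["cohort_a", "cohort_b"], POOL, n_periods=4, seed=42)
```

With two entities and four periods the result has eight entries; the first four are drawn from `cohort_a`'s pool and the last four from `cohort_b`'s, and repeating the call with the same seed reproduces the same list.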