
Add 68 idiomatic PyMC model implementations #319

Open
twiecki wants to merge 3 commits into stan-dev:master from twiecki:pymc-idiomatic

Conversation

@twiecki twiecki commented Mar 15, 2026

Summary

  • Add PyMC translations for 68 Stan models that follow idiomatic PyMC patterns
  • No pm.Potential, no custom helper functions
  • All numpy data wrangling occurs before the pm.Model context manager
  • Updated .info.json files to register PyMC implementations

Model categories included

  • GLMs: wells (8 variants), radon (10 variants), kidscore (7 variants), logearn (4 variants), logmesquite (2 variants)
  • Gaussian Processes: gp_regr, gp_pois_regr, accel_splines
  • Mixture models: normal_mixture, normal_mixture_k, low_dim_gauss_mix (2 variants)
  • IRT: irt_2pl
  • Hierarchical: election88_full, GLMM1_model, GLMM_Poisson_model, GLM_Poisson_model, pilots
  • Educational: eight_schools_centered, rats_model, diamonds, blr
  • Ecological: seeds (3 variants), dogs (2 variants), dugongs_model
  • Other: arK, logistic_regression_rhs, Rate models, mesquite, sesame_one_pred_a, etc.

What is NOT included (for separate PR)

  • Models using pm.Potential (40 models) - need review of whether Potentials can be eliminated
  • Models with custom helper functions (prophet, ODE models, simplex models, etc.)

Test plan

  • All models verified gradient-equivalent to Stan via BridgeStan (rtol=1e-5, atol=1e-6)
  • Zero pm.Potential usage across all 68 models
  • Zero helper function definitions (only make_model)
  • All numpy operations occur before with pm.Model()
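
The tolerances above can be read as a per-component check on the two gradient vectors. The test plan does not say how rtol and atol are combined, so the sketch below assumes the `numpy.allclose` convention, |c - r| <= atol + rtol * |r|. The function name `gradients_match` and the gradient values are hypothetical and are not taken from the PR's test harness or from BridgeStan.

```python
from typing import Sequence

# Tolerances quoted in the test plan.
RTOL = 1e-5
ATOL = 1e-6


def gradients_match(
    candidate: Sequence[float],
    reference: Sequence[float],
    rtol: float = RTOL,
    atol: float = ATOL,
) -> bool:
    """Return True if every component satisfies |c - r| <= atol + rtol * |r|."""
    if len(candidate) != len(reference):
        return False
    return all(
        abs(c - r) <= atol + rtol * abs(r)
        for c, r in zip(candidate, reference)
    )


# Made-up gradient values, for illustration only.
stan_grad = [0.5, -1.25, 3.0]
pymc_grad = [0.5000001, -1.25, 3.00001]
print(gradients_match(pymc_grad, stan_grad))
```

Comparing gradients instead of raw log densities means a constant offset between the two log densities does not affect the result.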

Add PyMC translations for 68 Stan models that follow idiomatic PyMC
patterns: no Potentials, no custom helper functions, and all numpy
data wrangling occurs before the pm.Model context.

Models span: GLMs (wells, radon, kidscore, logearn variants),
GPs (gp_regr, gp_pois_regr), mixture models, educational models
(eight_schools, rats, diamonds), and ecological models (seeds, dogs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@ricardoV94 ricardoV94 left a comment

Looking pretty nice, left some comments that apply more in general than the places I noted.

Big issues:

  • Lots of dangling comments/logic for "normalization correction/matching Stan" that is not being done anymore
  • Way too many dumb comments in general, let the code speak unless it really seems pertinent
  • Some models have prior_only but most don't. Is this a thing in the original posteriordb, or did the LLM just forget about it?
  • Common to see the same expression defined once in the likelihood and then repeated inside a Deterministic. Not idiomatic at all

If we fix those, this is pretty much ready. Can you push a commit on top of this, instead of regenerating everything and force-pushing? Or even if you regenerate everything, don't force push

twiecki and others added 2 commits March 16, 2026 12:40
Address review comments on PR stan-dev#319:
- Remove unnecessary comments (let the code speak)
- Add prior_only parameter consistently to all 68 models
- Define Deterministics before likelihood, reference in observed
- Fix variable names to match semantics (e.g. logit_p not p)
- Remove dangling code from removed normalization corrections
- Net -464 lines (295 added for prior_only, 759 removed)

Applied via: transpailer refine --skill pymc_idiomaticity --in-place

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- accel_splines: remove rename-only Deterministics, move data before model
- log10earn_height: remove dangling N_obs variable
- low_dim_gauss_mix_collapse: use pm.NormalMixture instead of pm.Mixture
- normal_mixture: use pm.NormalMixture instead of pm.Mixture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

twiecki (Author) commented Mar 16, 2026

@ricardoV94 should be addressed.

@ricardoV94 ricardoV94 left a comment

LGTM

twiecki (Author) commented Mar 19, 2026

@MansMeg this should be good to go.

JTorgander (Collaborator) commented Mar 27, 2026

Big thanks for the contribution!

I’ll run the models through our internal test suite, which appears very similar to what you’ve been using. One initial observation:

Importing the Python modules (corresponding to the PyMC models) raises a NameError on the return type hint:
def make_model(...) -> pm.Model:

This happens because pm is not defined at function definition time. I suggest moving the module imports (e.g., import pymc as pm) to the module level rather than inside the make_model functions.
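
A minimal sketch of the failure mode, written so that it runs without PyMC installed. The wrapper `load_broken_module` is hypothetical and only mimics the reported layout. On Python versions that evaluate annotations eagerly (before 3.14, and without `from __future__ import annotations`), the `def` statement itself raises. On 3.14 and later, where annotation evaluation is deferred, the error only appears once the annotations are requested, which is why the sketch also reads `__annotations__`.

```python
def load_broken_module() -> dict:
    """Mimic the reported layout: `pm` is imported only inside make_model."""

    def make_model(data: dict) -> pm.Model:  # `pm` is not visible in this scope
        import pymc as pm  # never reached in this sketch

        return pm.Model()

    # With deferred annotations (Python 3.14+), the lookup of `pm` happens here.
    return make_model.__annotations__


try:
    load_broken_module()
    error_message = None
except NameError as err:
    error_message = str(err)

print(error_message)
```

With `import pymc as pm` at the top of the module, the name `pm` is already bound when the annotation is evaluated, so the same signature is defined without error.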

"references": "bales2019selecting",
"added_date": "2020-02-01",
"added_by": "Oliver Järnefelt",
"added_by": "Oliver J\u00e4rnefelt",
Collaborator

Please revert this change here and in subsequent files.
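
One possible cause of this diff, stated as an assumption because the thread does not confirm it: if the .info.json files were rewritten with Python's `json` module, its default `ensure_ascii=True` escapes non-ASCII characters such as "ä". The sketch below shows both behaviours on a one-key dictionary; it is not the script used in this PR.

```python
import json

info = {"added_by": "Oliver Järnefelt"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(info)

# ensure_ascii=False keeps the original characters in the output.
preserved = json.dumps(info, ensure_ascii=False)

print(escaped)
print(preserved)
```

Both strings decode to the same data, so the change is cosmetic. Passing `ensure_ascii=False` and writing the file as UTF-8 would keep such lines out of the diff.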


4 participants