Add PyMC v5 model implementations for 108/120 posteriors#318
Add PyMC v5 model implementations for 108/120 posteriors#318twiecki wants to merge 79 commits intostan-dev:masterfrom
Conversation
…oled Transpiled from Stan using pymc-rust-ai-compiler's Stan→PyMC transpiler. All models validated against BridgeStan reference logp values. https://claude.ai/code/session_012idBhKFGF4Ju757RqTpMcD
Add PyMC v5 transpiled models (blr, earn_height, wells_dist, radon_pooled)
|
|
||
| zgp_sigma_1 = pm.Normal("zgp_sigma_1", mu=0, sigma=1, shape=NBgp_sigma_1) | ||
|
|
||
| # Custom potentials to match Stan's exact truncated Student-t implementation |
| import numpy as np | ||
|
|
||
| with pm.Model() as model: | ||
| # Extract data |
There was a problem hiding this comment.
In general I like the first model writing approach better, do the data wrangling outside of pm.Model() so I can skip to it faster
| errs, _ = pytensor.scan( | ||
| fn=step, | ||
| sequences=[y_tensor[1:], y_tensor[:-1]], | ||
| outputs_info=[err_0], | ||
| non_sequences=[mu, phi, theta], | ||
| ) | ||
| err = pt.concatenate([pt.atleast_1d(err_0), errs]) | ||
|
|
||
| # Likelihood: err ~ normal(0, sigma) using Potential | ||
| log_likelihood = pt.sum(pm.logp(pm.Normal.dist(mu=0, sigma=sigma), err)) | ||
| pm.Potential("likelihood", log_likelihood) |
There was a problem hiding this comment.
Did it try this approach? https://www.pymc.io/projects/examples/en/latest/time_series/Time_Series_Generative_Graph.html
It can definitely do ar and ma, so arma should be fine? Older POC: https://gist.github.com/ricardoV94/a49b2cc1cf0f32a5f6dc31d6856ccb63#file-pymc_timeseries_ma-ipynb
| # Parameters - use Flat priors and add manual normal log prob to match Stan exactly | ||
| theta = pm.Flat("theta", shape=nChild) | ||
|
|
||
| # Add manual prior to match Stan's normal(0, 36) exactly |
|
Too eager to jump to pm.Potential |
|
The discrete marginalization /HMMM would be great test cases for missing functionality in |
Yes, that's actually running right now. |
oh, you mean currently marginalize can't solve those? |
Dunno. It's restricted to what graphs it allow marginalization. I didn't take a look, I assumed you excluded all these cases, reading the top message. |
Adds run_compile_to_rust.py batch script that uses the transpailer agentic loop (Claude + cargo build + logp validation) to compile PyMC models to optimized Rust logp+gradient implementations. Successfully compiled models: blr, diamonds, kidscore_interaction, kidscore_interaction_c, kidscore_interaction_c2, kidscore_interaction_z. Each model includes generated.rs and optimization trace (results.tsv). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add PyMC-to-Rust compilation pipeline (6/104 models)
|
This is very good news. We have been working to find a way to start to add the models. I think we so far has been adding PyMC models and testing them. @JTorgander, what do you think about this PR? Is there a way to accept them in bulk using our testing process? |
Hand-ported Stan models using Stan ILR+softmax simplex transform (Helmert sub-matrix basis) with correct Jacobians. All 4 models pass gradient validation against BridgeStan at rtol=1e-5, atol=1e-6. Models added: - ldaK2: Latent Dirichlet Allocation (K=2 topics) - ldaK5: Latent Dirichlet Allocation (K=5 topics) - hmm_drive_1: Hidden Markov Model with bivariate emissions - hierarchical_gp: Hierarchical Gaussian Process with variance decomposition Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address reviewer feedback on PR stan-dev#318: - Replace pt.dot()/pm.math.dot() with @ operator (16 files) - Remove constant correction Potentials that don't affect sampling (30 files) - Remove unnecessary shape=1 special cases in accel_splines - Replace Flat+Potential prior pattern with pm.Normal in bones_model These changes make the transpiled models more idiomatic PyMC while preserving gradient equivalence with the original Stan models. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hand-ported Stan models using Stan ILR+softmax simplex transform (Helmert sub-matrix basis) with correct Jacobians. All 4 models pass gradient validation against BridgeStan at rtol=1e-5, atol=1e-6. Models added: - ldaK2: Latent Dirichlet Allocation (K=2 topics) - ldaK5: Latent Dirichlet Allocation (K=5 topics) - hmm_drive_1: Hidden Markov Model with bivariate emissions - hierarchical_gp: Hierarchical Gaussian Process with variance decomposition Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Porting guide for remaining 16 Stan-to-PyMC models
|
@twiecki can you ask it to select the models that are idiomatic (no weird helper functions, no potentials) and split those into a separate PR? Also do ask it do all the numpy data wrangling before entering |
Add 4 PyMC models with simplex parameters
Add initvals to 4 models with initialization issues (Flat/HalfFlat priors, ordered transforms, high-dimensional latent params). Priors unchanged so logp tests remain valid. Transpile hmm_drive_1 (forward algorithm HMM). - accel_splines: initval on Flat spline coefficients + Truncated sds - low_dim_gauss_mix: initval=[-1, 1] for ordered mu - lsat_model: initval=zeros for 1000 latent thetas - kidscore_mom_work: initval for Flat beta + HalfFlat sigma - hmm_drive_1: new transpilation with forward algorithm as Potential Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix sampling failures in 5 transpiled PyMC models
ricardoV94
left a comment
There was a problem hiding this comment.
some code smell in some of those initval. Ordered usually needs it, but others are already zero by default. I assume your bot is not using model.initial_point but trying to read model.rvs_to_initial_point (or whatever is called directly)
Tested each model without initvals to find the minimum set: - accel_splines: only Truncated sds needs initval (Flat defaults to 0) - low_dim_gauss_mix: only ordered mu needs initval - hmm_drive_1: only ordered phi/lambda need initval - kidscore_mom_work: no initvals needed (samples fine with defaults) - lsat_model: no initvals needed (samples fine with defaults) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove redundant initvals, keep only where needed
|
Should be set. |
Benchmark PyMC (nutpie/numba) vs Stan (cmdstan) on all posteriordb models. Results: PyMC faster on 52/101 models, geometric mean 1.30x speedup. Includes benchmark script, per-model results, and visualization plots. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch primary metric from raw sampling time to total wall-clock time (compile + sample) per effective sample. PyMC wins 85/101 models (1.90x geo mean) on this end-to-end efficiency metric. Add separate plots for sec/ESS sampling-only, sec/ESS total, raw time, and total time. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace pm.Flat + pm.Potential anti-pattern with proper distributions in surgical_model, logmesquite_logva, logmesquite_logvas, and arma11. surgical_model now converges (was Rhat=4.03 with 3615 divergences). Updated results: PyMC wins 87% of models on total sec/ESS (was 84%), geometric mean advantage 2.04x (was 1.90x). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
This PR adds PyMC v5 implementations for 104 out of 120 Stan models in posteriordb, along with comprehensive test infrastructure to verify correctness.
Every transpiled model has been validated for gradient equivalence against the original Stan implementation via BridgeStan. Gradients are invariant to normalization constants, making them a stricter correctness check than log-density comparisons alone. Additionally, log-density values are verified to match up to a constant offset (accounting for Stan's convention of dropping normalizing constants).
All models follow idiomatic PyMC v5 patterns:
pytensor.scanfor true sequential dependencies (GARCH, state-space, HMMs)pm.HalfNormal,pm.HalfCauchyetc. for lower-bounded parametersmake_model(data: dict) -> pm.ModelinterfaceModel coverage
blr,earn_height,logistic_regression_rhs(regularized horseshoe)radon_*,election88_full,eight_schools_*garch11,arK,prophet2pl_latent_reg_irt,hier_2pl,grsm_latent_reg_irt,irt_2plhmm_example,hmm_drive_0(forward algorithm)Mh_model,Mth_model,Mtbh_model,Rate_*lotka_volterra,one_comp_mm_elim_absnormal_mixture,low_dim_gauss_mixRemaining 16 models
These require specialized handling beyond the current transpiler capabilities:
hmm_gaussian,hmm_drive_1,iohmm_reg)hierarchical_gp,kronecker_gp) — needpm.gpAPIldaK2,ldaK5) — discrete marginalizationsir,covid19imperial_v2/v3)nn_rbm1bJ10,nn_rbm1bJ100)Test infrastructure
Two test suites are included:
test_transpiled_models.py— Compares log-density values between PyMC and BridgeStan at multiple parameter points. Allows for constant offsets from normalization conventions.tests/test_pymc_gradients.py— Compares gradients of the log-density, which are invariant to additive constants. This is the primary correctness verification since identical gradients guarantee identical posterior geometry.Both test suites auto-discover all models that have both a Stan and PyMC implementation, so new models are automatically tested.
Review
I realize this is a lot of code to review at once. I'm happy to:
Let me know what works best.
Review checklist (104 models)
2pl_latent_reg_irtaccel_gpaccel_splinesarKarma11blrbones_modelbym2_offset_onlydiamondsdogsdogs_hierarchicaldugongs_modelearn_heighteight_schools_centeredeight_schools_noncenteredelection88_fullgarch11GLM_Binomial_modelGLM_Poisson_modelGLMM_Poisson_modelGLMM1_modelgp_pois_regrgp_regrgrsm_latent_reg_irthier_2plhmm_drive_0hmm_exampleirt_2plkidscore_interactionkidscore_interaction_ckidscore_interaction_c2kidscore_interaction_zkidscore_mom_workkidscore_momhskidscore_momhsiqkidscore_momiqkilpisjarvilog10earn_heightlogearn_heightlogearn_height_malelogearn_interactionlogearn_interaction_zlogearn_logheight_malelogistic_regression_rhslogmesquitelogmesquite_logvalogmesquite_logvaslogmesquite_logvashlogmesquite_logvolumelosscurve_sisloblotka_volterralow_dim_gauss_mixlow_dim_gauss_mix_collapselsat_modelM0_modelMb_modelmesquiteMh_modelMt_modelMtbh_modelMth_modelmulti_occupancynesnes_logit_modelnormal_mixturenormal_mixture_kone_comp_mm_elim_abspilotsprophetradon_countyradon_county_interceptradon_hierarchical_intercept_centeredradon_hierarchical_intercept_noncenteredradon_partially_pooled_centeredradon_partially_pooled_noncenteredradon_pooledradon_variable_intercept_centeredradon_variable_intercept_noncenteredradon_variable_intercept_slope_centeredradon_variable_intercept_slope_noncenteredradon_variable_slope_centeredradon_variable_slope_noncenteredRate_1_modelRate_2_modelRate_3_modelRate_4_modelRate_5_modelrats_modelseeds_centered_modelseeds_modelseeds_stanified_modelsesame_one_pred_astate_space_stochastic_level_stochastic_seasonalsurgical_modelSurvey_modelwells_daae_c_modelwells_dae_c_modelwells_dae_inter_modelwells_dae_modelwells_distwells_dist100_modelwells_dist100ars_modelwells_interaction_c_modelwells_interaction_model🤖 Generated with Claude Code