Blind Redistricting

Geography-only congressional redistricting for all 50 US states using weighted ReCom MCMC.

This project draws congressional district maps using only geographic data — no partisan affiliation, no demographic data, no electoral history. The algorithm optimises for compactness, population equality, and county integrity. Partisan lean is computed after the fact as a post-hoc audit, never as an input.

What it does

For each state the pipeline:

Downloads TIGER 2020 census geography (VTD precincts, census blocks, counties, roads)
Builds a weighted precinct dual graph. Edge weights encode administrative-unit proximity (same county → 10×, same city → 5×, same township → 3×; an edge that crosses a major road has its weight divided by 2). Only shared borders of at least 50 m count as adjacencies, which suppresses spurious point-touch contacts
Samples an ensemble of valid district plans with Weighted ReCom MCMC (DeFord, Duchin & Solomon 2021)
Selects two Pareto-optimal maps from the ensemble
Renders choropleth maps with optional post-hoc 2020 presidential lean overlay (VEST data)

The pipeline covers all 44 multi-district states (429 seats). The 6 at-large states with K=1 are skipped; they account for the remaining 6 of the 435 House seats.

Algorithm

Weighted ReCom MCMC

Each MCMC step:

Pick two adjacent districts at random
Merge their precincts into one region and build a random spanning tree, biased by edge weights
Cut the spanning tree to split the region into two near-equal-population halves
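Accept the new partition unconditionally, with no Metropolis-Hastings correction (as DeFord, Duchin & Solomon note, ReCom without such a correction is not a uniform sampler over valid plans; it favours compact ones)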
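
The steps of a single move can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's sampler: the names `random_spanning_tree`, `balanced_cut` and `recom_step`, the edge-list graph representation, the `epsilon` population tolerance, and the way edge weights bias the tree (Kruskal over randomised keys) are all choices made for this example.

```python
import random

def random_spanning_tree(nodes, edges, weight, rng):
    """Kruskal over randomised keys, so heavier edges tend to join the tree first."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    # Assumed biasing scheme: key = U^(1/w); a larger w pushes the key towards 1.
    keys = {e: rng.random() ** (1.0 / weight[e]) for e in edges}
    tree = {n: [] for n in nodes}
    for u, v in sorted(edges, key=keys.get, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree[u].append(v)
            tree[v].append(u)
    return tree

def balanced_cut(tree, pop, epsilon, rng):
    """Cut one tree edge so both sides are within epsilon of half the population."""
    root = next(iter(tree))
    parent_of, order, stack = {root: None}, [], [root]
    while stack:
        n = stack.pop()
        order.append(n)
        for m in tree[n]:
            if m not in parent_of:
                parent_of[m] = n
                stack.append(m)
    below = {n: pop[n] for n in tree}
    for n in reversed(order):  # descendants are visited before their ancestors
        if parent_of[n] is not None:
            below[parent_of[n]] += below[n]
    ideal = below[root] / 2
    cuts = [n for n in order[1:] if abs(below[n] - ideal) <= epsilon * ideal]
    if not cuts:
        return None
    top = rng.choice(cuts)  # cutting the edge above `top` detaches its subtree
    side, stack = set(), [top]
    while stack:
        n = stack.pop()
        side.add(n)
        stack.extend(m for m in tree[n] if parent_of[m] == n)
    return side

def recom_step(assignment, edges, pop, weight, epsilon, rng):
    """One move: merge two adjacent districts, draw a tree, cut it, accept."""
    pairs = sorted({tuple(sorted((assignment[u], assignment[v])))
                    for u, v in edges if assignment[u] != assignment[v]})
    a, b = rng.choice(pairs)
    nodes = [n for n in assignment if assignment[n] in (a, b)]
    inner = [(u, v) for u, v in edges
             if assignment[u] in (a, b) and assignment[v] in (a, b)]
    tree = random_spanning_tree(nodes, inner, weight, rng)
    side = balanced_cut(tree, pop, epsilon, rng)
    if side is None:
        return assignment  # no balanced cut in this tree; keep the current plan
    new = dict(assignment)
    for n in nodes:
        new[n] = a if n in side else b
    return new

# Toy 4×4 grid of unit-population precincts, split into left and right halves.
cells = [(r, c) for r in range(4) for c in range(4)]
edges = ([((r, c), (r, c + 1)) for r in range(4) for c in range(3)]
         + [((r, c), (r + 1, c)) for r in range(3) for c in range(4)])
pop = {n: 1 for n in cells}
weight = {e: 1.0 for e in edges}
plan = {(r, c): 0 if c < 2 else 1 for r, c in cells}
rng = random.Random(0)
for _ in range(20):
    plan = recom_step(plan, edges, pop, weight, 0.25, rng)
```

When a drawn tree has no balanced cut, this sketch simply keeps the current plan; a production sampler would more likely redraw the tree.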

Edge weight formula:

w(e) = base × W_SAME_COUNTY^β × W_SAME_PLACE^β × W_SAME_COUSUB^β / W_ROAD^β

where β amplifies the contrast, making intra-county cuts far more likely than cross-county cuts. β is adaptive: β=0.5 for K=2 states (WV, ME, NH, ID, MT, RI, HI) and β=2.0 for all others — low-K states have sparse county structure and over-penalising county crossings produces degenerate maps.
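
As a worked instance of the formula, the sketch below computes one edge's weight. It assumes that each factor applies only when the edge satisfies the corresponding condition, and that the constants take the 10×, 5×, 3× and ÷2 values quoted in the pipeline overview; the `EdgeInfo` record and the function names are hypothetical.

```python
from dataclasses import dataclass

# Multipliers from the pipeline overview; constant names follow the formula.
W_SAME_COUNTY = 10.0
W_SAME_PLACE = 5.0
W_SAME_COUSUB = 3.0
W_ROAD = 2.0

@dataclass
class EdgeInfo:
    """Hypothetical per-edge flags for two adjacent precincts."""
    same_county: bool
    same_place: bool
    same_cousub: bool
    crosses_major_road: bool

def adaptive_beta(k: int) -> float:
    """beta = 0.5 for two-district states, 2.0 otherwise."""
    return 0.5 if k == 2 else 2.0

def edge_weight(info: EdgeInfo, beta: float, base: float = 1.0) -> float:
    """Apply each factor only when its condition holds (an assumption)."""
    w = base
    if info.same_county:
        w *= W_SAME_COUNTY ** beta
    if info.same_place:
        w *= W_SAME_PLACE ** beta
    if info.same_cousub:
        w *= W_SAME_COUSUB ** beta
    if info.crosses_major_road:
        w /= W_ROAD ** beta
    return w

# Same county, different city and township, split by a major road, K = 8:
e = EdgeInfo(same_county=True, same_place=False, same_cousub=False,
             crosses_major_road=True)
w = edge_weight(e, adaptive_beta(8))  # 10^2 / 2^2 = 25.0
```

With β=0.5 the same-county factor shrinks from 100 to about 3.16, which is why the contrast between intra-county and cross-county edges is much weaker in K=2 states.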

Selection

From the ensemble, two maps are chosen via Pareto optimality across five objectives:

Objective	Direction
Polsby–Popper mean (compactness)	Maximise
County splits	Minimise
Cut edges	Minimise
Max districts per county (worst-case fragmentation)	Minimise
`cut_border_m` (total cross-county boundary length)	Minimise

A peninsula filter is applied before Pareto selection: any plan where the min or mean cut-border ratio is below 5% (indicating point-touch or thin-peninsula artefacts) is rejected.

Two representative plans are extracted:

Best compact — highest Polsby–Popper mean within the Pareto frontier
Fewest splits — minimum county fragmentation within the Pareto frontier
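
The selection logic can be illustrated with a small Pareto filter over per-plan metric records. This is a sketch, not the contents of select.py: the dictionary keys other than `pp_mean` and `cut_border_m` are hypothetical names for the objectives in the table, and the quadratic scan is chosen for clarity rather than speed.

```python
# (metric key, sign): +1 means maximise, -1 means minimise.
OBJECTIVES = [
    ("pp_mean", +1),
    ("county_splits", -1),             # hypothetical key
    ("cut_edges", -1),                 # hypothetical key
    ("max_districts_per_county", -1),  # hypothetical key
    ("cut_border_m", -1),
]

def dominates(a, b):
    """True if plan a is no worse than b on every objective and better on one."""
    no_worse = all(s * a[k] >= s * b[k] for k, s in OBJECTIVES)
    better = any(s * a[k] > s * b[k] for k, s in OBJECTIVES)
    return no_worse and better

def pareto_frontier(plans):
    """Keep every plan that no other plan dominates (O(n^2) scan)."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]

def pick_representatives(plans):
    """Return (best compact, fewest splits) from the frontier."""
    frontier = pareto_frontier(plans)
    best_compact = max(frontier, key=lambda p: p["pp_mean"])
    fewest_splits = min(frontier, key=lambda p: p["county_splits"])
    return best_compact, fewest_splits

# Made-up metrics for three plans; plan C is dominated by both A and B.
plans = [
    {"id": "A", "pp_mean": 0.30, "county_splits": 5, "cut_edges": 100,
     "max_districts_per_county": 3, "cut_border_m": 1000.0},
    {"id": "B", "pp_mean": 0.20, "county_splits": 2, "cut_edges": 110,
     "max_districts_per_county": 2, "cut_border_m": 1200.0},
    {"id": "C", "pp_mean": 0.19, "county_splits": 5, "cut_edges": 120,
     "max_districts_per_county": 3, "cut_border_m": 1300.0},
]
compact, splits = pick_representatives(plans)
```

Here plan A is returned as best compact and plan B as fewest splits; neither dominates the other, because A wins on compactness while B wins on county splits.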

Hard filters (applied before Pareto)

Filter	Threshold
`pp_min` (worst district compactness)	≥ 0.10
`pop_dev_max` (worst district population deviation)	≤ 0.5%
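
A minimal sketch of how these two thresholds might be checked for one plan follows. It assumes that `pop_dev_max` is measured relative to the ideal (mean) district population and uses the standard Polsby–Popper score 4πA/P²; the function names and the list-based inputs are illustrative only.

```python
import math

PP_MIN_THRESHOLD = 0.10        # worst district compactness must be at least this
POP_DEV_MAX_THRESHOLD = 0.005  # worst population deviation must be at most 0.5%

def polsby_popper(area, perimeter):
    """Polsby-Popper compactness: 4*pi*A / P^2, equal to 1.0 for a circle."""
    return 4 * math.pi * area / perimeter ** 2

def passes_hard_filters(district_pp, district_pop):
    """Check one plan given per-district compactness scores and populations."""
    ideal = sum(district_pop) / len(district_pop)
    pp_min = min(district_pp)
    # Assumption: deviation is relative to the ideal district population.
    pop_dev_max = max(abs(p - ideal) / ideal for p in district_pop)
    return pp_min >= PP_MIN_THRESHOLD and pop_dev_max <= POP_DEV_MAX_THRESHOLD

unit_square = polsby_popper(area=1.0, perimeter=4.0)  # pi / 4, about 0.785
ok = passes_hard_filters([0.20, 0.15], [1000, 1002])
too_thin = passes_hard_filters([0.20, 0.05], [1000, 1002])
unbalanced = passes_hard_filters([0.20, 0.20], [1000, 1020])
```

Only the first plan passes: the second has a district below the compactness floor, and the third has a district about 1% away from the ideal population.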

Completed states

All 44 multi-district states are complete (429 redistricted seats). The 6 at-large states (AK, DE, ND, SD, VT, WY) have K=1 and are not redistricted; they account for the remaining 6 of the 435 House seats.

State	K	Ensemble	pp_mean	State	K	Ensemble	pp_mean
Alabama	7	2,000	0.180	Missouri	8	2,000	0.222
Arizona	9	2,000	0.256	Montana	2	2,000	0.387
Arkansas	4	2,000	0.218	Nebraska	3	2,000	0.409
California	52	100,000	0.174	Nevada	4	2,000	0.402
Colorado	8	2,000	0.206	New Hampshire	2	2,000	0.357
Connecticut	5	2,000	0.343	New Jersey	12	2,000	0.197
Florida	28	100,000	0.205	New Mexico	3	2,000	0.280
Georgia	14	2,000	0.145	North Carolina	14	2,000	0.190
Hawaii	2	2,000	0.252	New York	26	100,000	0.188
Idaho	2	2,000	0.318	Ohio	15	3,750	0.166
Illinois	17	2,000	0.196	Oklahoma	5	2,000	0.214
Indiana	9	2,000	0.239	Oregon	6	2,000	0.260
Iowa	4	2,000	0.303	Pennsylvania	17	2,000	0.181
Kansas	4	2,000	0.370	Rhode Island	2	2,000	0.515
Kentucky	6	2,000	0.188	South Carolina	7	2,000	0.223
Louisiana	6	2,000	0.160	Tennessee	9	2,000	0.164
Maine	2	2,000	0.260	Texas	38	100,000	0.170
Maryland	8	2,000	0.195	Utah	4	2,000	0.339
Massachusetts	9	2,000	0.223	Virginia	11	2,750	0.195
Michigan	13	2,000	0.227	Washington	10	2,500	0.189
Minnesota	8	2,000	0.213	West Virginia	2	20,000	0.139
Mississippi	4	2,000	0.225	Wisconsin	8	2,000	0.224

Ensemble sizes other than 2,000 mark an enhanced step count. CA/FL/NY/TX used 100k steps; WV used 20k steps with adaptive β=0.5 (see STATE_NOTES.md).

Example maps

Maps are generated locally by running render_state.py after the pipeline completes for a state. Outputs land in data/{abbr}/final/:

map_compact.png — best Polsby–Popper mean
map_fewest_splits.png — fewest county splits
map_comparison.png — side-by-side with optional enacted-map column
map_comparison_lean.png — same, with VEST 2020 presidential lean badges

Repository layout

.
├── pipeline/
│   ├── download.py       # Download TIGER 2020 geography files
│   ├── build_graph.py    # Build weighted precinct dual graph
│   ├── sample.py         # Weighted ReCom MCMC sampler (checkpoint/resume)
│   ├── select.py         # Pareto-optimal map selection
│   └── lean.py           # Post-hoc partisan lean (VEST 2020)
│
├── run_all_states.py     # Parallel driver for all 50 states
├── render_state.py       # Render maps for any completed state
├── render_us_map.py      # Render national overview map
├── state_configs.py      # StateConfig dataclass + registry (all 50 states)
│
└── data/
    └── {abbr}/
        ├── graph/        # Precinct dual graph (.gpickle) + GeoPackage
        ├── ensemble/     # plans.parquet, metrics.parquet (chunked writes)
        └── final/        # Selected maps (.gpkg), PNGs, stats JSONs, report

Running the pipeline

Requirements

pip install -r requirements.txt

Key dependencies: geopandas, networkx, gerrychain >= 0.3.1, pyarrow, shapely, tqdm.

GerryChain note: PyPI may lag behind. If weighted spanning tree support is missing, install from source:
pip install git+https://github.com/mggg/GerryChain.git

Run a single state

# Full pipeline: download → graph → sample → select
python run_all_states.py --state MD

# Override MCMC step count (default scales with K)
python run_all_states.py --state TN --base-steps 5000

Run all states in parallel

# Uses all CPU cores by default; --workers sets the number of parallel processes
python run_all_states.py --workers 4

MCMC steps scale proportionally with the number of districts:

steps(state) = round(BASE_STEPS × K / 8)

where K is the number of congressional seats and 8 is Maryland's baseline. A state with 16 districts gets twice as many steps.
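
The scaling rule is simple enough to check by hand. The helper below is a sketch of the formula only; the function name is hypothetical and the base of 2,000 steps is an assumption used for the example.

```python
def scaled_steps(base_steps: int, k: int, baseline_k: int = 8) -> int:
    """steps(state) = round(BASE_STEPS * K / 8), with Maryland (K=8) as baseline."""
    return round(base_steps * k / baseline_k)

maryland = scaled_steps(2000, 8)   # 2000: the baseline state is unchanged
doubled = scaled_steps(2000, 16)   # 4000: twice the districts, twice the steps
ohio = scaled_steps(2000, 15)      # 3750
```

Note that Python's `round` rounds exact halves to the nearest even integer, which only matters when `BASE_STEPS × K / 8` lands exactly on .5.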

Render maps

# Render maps for one or more completed states
python render_state.py MD
python render_state.py MD TN GA

# Render the national overview
python render_us_map.py

Checkpoint and resume

The sampler writes a chunk every 1,000 steps and saves a checkpoint atomically. If a run is interrupted, simply re-run the same command — it resumes from the last checkpoint automatically:

python run_all_states.py --state CA --steps 100000
# Interrupted? Just re-run the same command.
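
One common way to get an atomic checkpoint is to write to a temporary file in the same directory and then rename it over the old one. The sketch below shows that pattern with a resume loop; it is an illustration under assumptions, not the sampler's actual code. The JSON format, the `step` field and the function names are hypothetical, and the chunk of sampling work is replaced by a counter increment.

```python
import json
import os
import tempfile

def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk first
        os.replace(tmp, path)     # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise

def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0}
    with open(path) as f:
        return json.load(f)

def run(path, total_steps, chunk=1000):
    """Resume from the last checkpoint and continue in fixed-size chunks."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        # Stand-in for sampling one chunk of MCMC steps and writing its output.
        state["step"] = min(state["step"] + chunk, total_steps)
        save_checkpoint(path, state)
    return state
```

Because the rename either happens completely or not at all, an interrupted run leaves the previous checkpoint intact, and calling `run` again with the same path picks up from the last completed chunk.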

Data sources

Data	Source	License
VTD precincts	US Census TIGER 2020	Public domain
Census blocks (population)	US Census TIGER 2020 (POP20)	Public domain
Counties, roads, places	US Census TIGER 2020	Public domain
Current congressional districts	US Census TIGER 2022 (CD118)	Public domain
Partisan lean (post-hoc only)	VEST 2020	CC BY 4.0

What this is not

Not a partisan gerrymander detector — the algorithm has no access to partisan data.
Not legally certified — maps are illustrative; real redistricting requires VRA compliance review, public comment, and legislative approval.
Not a claim about optimal maps — the Pareto frontier contains many equally valid maps; these two are representative, not unique.

The post-hoc lean overlay is provided for transparency and curiosity. It shows what the geography happened to produce, not what was aimed for.

References

DeFord, D., Duchin, M., & Solomon, J. (2021). Recombination: A family of Markov chains for redistricting. Harvard Data Science Review, 3(1).
Polsby, D. D., & Popper, R. D. (1991). The third criterion: Compactness as a procedural safeguard against partisan gerrymandering. Yale Law & Policy Review, 9(2), 301–353.
Metric Geometry and Gerrymandering Group. GerryChain. Python library for ReCom MCMC sampling.

License

MIT — see LICENSE.

Maps and statistical outputs are derived entirely from US Census public-domain data and may be freely used.
