BibCrit v2.5

Free, open-access web tool for biblical textual criticism at bibcrit.com. Compare MT, LXX, and Dead Sea Scrolls; reconstruct Hebrew Vorlagen; profile scribal tendencies; detect theological revisions; track patristic citations; model numerical discrepancies; detect literary structures (chiasm, inclusios, parallel panels); identify documentary source layers (J/E/D/P); and visualize manuscript genealogies — all in a browser, in English and Spanish.

Screenshots

MT/LXX Divergence Analyzer
Back-Translation Workbench
Scribal Tendency Profiler
Numerical Discrepancy Modeler
DSS Bridge Tool
Theological Revision Detector
Patristic Citation Tracker
Manuscript Genealogy

Tools

# Tool Route Description
1 MT/LXX Divergence Analyzer /divergence Word-level Hebrew/Greek comparison with alignment scoring. Claude classifies each divergence (different_vorlage, theological_tendency, scribal_error, etc.), assigns confidence, and generates competing scholarly hypotheses. Exports SBL footnotes, BibTeX, RIS (Zotero), and TEI XML. Prompt: divergence_v2.
2 Scribal Tendency Profiler /scribal Statistical fingerprint of an LXX book's translator across five dimensions: literalness, anthropomorphism reduction, messianic heightening, harmonization, and paraphrase rate. Rendered as a D3.js radar chart with per-dimension evidence. Supports two-book comparison. Prompt: scribal_v1.
3 Numerical Discrepancies /numerical Surfaces numerical divergences (patriarchal ages, census figures, temple dimensions, etc.) across MT, LXX, and Samaritan Pentateuch, ranking competing theories by confidence. Prompt: numerical_v3.
4 Ancient Witness Bridge (DSS) /dss Compare a passage across five ancient witnesses: Dead Sea Scrolls (1QIsaᵃ and others), Samaritan Pentateuch, Peshitta (Syriac OT), MT, and LXX. Shows which witnesses attest the passage, alignment, and specific divergences. Prompt: dss_v6.
5 Theological Revisions /theological Identifies theologically motivated textual changes — anthropomorphism avoidance, messianic heightening, polemical alterations, harmonization. Prompt: theological_v1.
6 Patristic Citation Tracker /patristic Traces Church Father citations (1st–5th century), identifies the text form used, and visualizes text-form distribution as a bar chart. Each citation links to BiblIndex for primary source access. Prompt: patristic_v3.
7 Back-Translation Workbench /backtranslation Reconstructs the probable Hebrew Vorlage word-by-word from LXX Greek using Tov's retroversion methodology, with confidence levels and summary assessments. Prompt: backtranslation_v1.
8 Manuscript Genealogy /genealogy Visualizes the full transmission stemma of a biblical book — from proto-text through manuscript families (MT, LXX, DSS, SP, Peshitta, Targum, Vulgate) to modern critical editions. Prompt: genealogy_v1.
9 NT Use of OT Analyzer /nt-ot Enter a New Testament passage and identify every OT allusion it contains. For each allusion, determines whether the NT author cited MT, LXX, an independent form, or a conflation — applying the methodology of Beale & Carson, Stanley, and Hays. Prompt: nt_ot_v1.
10 Chiasm & Literary Structure Detector /chiasm Detects concentric literary structures (A-B-C-B′-A′), parallel panels, inclusios, and refrains. Maps each structural element with its mirror partner and identifies the focal turning point. Methodology: Lund, Welch, Dorsey, Walsh. Prompt: chiasm_v1.
11 Source Criticism Tool /source Assigns documentary source designations (J, E, D, P, Redactor) to Pentateuchal units using classical criteria: divine name usage (YHWH vs. Elohim), vocabulary patterns, doublets, and narrative tensions. Scholarly grounding: Wellhausen, Friedman, Baden. Prompt: source_v1.

Data Sources

Corpus Source Path
MT ETCBC/BHSA via Text-Fabric data/corpora/mt_etcbc/
LXX Rahlfs (ingested via ingest_lxx_rahlfs.py) data/corpora/lxx_stepbible/
DSS ETCBC/DSS via Text-Fabric — 1QIsaᵃ, 4QSamᵃ, 11QPaleoLev, 4QDeutn data/corpora/dss/
SP dt-ucph/sp via Text-Fabric data/corpora/sp_etcbc/
GNT SBLGNT data/corpora/gnt_opengnt/
PESH ETCBC/peshitta via Text-Fabric (SEDRA / Beth Mardutho) — 39 OT books, 308,863 words data/corpora/pesh_etcbc/

License note: The ETCBC corpora (MT/BHSA, DSS, and Peshitta) are released under CC-BY-NC 4.0. The app code is Apache 2.0; the corpus data it ingests retains its own license terms. Do not use the ingested ETCBC data for commercial purposes without a separate agreement with ETCBC.


Tech Stack

Layer Technology
Web framework Flask 3.0+
AI analysis Anthropic Python SDK 0.30+ · model: claude-sonnet-4-5-20250929
Visualization D3.js v7 (radar charts, bar charts)
Persistence Supabase (PostgreSQL) + disk JSON fallback
Production server Gunicorn (1 worker, 2 threads)
Fonts Space Grotesk, Noto Sans Hebrew, Noto Serif
Deploy target Render (Python 3.11, render.yaml included)

Architecture

BibCrit/
├── app.py                      # Flask app factory; lazy _init() wires corpus + pipeline
├── state.py                    # Shared singletons: corpus, pipeline, i18n, TranslationProxy
├── requirements.txt
├── render.yaml                 # One-click Render deploy config
│
├── blueprints/
│   ├── textual.py              # /divergence, /backtranslation, /dss, /genealogy + APIs
│   ├── critical.py             # /scribal, /numerical, /theological, /patristic + APIs
│   ├── literary.py             # /chiasm, /source + SSE APIs
│   ├── discovery.py            # /discovery, /api/discovery/cards, /api/admin/discovery/flag
│   └── research.py             # /health, /guide
│
├── biblical_core/
│   ├── claude_pipeline.py      # ClaudePipeline: Claude calls, Supabase cache, budget tracking
│   ├── corpus.py               # BiblicalCorpus: loads MT, LXX, DSS, SP, GNT
│   ├── ref_utils.py            # Reference string parsing; per-tool verse-count limits
│   └── divergence.py           # parse_claude_response, format_sbl_footnote, format_bibtex
│
├── data/
│   ├── i18n.json               # All UI strings (en, es)
│   ├── prompts/                # Versioned prompt templates ({tool}_{version}.txt)
│   ├── cache/                  # Disk-based analysis cache fallback ({sha256}.json)
│   └── corpora/
│       ├── mt_etcbc/           # Masoretic Text (ETCBC/BHSA morphology)
│       ├── lxx_stepbible/      # Septuagint (Rahlfs)
│       ├── dss/                # Dead Sea Scrolls (ETCBC, primarily 1QIsaᵃ)
│       ├── sp_etcbc/           # Samaritan Pentateuch (dt-ucph/sp via ETCBC)
│       └── gnt_opengnt/        # Greek New Testament (SBLGNT)
│
├── scripts/
│   ├── precache_all.py         # Seed 91 featured passages in English
│   ├── precache_es.py          # Translate all 91 passages to Spanish
│   ├── push_cache_to_supabase.py  # Push disk cache → Supabase
│   ├── ingest_mt.py            # ETCBC/BHSA → CSV
│   ├── ingest_lxx.py           # LXX (STEP) → CSV
│   ├── ingest_lxx_rahlfs.py    # LXX (Rahlfs) → CSV
│   ├── ingest_dss_1qisaa.py    # ETCBC/DSS (1QIsaᵃ) → CSV
│   ├── ingest_sp.py            # SP (dt-ucph/sp) → CSV
│   └── ingest_gnt.py           # SBLGNT → CSV
│
├── templates/                  # Jinja2 templates extending base.html
└── static/                     # CSS (bibcrit.css, style.css), JS per-tool, SVG assets

Key design decisions

  • state.py holds no blueprint or app imports, preventing circular dependencies. Blueprints read state.corpus and state.pipeline directly.
  • app._init() runs before the first request (thread-safe double-checked locking) to keep startup fast.
  • SSE (Server-Sent Events) streams real-time progress steps (step / done / error frames) to the browser while Claude analyzes.
  • Budget checks happen before every API call; once spend_usd >= cap_usd the endpoint returns an error frame without calling the API.

Cache system

Every analysis result is keyed by:

cache_key = SHA256("{reference}|{tool}|{prompt_version}|{model}")
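In Python terms, the key derivation looks like this (a sketch — the exact field formatting in claude_pipeline.py may differ):

```python
import hashlib

def cache_key(reference: str, tool: str, prompt_version: str, model: str) -> str:
    """SHA-256 over the four identifying fields, pipe-separated."""
    raw = f"{reference}|{tool}|{prompt_version}|{model}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

key = cache_key("Isaiah 7:14", "divergence", "divergence_v2",
                "claude-sonnet-4-5-20250929")
```

Because the prompt version and model are part of the key, bumping either one automatically invalidates stale cached analyses.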

English (EN): written to both Supabase (analysis_cache table) and disk (data/cache/{sha256}.json). Supabase is the primary read path; disk is the fallback.
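The EN read path (Supabase first, disk fallback) can be sketched as follows; `read_cached_analysis` and the injected `supabase_lookup` callable are hypothetical names, not the actual functions in claude_pipeline.py:

```python
import json
from pathlib import Path

def read_cached_analysis(key: str, cache_dir: Path, supabase_lookup=None):
    """Try Supabase first, then fall back to data/cache/{sha256}.json on disk.

    `supabase_lookup` is an injected callable (key -> dict | None), so the
    same logic works when Supabase is not configured (lookup is None).
    """
    if supabase_lookup is not None:
        row = supabase_lookup(key)
        if row is not None:
            return row
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return None  # cache miss — run the Claude pipeline
```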

Spanish (ES): stored in Supabase only (analysis_cache_es table). Generated by scripts/precache_es.py, which translates cached EN analyses rather than re-running the full Claude pipeline.


Getting Started

1. Clone

git clone https://github.com/Jossifresben/bibcrit.git
cd bibcrit

2. Install dependencies

python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -r requirements.txt

3. Create .env

cp .env.example .env           # or create from scratch — see table below

4. Run

flask run

The app is available at http://127.0.0.1:5000.

For production-like local testing:

gunicorn app:app --workers 1 --threads 2 --timeout 120

Environment Variables

Variable Required Default Description
ANTHROPIC_API_KEY Yes — Anthropic API key. Without it the analysis tools return a graceful error; all cached results and the corpus browser still work.
BIBCRIT_API_CAP_USD No 10.0 Monthly Claude spend cap in USD. Resets each calendar month.
SUPABASE_URL No — Supabase project URL. If unset, caching and budget tracking fall back to disk (data/cache/).
SUPABASE_KEY No — Supabase anon or service_role key.
BIBCRIT_ADMIN_KEY No — Arbitrary secret for POST /api/admin/discovery/flag. Without it the endpoint returns 403.

Corpus Ingestion

Run ingestion scripts once to populate data/corpora/. Each script pulls from Text-Fabric or a local file and writes normalized CSV:

python scripts/ingest_mt.py
python scripts/ingest_lxx_rahlfs.py
python scripts/ingest_dss_1qisaa.py
python scripts/ingest_sp.py
python scripts/ingest_gnt.py

Text-Fabric downloads corpora on first run (~several hundred MB). The ETCBC and SP corpora require acceptance of their respective licenses before use.


Pre-caching Featured Passages

The repo ships with analyses for featured passages across all tools. To seed or refresh:

# Seed all missing EN analyses (safe to re-run; skips already-cached)
python scripts/precache_all.py

# Seed a specific tool only
python scripts/precache_all.py --type numerical

# Dry run — show what would be seeded without calling the API
python scripts/precache_all.py --dry-run

# Push disk cache to Supabase
python scripts/push_cache_to_supabase.py

# Generate Spanish translations of all cached EN analyses
python scripts/precache_es.py

URL Routes

Pages

Method Route Description
GET / Home / landing page
GET /divergence MT/LXX Divergence Analyzer
GET /backtranslation Back-Translation Workbench
GET /scribal Scribal Tendency Profiler
GET /numerical Numerical Discrepancy Modeler
GET /dss Ancient Witness Bridge (DSS)
GET /theological Theological Revision Detector
GET /patristic Patristic Citation Tracker
GET /genealogy Manuscript Genealogy
GET /nt-ot NT Use of OT Analyzer
GET /chiasm Chiasm & Literary Structure Detector
GET /source Source Criticism Tool
GET /discovery Discovery — plain-language findings
GET /guide User guide
GET /health Health check ({"status": "ok"})

Analysis API (SSE streaming)

All stream endpoints emit step (progress), done (full JSON result), and error frames.

Method Route Key query param
GET /api/divergence/stream ref — e.g. Isaiah 7:14
GET /api/backtranslation/stream ref
GET /api/scribal/stream book — e.g. Isaiah
GET /api/numerical/stream ref — e.g. Genesis 5
GET /api/dss/stream ref
GET /api/theological/stream ref
GET /api/patristic/stream ref
GET /api/genealogy/stream ref
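Any SSE-capable client can consume these streams. A minimal frame parser, assuming the conventional `event:` / `data:` line format with blank-line separators (an assumption about the wire format, not a documented guarantee):

```python
import json

def parse_sse(raw: str):
    """Parse a raw SSE response body into (event, payload) tuples.

    Assumes standard `event:` / `data:` lines with frames separated by
    blank lines; the actual BibCrit frame layout may differ in detail.
    """
    frames = []
    for block in raw.strip().split("\n\n"):
        event, data = "message", ""
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data += line[len("data:"):].strip()
        frames.append((event, json.loads(data) if data else None))
    return frames

frames = parse_sse(
    'event: step\ndata: {"message": "Loading corpus"}\n\n'
    'event: done\ndata: {"result": {}}\n'
)
```

A client would render step frames as progress updates and treat the done frame's payload as the full analysis result.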

Open Data API

BibCrit's analysis corpus is publicly readable:

GET /api/cache
GET /api/cache?tool=divergence
GET /api/cache?tool=theological&ref=Isaiah+7:14
GET /api/cache?discovery_ready=true&limit=50&offset=0
Param Description Default
tool Filter by tool (divergence, backtranslation, scribal, numerical, dss, theological, patristic, genealogy, nt_ot, chiasm, source) all
ref Case-insensitive substring match on reference all
discovery_ready true / false all
limit Max records per page (max 200) 50
offset Pagination offset 0
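Walking the whole collection is a matter of advancing `offset` until a short page comes back. A sketch with an injected `fetch` callable standing in for an HTTP GET against /api/cache (parameter names follow the table above):

```python
def iter_cache_records(fetch, tool=None, limit=50):
    """Yield every record from /api/cache by paging with limit/offset.

    `fetch(params) -> list[dict]` is an injected HTTP helper, e.g. a thin
    wrapper over requests.get("https://bibcrit.com/api/cache", params=params).
    Stops when a page comes back shorter than `limit`.
    """
    offset = 0
    while True:
        params = {"limit": limit, "offset": offset}
        if tool:
            params["tool"] = tool
        page = fetch(params)
        yield from page
        if len(page) < limit:
            break
        offset += limit
```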

All analysis data is released under Apache 2.0; the ingested corpus data retains its own licenses (see the license note under Data Sources). If you use BibCrit analyses in research, please cite:

Fresco Benaim, J. (2026). BibCrit: AI-assisted biblical textual criticism. ORCID: 0009-0000-2026-0836

Corpus Browser API

Method Route Query params
GET /api/books tradition=MT|LXX
GET /api/chapters book, tradition
GET /api/verses book, chapter, tradition

Export API

Method Route Query params Returns
GET /api/divergence/export/sbl ref SBL-style footnote string per divergence
GET /api/divergence/export/bibtex ref BibTeX @misc entries
GET /api/divergence/export/ris ref RIS records (Zotero / Mendeley import)
GET /api/divergence/export/tei ref TEI XML critical apparatus (<listApp>)
GET /api/scribal/export/sbl book SBL footnote for scribal profile

Discovery API

Method Route Notes
GET /api/discovery/cards offset, limit (max 50)
GET /api/budget Current spend vs. cap
POST /api/admin/discovery/flag ref, ready=true|false, key (admin only)

Hypothesis Voting

Method Route Query params
GET /api/hypothesis/votes ref
POST /api/hypothesis/vote ref, direction=up|down, action=cast|retract
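The cast/retract accounting behind this endpoint can be sketched as a pure function over a hypothesis_votes row (hypothetical helper — the real logic sits behind the API):

```python
def apply_vote(row: dict, direction: str, action: str) -> dict:
    """Apply one cast or retract to a hypothesis_votes row.

    `direction` is "up" or "down" and `action` is "cast" or "retract",
    mirroring the API parameters; counts never drop below zero.
    """
    field = "upvotes" if direction == "up" else "downvotes"
    delta = 1 if action == "cast" else -1
    updated = dict(row)
    updated[field] = max(0, row.get(field, 0) + delta)
    return updated
```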

Internationalization

UI strings live in data/i18n.json. The lang query parameter (?lang=es) selects the active language. state.TranslationProxy (exposed as _t() in templates) falls back to English if a key is missing.

Language Code Analysis cache Status
English en analysis_cache (Supabase) + disk Available
Spanish es analysis_cache_es (Supabase only) Available
Hebrew he Planned (RTL wiring in base.html)
Dutch nl Planned

To add a language: add a top-level key to data/i18n.json matching all en keys, and add a button to the language picker in templates/base.html.
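A quick completeness check for a new language block (hypothetical helper, not shipped with the repo):

```python
import json
from pathlib import Path

def missing_keys(i18n: dict, lang: str) -> set:
    """Return the en keys absent from the given language's top-level block."""
    return set(i18n.get("en", {})) - set(i18n.get(lang, {}))

# Usage against the real file:
# i18n = json.loads(Path("data/i18n.json").read_text(encoding="utf-8"))
# print(sorted(missing_keys(i18n, "he")))
```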

Scholar / Student Mode

A toggle in the navbar (book icon) switches between Scholar mode (default — full technical analysis) and Student mode (plain-language explanations highlighted, technical text hidden). The preference is persisted in localStorage under bibcrit-mode. No server changes are needed; the mode is purely client-side CSS (body.mode-student .technical-only { display: none }).


Supabase Schema

-- English analysis results cache
CREATE TABLE analysis_cache (
  cache_key       TEXT PRIMARY KEY,
  reference       TEXT,
  tool            TEXT,
  prompt_version  TEXT,
  model_version   TEXT,
  data            JSONB,
  cached_at       TIMESTAMPTZ,
  discovery_ready BOOLEAN DEFAULT FALSE
);

-- Spanish analysis results cache
CREATE TABLE analysis_cache_es (
  cache_key       TEXT PRIMARY KEY,
  reference       TEXT,
  tool            TEXT,
  prompt_version  TEXT,
  model_version   TEXT,
  data            JSONB,
  cached_at       TIMESTAMPTZ
);

-- Monthly API spend tracking
CREATE TABLE budget (
  month       TEXT PRIMARY KEY,   -- e.g. '2026-03'
  spend_usd   NUMERIC,
  cap_usd     NUMERIC,
  updated_at  TIMESTAMPTZ
);

-- Hypothesis voting
CREATE TABLE hypothesis_votes (
  reference   TEXT PRIMARY KEY,
  upvotes     INTEGER DEFAULT 0,
  downvotes   INTEGER DEFAULT 0,
  updated_at  TIMESTAMPTZ
);

All tables are optional — the app falls back to disk if Supabase is unavailable (EN only; ES requires Supabase).


Deploy to Render

A render.yaml is included. To deploy:

  1. Push the repo to GitHub.
  2. In the Render dashboard, click New > Blueprint and point it at the repo.
  3. Set ANTHROPIC_API_KEY (and optionally SUPABASE_URL / SUPABASE_KEY / BIBCRIT_ADMIN_KEY) as environment variables.
  4. Deploy.

The default spend cap is $10.00/month. Raise it via BIBCRIT_API_CAP_USD in the Render environment variables.


Roadmap

✅ Completed (v2.5)

  • 11 analysis tools across textual, critical, literary, and discovery categories
  • Chiasm & Literary Structure Detector (/chiasm) — first open tool of its kind
  • Source Criticism Tool (/source) — J/E/D/P attribution with Wellhausen / Friedman / Baden grounding
  • Full OT corpus coverage — MT (ETCBC) + LXX (Vaticanus) across all books
  • 1QIsaᵃ (Dead Sea Scrolls) corpus — DSS Bridge Tool runs on real scroll data
  • Samaritan Pentateuch corpus (5 books, 114,889 words)
  • MorphGNT / SBLGNT corpus (27 NT books, 137,554 words)
  • Peshitta real corpus — ETCBC/peshitta via Text-Fabric (SEDRA); 39 OT books, 308,863 Syriac word tokens
  • NT Use of OT Analyzer — citation-form determination across MT and LXX
  • Spanish translation layer (analysis_cache_es) — full UI + analysis output
  • 141 featured passages pre-cached across all tools (instant load)
  • Persona-based home page — Scholar, PhD Candidate, Student entry points
  • Scholar / Student mode toggle (technical vs. plain-language view)
  • RIS and TEI XML export for Divergence Analysis
  • BiblIndex deep-links in Patristic Citations tool
  • Open Data API

🔜 Phase 1 — Months 1–2: Foundation

Corpus

  • Peshitta real corpus — ETCBC/peshitta via Text-Fabric; 39 OT books, 308,863 Syriac word tokens in pesh_etcbc/
  • MT/LXX expansion — all 39 MT books and 38 LXX books already present (complete)
  • Extended DSS witnesses — 4QSamᵃ (4Q51), 11QPaleoLev (11Q1), 4QDeutn (4Q41) added; note: 1QpHab excluded (ETCBC DSS has no MT-aligned verse coordinates for this scroll)

🔜 Phase 2 — Months 3–4: New Traditions

Corpus

  • Targum corpus (Onkelos + Jonathan) — Aramaic Targumim from CAL / ETCBC; register targum_cal/ tradition
  • Vulgate corpus (Jerome) — Latin OT + NT from CLTK / Open Scriptures; register vulgate_cltk/ tradition
  • LXX variant MSS — add Sinaiticus and Alexandrinus alongside Vaticanus; unlocks three-way LXX manuscript comparison

New tools

  • Targum Comparator (/targum) — MT vs. Targum word-level comparison; expansion types: theological, halakhic, messianic, divine-name substitution (Memra/Shekhina), haggadic; Sperber / McNamara methodology
  • NT Textual Tradition Analyzer (/nt-text) — classify NT variants across Byzantine, Alexandrian, and Western text types; UBS/NA apparatus data + AI analysis; Metzger methodology

Infrastructure

  • Hebrew RTL UI (he locale) — full RTL layout; makes BibCrit usable by Israeli biblical scholars
  • True Anthropic token streaming — replace blocking messages.create() with messages.stream(); sections appear as Claude writes them, 10–20s earlier on first queries

🔜 Phase 3 — Months 5–6: Synthesis

Corpus

  • Second Temple literature — 1 Enoch, Jubilees, Sirach, 4 Ezra, Tobit from CLTK / Open Scriptures; register stl_cltk/ tradition
  • Peshitta NT — Syriac NT (Aramaic Primacy tradition); third NT tradition alongside SBLGNT

New tools (capstone — require all prior corpora)

  • Second Temple Literature Bridge (/stl) — map allusions from 1 Enoch, Jubilees, Sirach, 4 Ezra to canonical texts; critical for DSS and NT intertextuality; Nickelsburg / VanderKam / Collins methodology
  • Intertextuality Mapper (/intertextuality) — full allusion network for any passage: inner-biblical allusions, NT echoes, patristic citations, DSS parallels, Second Temple parallels; exportable as JSON-LD / RDF; Hays / Beale / Fishbane methodology

Infrastructure

  • Full open API v1 — versioned endpoints, API key management, rate limiting, Swagger docs at /api/docs
  • Dutch UI (nl) and Portuguese UI (pt)
  • JOSS paper v3 + Zenodo DOI update — reflect 15 tools and 9 corpus traditions

v3.0 target state

Metric v2.5 now v3.0 (+6 months)
Analysis tools 11 15
Corpus traditions 5 9
UI languages 2 (EN, ES) 5 (+ HE, NL, PT)
First-in-world open tools 5 11

License

Apache 2.0 — see LICENSE.

Corpus data retains its own licenses: ETCBC/BHSA and ETCBC/DSS are CC-BY-NC 4.0; SP (dt-ucph/sp) and SBLGNT have their own terms. See each upstream repository for details.


Credits

Built by Jossi Fresco. Analysis powered by Claude (Anthropic). Corpus data: ETCBC (MT, DSS), dt-ucph/sp (Samaritan Pentateuch), Rahlfs (LXX), SBLGNT (GNT).