Lacuna

Women's health M&A diligence stack — verified deals, genomics governance, cited analytics

Curated dataset · n=58 verified deals · Not live market data · Scores are descriptive, not forecasts.

The live app reads src/data/dataset.verified.json by default. Server-side LLM calls use Vercel AI Gateway only. TensorFlow code is quarantined (not in the app). See MODEL_CARD.md before citing any score.

Overview

Lacuna is an investment-research stack: a diligence infrastructure prototype built around a curated, source-linked snapshot of women's health M&A (58 verified deals), rendered as D3 network views and descriptive analytics with published methodology.

Claim	Reality
Deal data	Static `dataset.verified.json` (manual verification from SEC, press, filings)
Scores & "predictors"	Deterministic rules and small-n statistics — MODEL_CARD.md
"ML" / TensorFlow	Quarantined under `src/lib/ml/_quarantine/` — not imported by the app
Server LLM	INFERENCE.md — Vercel AI Gateway (+ OpenAI fallback for local dev)
Clinical trials panel	Live ClinicalTrials.gov search; M&A panels still use the curated dataset
Production intelligence	No — not PitchBook, not a data SLA, not investment advice

Open source under BSL 1.1 for corporate VC diligence workflows, portfolio review, and self-hosted exploration. Commercial competitive products need a separate license; contact mps5cy@virginia.edu.

Portfolio project by Mae Kass.

Deployment: The analytics product runs on Vercel (this repo). A separate Framer site is for brand and narrative only, with one primary CTA into the live demo — see SITE_ARCHITECTURE.md and the framer/ build kit.

What is Lacuna?

An open-source diligence prototype for corporate VC and healthcare investors exploring verified women's health / FemTech M&A:

D3.js force-directed acquirer–target graphs
Deal flow & valuation charts (curated counts and disclosed values)
Descriptive scoring (factor weights, cosine similarity, k-means — no trained forecast models in the UI)
ClinicalTrials.gov lookup (live API; separate from deal JSON)
Health-equity context with cited disparity statistics (descriptive, not allocation advice)

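Every analytical panel in the app shows the provenance line that opens this README ("Curated dataset · n=58 verified deals · Not live market data · Scores are descriptive, not forecasts.").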

Live Demo

lacuna-maekass.vercel.app

Resource	Link
Application	lacuna-maekass.vercel.app
Repository	github.com/maekass/Lacuna
Methodology	docs/MODEL_CARD.md
License	BSL 1.1 → Apache 2.0 May 2030

Core Features

Verified deal explorer

58 verified acquisitions (fertility, oncology, diagnostics, menopause, pelvic health, precision medicine)
Acquirers include Hologic, KKR, Pfizer, Gilead, Boston Scientific, and others named in sources
Dataset v5 · updated per provenance.lastUpdated in JSON
Sources: SEC EDGAR, press releases, investor relations (see DATA_CURATION_CHECKLIST.md)

Interactive network (`ForceNetwork.tsx`)

D3 force-directed graph: sector colors, deal-type edges, valuation-scaled nodes. Methodology: NETWORK_ANALYSIS_METHODOLOGY.md.
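To make "valuation-scaled nodes" and "deal-type edges" concrete, here is a minimal sketch of turning deal records into the node and link arrays that a force layout typically consumes. It is not the code in `ForceNetwork.tsx`: the field names (`acquirer`, `target`, `sector`, `dealType`, `valueUsdM`), the square-root scaling, and the minimum radius are all assumptions made for illustration.

```typescript
// Hypothetical deal shape; the real schema lives in dataset.verified.json.
interface DealEdge {
  acquirer: string;
  target: string;
  sector: string;
  dealType: string;
  valueUsdM?: number; // disclosed value in USD millions; undefined when undisclosed
}

interface GraphNode {
  id: string;
  kind: "acquirer" | "target";
  sector: string | undefined;
  radius: number;
}

interface GraphLink {
  source: string;
  target: string;
  dealType: string;
}

const MIN_RADIUS = 4; // assumed floor so undisclosed deals stay visible

// Square-root scaling keeps node area roughly proportional to deal value.
function radiusFor(valueUsdM: number | undefined): number {
  return valueUsdM === undefined ? MIN_RADIUS : MIN_RADIUS + Math.sqrt(valueUsdM) / 4;
}

function buildGraph(deals: DealEdge[]): { nodes: GraphNode[]; links: GraphLink[] } {
  const nodes = new Map<string, GraphNode>();
  const links: GraphLink[] = [];
  for (const deal of deals) {
    // An acquirer can appear in several deals; keep a single node per name.
    if (!nodes.has(deal.acquirer)) {
      nodes.set(deal.acquirer, {
        id: deal.acquirer,
        kind: "acquirer",
        sector: undefined,
        radius: MIN_RADIUS,
      });
    }
    nodes.set(deal.target, {
      id: deal.target,
      kind: "target",
      sector: deal.sector,
      radius: radiusFor(deal.valueUsdM),
    });
    links.push({ source: deal.acquirer, target: deal.target, dealType: deal.dealType });
  }
  return { nodes: Array.from(nodes.values()), links };
}
```

The resulting `nodes` and `links` use string ids for `source` and `target`, which is the shape a D3 link force can resolve when given an id accessor.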

Deal flow (`DealFlowChart.tsx`)

Year-over-year counts from verified announcedDate — animated bars, no synthetic deal generator.
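A minimal sketch of the per-year counting described above, assuming `announcedDate` is an ISO 8601 string such as `2021-03-15` (the format is an assumption; check the dataset schema). The function name `countDealsByYear` is hypothetical and is not taken from `DealFlowChart.tsx`.

```typescript
// Assumed record shape: only the verified announcement date matters here.
interface DatedDeal {
  announcedDate: string; // assumed ISO 8601, e.g. "2021-03-15"
}

function countDealsByYear(deals: DatedDeal[]): Array<{ year: number; count: number }> {
  const counts = new Map<number, number>();
  for (const deal of deals) {
    const match = /^(\d{4})-\d{2}-\d{2}/.exec(deal.announcedDate);
    if (!match) continue; // skip malformed dates rather than guessing a year
    const year = Number(match[1]);
    counts.set(year, (counts.get(year) ?? 0) + 1);
  }
  // Sort ascending so bars render in chronological order.
  return Array.from(counts, ([year, count]) => ({ year, count })).sort(
    (a, b) => a.year - b.year,
  );
}
```

Years with no verified deals are simply absent from the output, which matches the "no synthetic deal generator" stance: nothing is interpolated.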

Valuation matrix (`ValuationMatrix.tsx`)

Sector × stage heatmap using disclosed values only; cells show company counts and averages.
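One way to read "disclosed values only" is that every company counts toward its cell, but only companies with a disclosed value contribute to the average. The sketch below follows that reading, which is an assumption, as are the field names and the `sector|stage` key format; it is not the code in `ValuationMatrix.tsx`.

```typescript
// Hypothetical row shape for illustration.
interface CompanyRow {
  sector: string;
  stage: string;
  valueUsdM?: number; // undefined when no value was disclosed
}

interface MatrixCell {
  count: number; // all companies in the cell
  disclosedCount: number; // companies with a disclosed value
  averageUsdM: number | undefined; // undefined when nothing was disclosed
}

function buildValuationMatrix(rows: CompanyRow[]): Map<string, MatrixCell> {
  const sums = new Map<string, { count: number; disclosed: number; total: number }>();
  for (const row of rows) {
    const key = `${row.sector}|${row.stage}`;
    const acc = sums.get(key) ?? { count: 0, disclosed: 0, total: 0 };
    acc.count += 1;
    if (row.valueUsdM !== undefined) {
      acc.disclosed += 1;
      acc.total += row.valueUsdM;
    }
    sums.set(key, acc);
  }
  const cells = new Map<string, MatrixCell>();
  sums.forEach((acc, key) => {
    cells.set(key, {
      count: acc.count,
      disclosedCount: acc.disclosed,
      // Never divide by zero or substitute 0 for "unknown".
      averageUsdM: acc.disclosed > 0 ? acc.total / acc.disclosed : undefined,
    });
  });
  return cells;
}
```

Keeping `averageUsdM` as `undefined` rather than `0` lets the heatmap render an explicit "undisclosed" state instead of a misleading low value.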

Exit-likelihood leaderboard (`QuantValuationPanel.tsx`)

New: Heuristic valuation and exit-likelihood section with:

ValuationEngine — bounded comparable multiples (EV/Revenue, EV/EBITDA) with uncertainty disclosures
AcquisitionPredictor — sector-stage acquisition probability estimates (15/75 coverage noted)
HealthImpactModeler — lives-saved modeling with Cohen's d bounds (not a rate)
PortfolioOptimizer — stage-varying risk-adjusted ROI optimizer
Verified-fields-only adapter — adaptQuantCompany uses only validated dataset fields; absent inputs remain undefined per provenance rules

See MODEL_CARD.md for methodology and caveats.
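The "absent inputs remain undefined" rule can be illustrated with a small adapter. This is a sketch under stated assumptions, not the repository's `adaptQuantCompany`: the types `RawCompany` and `QuantInput`, the field names, and the helper functions are hypothetical.

```typescript
// Hypothetical raw record: fields may be missing or null in curated JSON.
interface RawCompany {
  name: string;
  revenueUsdM?: number | null;
  ebitdaUsdM?: number | null;
}

// Adapter output: each numeric input is a finite number or undefined.
interface QuantInput {
  name: string;
  revenueUsdM: number | undefined;
  ebitdaUsdM: number | undefined;
}

function finiteOrUndefined(value: number | null | undefined): number | undefined {
  return typeof value === "number" && Number.isFinite(value) ? value : undefined;
}

// No defaults, no imputation: a missing field stays missing.
function toQuantInput(raw: RawCompany): QuantInput {
  return {
    name: raw.name,
    revenueUsdM: finiteOrUndefined(raw.revenueUsdM),
    ebitdaUsdM: finiteOrUndefined(raw.ebitdaUsdM),
  };
}

// EV/Revenue is only defined when revenue is present and positive.
function evToRevenue(enterpriseValueUsdM: number, input: QuantInput): number | undefined {
  const revenue = input.revenueUsdM;
  return revenue !== undefined && revenue > 0 ? enterpriseValueUsdM / revenue : undefined;
}
```

Propagating `undefined` through the multiple means a downstream panel has to handle the missing case explicitly rather than display a number derived from a placeholder.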

Descriptive analytics (heuristics, not predictive ML)

Curated dataset · n=58 verified deals · Not live market data · Scores are descriptive, not forecasts.

Acquisition likelihood indicators (`ExitPredictor.tsx`)

Transparent factor scoring for non-acquired companies in the verified set. Fixed weights, full disclosure in UI and MODEL_CARD.md. Not a predictive model; no TensorFlow.
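Fixed-weight factor scoring amounts to a weighted sum in which each factor's contribution stays visible. The sketch below shows the general shape only; the factor names and weights are invented for illustration, and the real ones are documented in MODEL_CARD.md.

```typescript
// Hypothetical factors and weights, for illustration only.
const WEIGHTS = {
  lateStage: 0.5,
  activeSector: 0.25,
  strategicPartner: 0.25,
} as const;

type Factor = keyof typeof WEIGHTS;
type FactorValues = Record<Factor, number>; // each value expected in [0, 1]

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

function scoreCompany(values: FactorValues): {
  score: number;
  contributions: Record<Factor, number>;
} {
  const contributions: Record<Factor, number> = {
    lateStage: 0,
    activeSector: 0,
    strategicPartner: 0,
  };
  let score = 0;
  for (const factor of Object.keys(WEIGHTS) as Factor[]) {
    // Each factor contributes weight * value; nothing is learned from data.
    const part = WEIGHTS[factor] * clamp01(values[factor]);
    contributions[factor] = part;
    score += part;
  }
  return { score, contributions };
}
```

Returning the per-factor `contributions` alongside the total is what makes the score auditable: a reader can see exactly which inputs moved it.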

Company similarity (`CompanySimilarity.tsx`)

8-D feature vectors, inline cosine similarity — "companies like this" for exploration.
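Cosine similarity compares the direction of two feature vectors and ignores their magnitude. The sketch below is a generic implementation that works for any equal-length vectors, including 8-D ones. The `Company` shape and the `mostSimilar` helper are hypothetical, and which eight features the app uses is not specified here.

```typescript
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // Similarity is undefined for a zero vector; this sketch returns 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical record shape for the "companies like this" lookup.
interface Company {
  name: string;
  features: number[];
}

function mostSimilar(
  target: Company,
  others: Company[],
  topK: number,
): Array<{ name: string; similarity: number }> {
  return others
    .filter((c) => c.name !== target.name)
    .map((c) => ({ name: c.name, similarity: cosineSimilarity(target.features, c.features) }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topK);
}
```

Because the measure ignores magnitude, features on very different scales should be normalized first; otherwise the largest-valued feature dominates the angle.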

Clustering (`ClusteringAnalysis.tsx`)

k-means on valuation × employees — descriptive segments (Emerging / Growth / Late-stage labels).
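For readers unfamiliar with k-means, here is a minimal 2-D sketch of Lloyd's algorithm. It is not the code in `ClusteringAnalysis.tsx`. Seeding the centroids with the first k points is a simplification chosen to keep the example deterministic, and the sketch assumes both axes have already been put on a comparable scale (for example by min-max normalization), since raw valuation and headcount differ by orders of magnitude.

```typescript
type Point2D = [number, number]; // e.g. [normalized valuation, normalized employees]

function sqDist(a: Point2D, b: Point2D): number {
  const dx = a[0] - b[0];
  const dy = a[1] - b[1];
  return dx * dx + dy * dy;
}

function nearest(p: Point2D, centroids: Point2D[]): number {
  let best = 0;
  let bestDist = Infinity;
  centroids.forEach((c, i) => {
    const d = sqDist(p, c);
    if (d < bestDist) {
      bestDist = d;
      best = i;
    }
  });
  return best;
}

function kMeans(
  points: Point2D[],
  k: number,
  maxIter = 50,
): { centroids: Point2D[]; labels: number[] } {
  if (k < 1 || k > points.length) throw new Error("k must be between 1 and points.length");
  // Deterministic seeding: the first k points (a simplification for this sketch).
  let centroids: Point2D[] = points.slice(0, k).map((p): Point2D => [p[0], p[1]]);
  let labels = points.map((p) => nearest(p, centroids));
  for (let iter = 0; iter < maxIter; iter++) {
    // Update step: move each centroid to the mean of its assigned points.
    const sums = centroids.map(() => ({ x: 0, y: 0, n: 0 }));
    points.forEach((p, i) => {
      const s = sums[labels[i] ?? 0];
      if (s) {
        s.x += p[0];
        s.y += p[1];
        s.n += 1;
      }
    });
    const updated = centroids.map((c, j): Point2D => {
      const s = sums[j];
      return s && s.n > 0 ? [s.x / s.n, s.y / s.n] : c; // keep empty clusters in place
    });
    centroids = updated;
    // Assignment step: relabel points; stop once nothing changes.
    const next = points.map((p) => nearest(p, updated));
    const changed = next.some((label, i) => label !== labels[i]);
    labels = next;
    if (!changed) break;
  }
  return { centroids, labels };
}
```

With n=58 the resulting segments are sensitive to seeding and to outliers, which is one reason to treat labels such as Emerging, Growth, and Late-stage as descriptive groupings only.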

Optional server narratives (INFERENCE.md)

UI blurbs via POST /api/ai/insights → Vercel AI Gateway (anthropic/claude-sonnet-4 slug).
Exploratory copy only — heuristic scores on the curated dataset remain authoritative.

Health Equity & Black Women's Health

Descriptive context on disease areas with documented disparities and public market-size estimates — for learning, not buy/sell recommendations:

Disease	Disparity (cited in docs)	Public market-size estimate
Maternal Health	Higher mortality disparity	$12B
Uterine Fibroids	High prevalence	$34B
Lupus	Higher prevalence	$8B
Sickle Cell Disease	Population concentration	$5B
Cardiovascular Disease	Higher mortality	$15B

See OAIS_METHODOLOGY.md for scoring limits.

Clinical Trials Integration

Live: /api/clinical-trials → ClinicalTrials.gov API v2 (search, batch lookup)
Curated M&A: unchanged — still dataset.verified.json

Do not conflate live trial search volume with verified deal coverage.
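The internals of the app's `/api/clinical-trials` route are not shown in this README. As a rough sketch of what a query against the upstream ClinicalTrials.gov API v2 could look like, the helper below builds a studies search URL from a condition name. The function name `buildStudiesUrl` is hypothetical, and the page-size cap of 1000 is an assumption to check against the API documentation.

```typescript
const CTGOV_STUDIES_URL = "https://clinicaltrials.gov/api/v2/studies";

// Builds a search URL for the public v2 studies endpoint.
// `query.cond` searches by condition; `pageSize` limits results per page.
function buildStudiesUrl(condition: string, pageSize = 10): string {
  // Assumed bounds: at least 1, at most 1000 results per page.
  const size = Math.min(Math.max(Math.trunc(pageSize), 1), 1000);
  const params = [`query.cond=${encodeURIComponent(condition)}`, `pageSize=${size}`];
  return `${CTGOV_STUDIES_URL}?${params.join("&")}`;
}
```

A caller would pass the resulting URL to `fetch` and read the list of studies from the JSON response. Keeping this separate from the deal loader reinforces the boundary above: trial search results are live and unverified, while deal records come only from the curated file.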

Genomics variant store (optional)

Large VCF/gVCF call sets use a two-tier layout (off by default on Vercel):

Tier	Technology	Contents
Object storage	Local `data/variants/` or S3	Multi-GB raw VCF blobs
Variant catalog	ClickHouse	Callset metadata + queryable variant summaries

Dashboard: VariantCallsetBrowser — browse callsets, filter by gene, presigned S3 download when configured
APIs: /api/genomics/callsets, /api/genomics/variants, /api/genomics/callsets/{id}/object
Ingest: npm run clickhouse:ingest-vcf — stream parser → object storage → batch INSERT
Docs: GENOMICS_VARIANT_STORE.md

docker compose up -d clickhouse
# .env.local: LACUNA_VARIANT_STORE=clickhouse, CLICKHOUSE_URL=http://lacuna:lacuna@localhost:8123
npm run clickhouse:migrate && npm run clickhouse:seed
npm run dev

This is an infrastructure demo with honest provenance labels, not clinical-grade genomics infrastructure.
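The ingest path above (stream parser, then batch INSERT) can be sketched in two small pieces: parsing one VCF data line into a summary record, and grouping records into batches. This is an illustration, not the repository's ingest script. The `VariantSummary` shape and both function names are hypothetical; the column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) follows the VCF format.

```typescript
// Hypothetical summary row; the real catalog schema is in GENOMICS_VARIANT_STORE.md.
interface VariantSummary {
  chrom: string;
  pos: number;
  id: string | undefined;
  ref: string;
  alts: string[];
  filter: string;
}

// Parses one tab-separated VCF data line; returns undefined for headers and bad rows.
function parseVcfLine(line: string): VariantSummary | undefined {
  if (line.length === 0 || line.startsWith("#")) return undefined;
  const [chrom, posText, id, ref, alt, , filter] = line.split("\t");
  if (chrom === undefined || posText === undefined || ref === undefined || alt === undefined) {
    return undefined;
  }
  const pos = Number(posText);
  if (!Number.isInteger(pos)) return undefined;
  return {
    chrom,
    pos,
    id: id === undefined || id === "." ? undefined : id, // "." means no identifier
    ref,
    alts: alt.split(","), // multi-allelic sites list ALT alleles comma-separated
    filter: filter ?? ".",
  };
}

// Groups rows so each INSERT carries many records instead of one.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("batch size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}
```

Batching matters for a columnar store such as ClickHouse, which generally handles fewer, larger inserts better than many single-row writes. A streaming ingest would fill and flush one batch at a time rather than hold the whole file in memory as this simplified `toBatches` does.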

Academic Frameworks

The docs/ directory documents six frameworks (causal DAG, fairness audit, network concentration, and others), each with explicit small-n limits. Each states what cannot be claimed with n=58 deals; see the methodology files linked from the app.

Typography

The live app loads Playfair Display (Didone serif) via next/font/google and applies it app-wide — body copy, headings, and font-mono utilities share the same family for a high-contrast editorial look.

GitHub does not load custom web fonts. This README uses a Didone fallback stack (Didot, Bodoni MT, Georgia) so the page reads closer to the product on github.com. Only the live demo renders true Playfair Display.

Technology Stack

Layer	Used in production UI
Playfair Display (`next/font/google`)	App-wide Didone serif typography
Next.js 16, React 19, Tailwind v4	App shell
D3.js v7, Framer Motion	Visualization
simple-statistics	Descriptive stats / similarity / quant engine
Verified JSON (`getVerifiedDataset()`)	Default data path; static import for Vercel serverless
PostgreSQL	Optional `LACUNA_DATA_MODE=db`
ClickHouse + S3/local object storage	Optional variant call-set catalog (`LACUNA_VARIANT_STORE=clickhouse`)
Vercel AI Gateway + AI SDK	Optional narratives + SEC classification (INFERENCE.md)
TensorFlow.js	Quarantined — devDependency for Vitest only
Deno (CI)	`deno fmt` and `deno lint` in GitHub Actions

CI Status: deno fmt, deno lint, eslint, vitest (297 tests), next build + tsc all green on main.

Quick Start

git clone https://github.com/maekass/Lacuna.git
cd Lacuna
npm install
npm run dev
npm run validate:dataset
npm run infra:check
npm test

Open http://localhost:3000. Data loads from src/data/dataset.verified.json unless LACUNA_DATA_MODE=db is set and Postgres is provisioned.
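The fallback rule in the previous paragraph can be sketched as a small pure function. `LACUNA_DATA_MODE` comes from this README; the `DATABASE_URL` variable name and the `resolveDataMode` function are assumptions made for illustration, and the actual check in the app may differ.

```typescript
type DataMode = "json" | "db";

// `env` is passed in so the function stays testable; in Node it would be process.env.
function resolveDataMode(env: Record<string, string | undefined>): DataMode {
  const wantsDb = env["LACUNA_DATA_MODE"] === "db";
  // DATABASE_URL is an assumed variable name, not confirmed by this README.
  const dbUrl = env["DATABASE_URL"];
  const hasDb = typeof dbUrl === "string" && dbUrl !== "";
  // Both conditions must hold; otherwise fall back to the verified JSON file.
  return wantsDb && hasDb ? "db" : "json";
}
```

Defaulting to `"json"` whenever either condition fails keeps a misconfigured deployment on the curated dataset instead of failing at request time.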

Optional local Postgres: docker compose up -d → copy .env.example to .env.local → npm run db:migrate && npm run db:import. See INFRASTRUCTURE.md.

Optional variant store: docker compose up -d clickhouse → set LACUNA_VARIANT_STORE=clickhouse → npm run clickhouse:migrate && npm run clickhouse:seed. See GENOMICS_VARIANT_STORE.md.

Data Curation

Manual verification — no synthetic maDeals. Workflow: DATA_CURATION_CHECKLIST.md, npm run validate:dataset, optional npm run sec:scan.

Documentation

Doc	Purpose
MODEL_CARD.md	Start here — what each score is and is not
INFERENCE.md	Server-side LLM (AI Gateway)
DATA_CURATION_CHECKLIST.md	Schema, validation, staging
NETWORK_ANALYSIS_METHODOLOGY.md	Graph metrics, small-n
OAIS_METHODOLOGY.md	Health impact scoring limits
INFRASTRUCTURE.md	CI, Vercel, Postgres, cron, `/api/health`
PERFORMANCE.md	Bundle, caching, probe split, fan-out limits
GENOMICS_VARIANT_STORE.md	ClickHouse + object storage for large VCF catalogs
MONITORING.md	Uptime URL: `/api/health` only (not `/ready`)
PRODUCTION_SETUP.md	Vercel env vars and migrations
SEC_INGESTION.md	SEC EDGAR cron pipeline
SITE_ARCHITECTURE.md	Vercel product vs Framer marketing (no analytics in Framer)
framer/BUILD_GUIDE.md	Framer marketing site — copy, tokens, HTML prototype
AGENTS.md	Contributor conventions

License

BSL 1.1: production use for research and education is allowed; Competitive Offerings (commercial women's-health M&A intelligence products) require a separate agreement. The license converts to Apache 2.0 in May 2030.

Contact mps5cy@virginia.edu for commercial licensing.

Author

Mae Kass — open investment-research tools for women's health data literacy and honest analytics.
