Where Do Your Citations Come From?
A free, open-source, no-code, and auditable tool for citation network decomposition with complementary BARON and HEROCON scores.
Citation-Constellation decomposes a researcher's citation profile by network proximity between citing and cited authors. Standard metrics like the h-index treat all citations as equal — a citation from an independent researcher carries the same weight as one from a direct co-author or departmental colleague. This tool makes the invisible structure visible.
Enter an ORCID or OpenAlex ID and receive a full multi-layer citation network decomposition within minutes. No installation, no registration, no payment required.
Try it now: citation-constellation.serve.scilifelab.se
A strict binary metric. Every citation is either inside the researcher's network (0 points) or outside (1 point). No partial credit.
BARON = (External citations ÷ Classifiable citations) × 100
A graduated weighted metric. In-group citations receive partial credit based on relationship proximity.
HEROCON = (Σ weights ÷ Classifiable citations) × 100
Gap = HEROCON − BARON
| Gap Size | Interpretation |
|---|---|
| < 3% | Most citations are fully external |
| 3–10% | Meaningful inner-circle contribution, external impact dominant |
| > 10% | Significant portion of impact from the collaborative network |
The gap is neither good nor bad. It characterizes citation structure, not quality.
Important: BARON and HEROCON measure citation network structure, not research quality, impact, or integrity. They should not be used for hiring, promotion, or funding decisions.
Citations are classified using a strict priority order (first match wins):
| Priority | Classification | Phase | Description | BARON | HEROCON Weight |
|---|---|---|---|---|---|
| 1 | SELF | 1 | Researcher is an author on the citing work | 0 | 0.0 |
| 2 | DIRECT_COAUTHOR | 2 | Citing author shared ≥1 publication with target | 0 | 0.2 |
| 3 | TRANSITIVE_COAUTHOR | 2 | Co-author of a co-author | 0 | 0.5 |
| 4 | SAME_DEPT | 3 | Same department, no co-authorship | 0 | 0.1 |
| 5 | SAME_INSTITUTION | 3 | Same university, different department | 0 | 0.4 |
| 6 | SAME_PARENT_ORG | 3 | Shared parent in ROR hierarchy | 0 | 0.7 |
| 7 | UNKNOWN | 3 | Insufficient metadata | excluded | excluded |
| 8 | EXTERNAL | 3 | No detected relationship | 1 | 1.0 |
HEROCON weights are experimental heuristics, not empirically calibrated values. Custom weights can be provided via CLI or the web interface.
Each phase adds a detection layer. Later phases incorporate all earlier layers.
| Phase | Detection Layer | Classes | Status |
|---|---|---|---|
| 1 | Self-citation (author ID match) | 2 | ✅ Available |
| 2 | Co-author network (BFS graph traversal, depth 1–3, temporal decay) | 4 | ✅ Available |
| 3 | Temporal affiliation matching (ROR hierarchy) | 7 | ✅ Available |
| 4 | Venue governance (local LLM extraction of editorial boards) | 11 | 🔧 Building |
| 5 | Field normalization and percentile ranks | — | 📋 Planned |
No installation required. Go to citation-constellation.serve.scilifelab.se and:
- Enter an ORCID (e.g.,
0000-0000-0000-0000) or OpenAlex ID (e.g.,A0000000000). Full URLs are accepted. - Optionally set a career start year, adjust co-author graph depth (default: 2), enable manual validation of flagged papers, or upload custom HEROCON weights.
- Click Run Analysis. Expect 1–4 minutes for ~50–100 publications.
- Explore results: score summary, classification donut chart, co-author network graph, career trajectory, full citation table with per-citation rationale.
- Download the JSON audit report for offline analysis or comparison.
Rate limit: 10 analyses per hour per session. Visualization of existing audit files has no limit.
Comparing researchers: Upload multiple audit JSON files in the View Existing Audits tab for side-by-side comparison tables and overlaid trajectory charts (up to 115 simultaneous comparisons).
git clone https://github.com/citation-cosmograph/citation-constellation.git
cd citation-constellation
pip install -r requirements.txt
# Full analysis (recommended)
python phase3.py --orcid 0000-0000-0000-0000 --trajectory
# Also accepts OpenAlex IDs
python phase3.py --openalex-id A0000000000 --trajectoryRequirements: Python 3.11+. No database needed for Phases 1–3.
# Prebuilt image (recommended)
docker pull mahbub1969/citation-constellation:v4
docker run --rm -it -p 7860:7860 mahbub1969/citation-constellation:v4
# Or build from source
docker build -t citation-constellation:v0.3 .
docker run --rm -it -p 7860:7860 citation-constellation:v0.3Open http://localhost:7860.
cd citation-constellation/
pip install -r app/requirements.txt
python app/main.py
# Open http://localhost:7860| Phase | Script | What It Detects |
|---|---|---|
| 1 | phase1.py |
Self-citations only → BARON v0.1 |
| 2 | phase2.py |
+ Co-author network → BARON v0.2 + HEROCON v0.2 |
| 3 | phase3.py |
+ Institutional affiliations → BARON v0.3 + HEROCON v0.3 |
Phase 3 is recommended for the most comprehensive analysis.
| Flag | Description |
|---|---|
--orcid ID |
Researcher ORCID identifier |
--openalex-id ID |
OpenAlex author ID |
--since YEAR |
Exclude works published before this year |
--depth 1|2|3 |
Co-author graph depth (default: 2) |
--trajectory / -t |
Compute cumulative career trajectory (required for trajectory chart in web viz) |
--confirm / -c |
Interactive review of ORCID-flagged works before scoring |
--herocon-weights FILE |
Custom HEROCON weight configuration (JSON) |
--export FILE |
Export summary to JSON |
--no-orcid-check |
Skip ORCID cross-validation |
--no-audit |
Skip audit file generation (not recommended) |
--verbose / -v |
Verbose output |
python phase3.py \
--orcid 0000-0000-0000-0000 \
--since 2010 \
--depth 2 \
--trajectory \
--confirm \
--herocon-weights weights.json \
--export results.jsonCreate a JSON file with any subset of classifications. Unspecified classifications use defaults.
{
"SELF": 0.0,
"DIRECT_COAUTHOR": 0.3,
"TRANSITIVE_COAUTHOR": 0.6,
"SAME_DEPT": 0.2,
"SAME_INSTITUTION": 0.5,
"SAME_PARENT_ORG": 0.8,
"EXTERNAL": 1.0
}python phase3.py --orcid 0000-0000-0000-0000 --confirmThe tool displays ORCID-flagged works with reasons and prompts for a decision. Input: all (exclude all), none (keep all), 1,3,5 (exclude specific items), or 1-3,5 (ranges).
Every analysis output begins with a prominent ethical disclaimer, reinforcing that BARON and HEROCON measure citation network structure, not research quality, impact, or integrity.
The score panel presents BARON and HEROCON scores alongside key summary statistics: total citations, classifiable citations, the BARON–HEROCON gap, and a data quality reliability rating.
The donut chart provides a proportional breakdown of citation origins across all classification categories, with BARON and HEROCON scores displayed in the center.
Each citation category with its count, percentage of classifiable citations, and the HEROCON weight applied.
Interactive force-directed network. The target researcher appears as a gold node, direct co-authors in crimson (sized by shared publications), and transitive co-authors in blue. Hover any node for details. Networks exceeding 150 nodes are automatically pruned.
Cumulative BARON and HEROCON scores over time as dual lines, with a shaded region representing the gap. Stacked bars beneath show annual citation volume.
Every individual citation with its classification, confidence level, detection phase, and a human-readable rationale. This is the audit trail made visible — any classification can be inspected and contested.
Side-by-side structural analysis of multiple researchers from uploaded audit files. Researcher names below are anonymized.
Both CLI and web interface produce identical scores and audit trails. The web interface additionally renders:
- Score Summary — BARON, HEROCON, gap, total/classifiable citations, reliability rating
- Classification Donut Chart — Proportional breakdown with scores in center
- Co-Author Network Graph — Interactive force-directed graph (gold = target, crimson = direct co-authors, blue = transitive)
- Career Trajectory Chart — Dual-line BARON/HEROCON over time with shaded gap area
- Full Citation Table — Every citation with classification, confidence, phase, and human-readable rationale
- Comparison View — Side-by-side analysis from multiple uploaded audit files
Every run generates a timestamped JSON audit file in ./audits/ documenting:
- Researcher profile and all works analyzed
- Every citation link and classification decision with human-readable rationale
- Co-author graph (nodes, edges, distances, collaboration strength)
- Affiliation timeline and ROR institution hierarchy
- ORCID validation results
- Score breakdown and career trajectory data
CLI-generated audit files can be uploaded to the web interface for visualization without re-running computation.
Note: Include
--trajectory/-twhen generating audit files via CLI if you want the career trajectory chart in the web interface. All other visualizations work regardless.
Citations with insufficient metadata are classified as UNKNOWN and excluded from both BARON and HEROCON denominators, preventing missing data from artificially inflating or deflating scores.
| Quality (% classifiable) | Rating | Interpretation |
|---|---|---|
| ≥ 85% | HIGH | Scores are trustworthy |
| ≥ 70% | MODERATE | Reasonable, treat with care |
| ≥ 50% | LOW | Significant data gaps |
| < 50% | VERY LOW | Too much missing data for reliable scoring |
The tool cross-validates OpenAlex works against ORCID records using a smart two-mode system:
- High ORCID coverage (≥70%): ORCID is a hard filter — only works in both ORCID and OpenAlex enter scoring.
- Low ORCID coverage (<70%): All OpenAlex works kept, but affiliation anomaly detection flags suspicious works.
If the publication span exceeds 25 years, a warning suggests using --since YEAR.
| Depth | In-group definition | Best for |
|---|---|---|
| 1 | Direct co-authors only | Large, loosely connected fields |
| 2 | + Co-authors of co-authors (default) | Most researchers |
| 3 | + Three hops | Small, tightly-knit fields |
Higher depth → larger in-group → lower BARON score.
OpenAlex occasionally merges works from different researchers with similar names. Signs: publication span warning (>25 years), unexpected institutions, inflated publication count.
Solutions:
- Use
--since YEARto exclude pre-career publications - Use
--confirmto interactively review and exclude misattributed works - Ensure the researcher's ORCID profile is up to date (≥70% coverage enables hard filtering)
pulsar→astrolabe→constellationthe signal — the instrument — the map
| Component | Purpose | Links |
|---|---|---|
| Citation-Constellation | BARON & HEROCON scoring | Live Tool · Source |
| Citation-Pulsar-Helm | LLM inference on Kubernetes | Source |
| Citation-Astrolabe | Venue governance database | Source |
github.com/citation-cosmograph
- Language: Python 3.11+
- CLI: Typer, Rich, httpx
- Web Interface: Gradio
- Phase 4 (in development): SQLite/PostgreSQL, Qwen 3.5 8B (Q4_K_M via llama.cpp on Kubernetes)
- Data Sources (all free and open): OpenAlex (260M+ works, 100M+ authors, 2.8B+ citation links), ORCID Public API v3.0, ROR API v2, Cloudflare Crawl API (Phase 4)
Performance: For ~80 publications and ~1,500 citations, Phase 3 completes in under 90 seconds with ~100–150 OpenAlex API calls.
Detects citations from venues where the researcher or their network holds editorial/committee roles. Uses a locally deployed LLM (Qwen 3.5 8B) to extract structured governance data from venue websites. Four new classification types: VENUE_SELF_GOVERNANCE (weight 0.05), VENUE_EDITOR_COAUTHOR (0.15), VENUE_EDITOR_AFFIL (0.3), VENUE_COMMITTEE (0.4).
Field-normalized percentile ranks against peer cohorts (same field, career length, publication volume). Confidence intervals via bootstrap resampling. REST API for programmatic access.
Sensitivity analysis of HEROCON weights, UNKNOWN imputation strategies, ground-truth cross-validation against citation motivation surveys, rolling temporal trajectories, multi-source data fusion (Semantic Scholar, CrossRef, DBLP), and empirical weight calibration.
- Coverage bias: OpenAlex is English-heavy and recent-heavy. Older or non-English publications may have lower coverage.
- Small-field problem: In small communities, nearly everyone is a co-author's co-author at depth 2, producing low BARON scores that reflect field size, not citation practice.
- HEROCON weights are experimental: Not empirically calibrated. Diverse profiles are robust to perturbation; concentrated profiles are sensitive.
- UNKNOWN creates a conditioned sample: If UNKNOWN citations are systematically different from classifiable ones, computed scores reflect a biased subset.
- Correlation, not causation: A DIRECT_COAUTHOR citation may be genuinely content-motivated. The tool detects structural potential for network-mediated citation, not actuality.
- Department-level matching is noisy: ROR lacks department-level identifiers for most institutions.
For the full methodology, conceptual foundations, tool landscape comparison, discussion of responsible research assessment alignment, and detailed limitations analysis, see the accompanying research paper:
Mahbub Ul Alam. Where do your citations come from? Citation-Constellation: A free, open-source, no-code, and auditable tool for citation network decomposition with complementary BARON and HEROCON scores, 2026. URL: https://arxiv.org/abs/2603.24216, arXiv:2603.24216, doi:10.48550/arXiv.2603.24216.
The paper is also available embedded within the web tool under the Full Research Paper tab.
@misc{alam2026citationconstellation,
title={Where Do Your Citations Come From? {C}itation-{C}onstellation: A Free, Open-Source, No-Code, and Auditable Tool for Citation Network Decomposition with Complementary {BARON} and {HEROCON} Scores},
author={Mahbub Ul Alam},
year={2026},
eprint={2603.24216},
archivePrefix={arXiv},
primaryClass={cs.DL},
url={https://arxiv.org/abs/2603.24216},
doi={10.48550/arXiv.2603.24216}
}Built on SciLifeLab Serve. Powered by OpenAlex, ORCID, and ROR.
MIT


















