What’s inside
A prototype KB (`dsrs_kb_source.csv`), a NetworkX graph converted to an
interactive PyVis HTML page (`dsrs_graph.html`), and a fully reproducible notebook
(`DSRS_KB_Prototype.ipynb`).
How to run
- Open `DSRS_KB_Prototype.ipynb` and run Runtime ▸ Run all.
- When execution finishes, double-click `dsrs_graph.html` in the file browser to explore the graph.
Key files
| File | Purpose |
|---|---|
| `dsrs_kb_source.csv` | Source rows for the knowledge base (23 rows) |
| `dsrs_graph.html` | Interactive graph (23 nodes / 20 edges) |
| `DSRS_KB_Prototype.ipynb` | Notebook that builds the CSV, graph and demo queries |
Sources
All KB rows cite:
- `DSRS Strategy.pdf`
- Public pages on dsrs.illinois.edu
- LinkedIn post permalinks (last 6 months)
| Task | Owner | Frequency | Tooling |
|---|---|---|---|
| Automated scrape of DSRS website and LinkedIn (new posts, new datasets) → append to `dsrs_kb_source.csv` | Infrastructure | nightly cron on DSRS server | Python + GitHub Actions |
| CSV diff review: check duplicates / stale rows flagged by `last_updated` > 90 days | Sub-unit leads | monthly | notebook validator |
| Manual additions (new projects, workshop decks) | Services & Data Hub staff | ad-hoc | KB template form (Google Form → Zapier → CSV row) |
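
The monthly CSV diff review in the table above could be sketched roughly as follows. This is a minimal illustration, not the notebook validator itself. It assumes `last_updated` holds an ISO 8601 date and treats two rows as duplicates when every field other than `id` and `last_updated` matches. The function name `review_rows`, the `label` column and the sample rows are hypothetical.

```python
import csv
import io
from datetime import date, timedelta

STALE_AFTER = timedelta(days=90)  # threshold taken from the table above


def review_rows(csv_text, today):
    """Return (duplicate_ids, stale_ids) for KB rows given as CSV text.

    Assumes every row has `id` and `last_updated` (ISO date) columns.
    """
    seen = set()
    duplicates, stale = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Duplicate rule (an assumption for this sketch): all fields
        # except id and last_updated are identical to an earlier row.
        key = tuple(sorted((k, v) for k, v in row.items()
                           if k not in ("id", "last_updated")))
        if key in seen:
            duplicates.append(row["id"])
        seen.add(key)
        # A row is stale when it was last saved more than 90 days ago.
        if today - date.fromisoformat(row["last_updated"]) > STALE_AFTER:
            stale.append(row["id"])
    return duplicates, stale


# Hypothetical sample rows, not taken from dsrs_kb_source.csv.
sample = """id,label,source,last_updated
p1,Project A,https://dsrs.illinois.edu/,2024-01-05
p2,Project A,https://dsrs.illinois.edu/,2024-05-20
p3,Dataset B,DSRS Strategy.pdf,2024-05-01
"""
duplicates, stale = review_rows(sample, today=date(2024, 6, 1))
print("duplicates:", duplicates)  # p2 repeats p1's content
print("stale:", stale)            # p1 is older than 90 days
```

Passing `today` explicitly keeps the check deterministic, which makes the review reproducible when it is rerun on an older snapshot of the CSV.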
- All rows include a `source` URL/PDF; rows without a source fail the CI check.
- `last_updated` timestamp is overwritten on every save so ageing rows are obvious.
- Graph build step refuses to run if duplicate `id` values are detected.
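
The source and duplicate-`id` checks listed above might look something like this. It is a sketch under stated assumptions rather than the actual CI check: `KBValidationError`, `validate_rows` and `load_validated` are hypothetical names, and the sample CSV is made up for illustration.

```python
import csv
import io
from collections import Counter


class KBValidationError(ValueError):
    """Raised when KB rows fail a quality check (hypothetical name)."""


def validate_rows(rows):
    """Raise KBValidationError if a row lacks a source or an id repeats."""
    problems = []
    for row in rows:
        # An empty or whitespace-only source counts as missing.
        if not (row.get("source") or "").strip():
            problems.append(f"row {row.get('id')!r} has no source")
    for row_id, count in Counter(row.get("id") for row in rows).items():
        if count > 1:
            problems.append(f"id {row_id!r} appears {count} times")
    if problems:
        raise KBValidationError("; ".join(problems))


def load_validated(csv_text):
    """Parse CSV text and return its rows only if validation passes."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    validate_rows(rows)  # a graph build would start only after this passes
    return rows


# Hypothetical input: the second row reuses an id and has no source.
bad = "id,source\nn1,DSRS Strategy.pdf\nn1,\n"
try:
    load_validated(bad)
    message = ""
except KBValidationError as err:
    message = str(err)
print(message)
```

Collecting every problem before raising means one CI run reports all offending rows instead of stopping at the first.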
- New project kickoff → PM completes a 5-line KB form; CI merges it automatically.
- LinkedIn post → Zapier webhook auto-creates a `news` row with `post_date`.
- Dataset on-boarding in Data Hub playbook now ends with “add dataset to KB” step.
- Monthly DSRS all-hands starts with the KB growth chart so gaps are visible.
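
As an illustration of the LinkedIn trigger above, the following sketch maps a webhook payload to a `news` row and appends it to a CSV. Everything about the payload shape is an assumption: the keys `post_id`, `title`, `permalink` and `published` are hypothetical, as are the `type` and `label` columns, the column order and the example permalink. The real Zapier payload and CSV schema may differ.

```python
import csv
import io
from datetime import datetime, timezone

# Assumed column order. Only id, source, post_date and last_updated are
# named in the text above; type and label are hypothetical.
FIELDS = ["id", "type", "label", "source", "post_date", "last_updated"]


def news_row_from_post(payload, now):
    """Map a (hypothetical) webhook payload to a KB row of type news."""
    return {
        "id": f"news-{payload['post_id']}",
        "type": "news",
        "label": payload["title"],
        "source": payload["permalink"],      # every row needs a source
        "post_date": payload["published"],
        "last_updated": now.date().isoformat(),
    }


def append_row(csv_file, row):
    """Write one row, without a header, to an open CSV file object."""
    csv.DictWriter(csv_file, fieldnames=FIELDS).writerow(row)


# Made-up payload; an in-memory buffer stands in for the CSV file.
payload = {
    "post_id": "123",
    "title": "Workshop recap",
    "permalink": "https://www.linkedin.com/posts/example",
    "published": "2024-05-30",
}
buffer = io.StringIO()
row = news_row_from_post(payload, now=datetime(2024, 6, 1, tzinfo=timezone.utc))
append_row(buffer, row)
print(buffer.getvalue())
```

With a real file, opening it as `open(path, "a", newline="")` lets the `csv` module control line endings when appending.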
- Migrate CSV to a lightweight Postgres table exposed via REST.
- Build a Neo4j mirror for advanced graph queries.
- Embed a Streamlit KB search app on the DSRS intranet.
- Add unit tests to guarantee schema consistency before every merge.
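
A schema-consistency test of the kind mentioned in the last item could start from something like the sketch below. The required column set is an assumption based on the columns named earlier (`id`, `source`, `last_updated`), and `schema_problems` is a hypothetical helper, not part of the notebook.

```python
import csv
import io
import unittest

# Assumed required columns; the real schema is whatever
# dsrs_kb_source.csv actually uses.
REQUIRED = {"id", "source", "last_updated"}


def schema_problems(csv_text):
    """Return a list of schema problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    # Line numbers assume one CSV record per physical line.
    for line_no, row in enumerate(reader, start=2):
        # DictReader files surplus cells under the key None and fills
        # absent cells with None, so either one signals a ragged row.
        if None in row or None in row.values():
            problems.append(f"line {line_no}: wrong number of cells")
    return problems


class SchemaTest(unittest.TestCase):
    def test_consistent_rows_pass(self):
        good = "id,source,last_updated\nn1,DSRS Strategy.pdf,2024-05-01\n"
        self.assertEqual(schema_problems(good), [])

    def test_ragged_row_is_reported(self):
        ragged = "id,source,last_updated\nn1,DSRS Strategy.pdf\n"
        self.assertEqual(schema_problems(ragged),
                         ["line 2: wrong number of cells"])
```

Saved as a test module, this can be run with `python -m unittest` as a pre-merge step.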