CS 300 — Data Structures & Algorithms

LaTeX lecture notes, compact study notes, and coach-prompt exercises for an undergraduate Data Structures & Algorithms course. Source material is taught in pseudocode; this rewrite ports everything to C++ (C++17 idioms; std::vector, std::list, std::unordered_map, etc.) and is supplemented with material aimed at real DSA mastery rather than course-passing.

Status: M1 ✅ M2 ✅ M3 ✅ M-UX ✅ M4 ✅ M5 ✅ M-Remediation ✅ closed through 2026-05-04 (T01–T27 ✅; host-gate validation passed the operator pre-merge run on 2026-05-04: Selenium 75/75 + smoke chain through assertion 15). M6 is next.

All six SNHU-required chapters (ch_1–ch_6) are augmented with CLRS + MIT OCW material. The site has migrated from Jekyll to Astro 6 and is deployed via GitHub Actions to https://yeevon.github.io/DSA/ (now 40 prerendered pages = 36 chapter routes + 3 collection-landing pages + 1 dashboard index). Content pipeline: chapters/*.tex → pandoc 3.1.3 + a small Lua filter → src/content/{lectures,notes,practice}/*.mdx → Astro static build with KaTeX math + Shiki syntax highlighting + a 5-component callout library matching the LaTeX source 1:1.

The state service (M3) runs locally via SQLite (Drizzle ORM) + the @astrojs/node adapter; annotations + read-status surfaces ride a data-interactive-only CSS contract that hides them on the public deploy and lights them up in local mode (where the companion aiw-mcp server from M4 will eventually run).

M-UX (UI/UX polish, T1–T9) rebuilt the chrome from a 51-line bare shell into the MDN-docs three-column layout: left-rail chapter nav (collection-aware), sticky breadcrumb with collection switcher + prev/next + functional links to collection-landing pages, right-rail in-chapter TOC (top-level numbered sections) with IntersectionObserver scroll-spy, mobile drawer below 1024px (CSS-only single-DOM-tree per T7 cycle 2), mastery-dashboard placeholder index. M3 surfaces (annotations pane, mark-read button, read-status indicators, section nav) were re-homed into the new chrome with all four event contracts (cs300:read-status-changed, cs300:toc-read-status-painted, cs300:annotation-added, cs300:drawer-toggle) preserved. M-UX added a Selenium verification harness (scripts/smoke-screenshots.py + scripts/functional-tests.py, headless Chrome with isolated /tmp/cs300-smoke-* profile) that runs 19 functional-test cases / 30 assertions on every code-task gate; caught + fixed a real sticky-breadcrumb regression at milestone close.

Optional chapters (ch_7, ch_9–ch_13) ship as committed-but-un-augmented; deeper review is deferred to the post-build content audit.

M4 (question generation) delivered the question_gen workflow module (Ollama Qwen 14B, all four question types) + the cs300 package skeleton. M5 (FSRS spaced repetition, in-browser review loop) ✅ closed 2026-05-03. M-Remediation closes the audit-substrate degradation surfaced during M4/M5 close-out: production-bug fixes, state-isolation hardening, browser delta-coverage smoke, doc reconciliation, npm script polish, and pending decisions. Two M-UX residuals (collapsible chapter sections; CompletionIndicator JSON → GET /api/sections endpoint, ~432 KB savings) are parked in design_docs/nice_to_have.md §UX-2 / §UX-4.

See design_docs/milestones/README.md for the full milestone index and design_docs/architecture.md for the system design.

What this is

Two things in one repo:

  1. Course notes. A rewrite of standard DSA material — arrays, lists, hash tables, trees, graphs, sorts — restructured around first principles, with C++ context, gotchas, and worked examples. Six chapters (7, 9, 10, 11, 12, 13) extend beyond the official course path for depth.

  2. Reference integration for jmdl-ai-workflows. The planned interactive features — question generation, retry / validator pairing, cost tracking, evals replay — all come from a separate framework. cs-300 contributes its domain-specific workflow modules (under ./workflows/) and runs them via the framework's aiw-mcp MCP server. cs-300 is the proving ground that demonstrates the framework does useful work end-to-end.


Local development

First-time setup

After cloning, run both npm install and uv sync on your host before launching the Docker sandbox. node_modules/, .venv/, and .astro/ are all shared between host and sandbox via the workspace bind mount; none are Docker named volumes. Per Q-SBX-01 and the M-Remediation T01 follow-up, the named-volume layout produced root-owned directories on the first docker compose up and broke every sandbox write under the node user. The host install is the source of truth; the sandbox reads through the bind mount.

npm install                           # one-time, on host (NOT in sandbox)
uv sync                               # one-time, on host (NOT in sandbox)

.astro/ regenerates on every Astro build; no host action needed beyond letting it land in the workspace.

If you fork or clone fresh inside a sandbox-only setup, drop the node_modules: / venv: / astro_cache: named-volume entries from docker-compose.yml (already done in this repo) and either run the installs from a host shell with the same UID, or wire an entrypoint that chowns the bind-mounted directories to the container user.

Two-process setup

The interactive review loop requires both the Astro state service (Node) and the aiw-mcp workflow server (Python). The canonical way to start both together is:

npm run dev

This uses concurrently to launch both astro dev (Astro on port 4321) and bash scripts/aiw-mcp.sh (aiw-mcp on port 8080) in parallel. In the terminal output, Astro is labelled blue and aiw-mcp magenta.

Proxy note (M-Remediation T01): astro dev proxies /aiw/* to http://127.0.0.1:8080/* via the Vite dev-server proxy in astro.config.mjs, so browser fetch calls to /aiw/mcp reach aiw-mcp as same-origin requests and the browser can read the Mcp-Session-Id response header without a CORS issue.
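The /aiw/* mapping in the proxy note above can be pictured with a small Python model. This is only an illustration of the path rewrite, not the actual Vite configuration: the helper name proxy_target is hypothetical, and the sketch assumes the /aiw prefix is stripped before forwarding, as the "/aiw/* to http://127.0.0.1:8080/*" wording suggests.

```python
from typing import Optional

# ASSUMPTION: the "/aiw" prefix is stripped before forwarding. The real
# rule lives in astro.config.mjs and may differ in detail.
AIW_PREFIX = "/aiw"
AIW_TARGET = "http://127.0.0.1:8080"


def proxy_target(path: str) -> Optional[str]:
    """Return the upstream URL for a dev-server path, or None if not proxied."""
    if path == AIW_PREFIX or path.startswith(AIW_PREFIX + "/"):
        # Drop the prefix; an empty remainder maps to the upstream root.
        return AIW_TARGET + (path[len(AIW_PREFIX):] or "/")
    return None
```

Under these assumptions, a browser request to /aiw/mcp is forwarded to http://127.0.0.1:8080/mcp, while ordinary page routes are served by Astro itself.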

If you want to start only one process (e.g. Astro is already running and you want to restart aiw-mcp only):

npm run dev:astro-only    # Astro only
npm run dev:aiw-only      # aiw-mcp only

LLM runtime + model

cs-300's LLM stack is Ollama (locked) + a configurable model (per ADR-0003).

  • OLLAMA_HOST — controls where aiw-mcp.sh and integration-smoke.sh probe Ollama. Default 127.0.0.1:11434. Set to host.docker.internal:11434 when running aiw-mcp inside the sandbox against host-side Ollama (already wired into docker-compose.yml).
  • OLLAMA_MODEL — overrides the default per-workflow model in cs300/workflows/{assess,grade,question_gen}.py. Defaults are pinned per module: assess uses qwen2.5-coder:32b; grade and question_gen use qwen2.5:14b. Setting OLLAMA_MODEL=<full-litellm-id> swaps all three. Per-workflow overrides (OLLAMA_MODEL_ASSESS, OLLAMA_MODEL_GRADE, OLLAMA_MODEL_QUESTION_GEN) take precedence when set.
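
The precedence described above (per-workflow override, then OLLAMA_MODEL, then the pinned default) might be resolved along these lines. This is a minimal sketch, not the code in cs300/workflows/: the resolve_model helper and the DEFAULT_MODELS table are hypothetical, and the model ids are copied as written above even though the real modules may store them in a different form.

```python
import os
from typing import Mapping, Optional

# Defaults as listed in this README; the real values are pinned per module.
DEFAULT_MODELS = {
    "assess": "qwen2.5-coder:32b",
    "grade": "qwen2.5:14b",
    "question_gen": "qwen2.5:14b",
}


def resolve_model(workflow: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: per-workflow override > OLLAMA_MODEL > default."""
    env = os.environ if env is None else env
    specific = env.get(f"OLLAMA_MODEL_{workflow.upper()}")
    if specific:
        return specific
    return env.get("OLLAMA_MODEL") or DEFAULT_MODELS[workflow]
```

With only OLLAMA_MODEL set, all three workflows resolve to that value; adding OLLAMA_MODEL_GRADE changes the result for grade alone.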

Prerequisites:

  • Ollama running at OLLAMA_HOST (default 127.0.0.1:11434). scripts/aiw-mcp.sh probes on startup and exits immediately with a clear message if not reachable.
  • The relevant models pulled. For the defaults:
    ollama pull qwen2.5-coder:32b
    ollama pull qwen2.5:14b
  • If using a custom OLLAMA_MODEL, pull that one instead.
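
For readers who want to check reachability themselves before starting aiw-mcp, a startup probe can be sketched in Python. This is a stand-in, not a port of scripts/aiw-mcp.sh: the function names are hypothetical, the probe is a plain TCP connect (the script may well use an HTTP request instead), and OLLAMA_HOST is assumed to be in host:port form.

```python
import socket
from typing import Tuple


def parse_ollama_host(value: str = "127.0.0.1:11434") -> Tuple[str, int]:
    """Split a host:port string such as OLLAMA_HOST into its parts."""
    host, _, port = value.rpartition(":")
    return host, int(port)


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A caller would parse OLLAMA_HOST (falling back to the default), call is_reachable, and exit with a clear message when it returns False.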

Sandbox (Docker)

The Docker sandbox provides a Linux toolchain with Node 22, pandoc 3.1.3, Python 3.12 + uv, and Claude Code, all pinned. See docker-compose.yml and Dockerfile for the full layout. Entry point: make shell.

The sandbox networks to host-side Ollama via host.docker.internal:11434 (extra_hosts is set to map it to the host gateway on Linux). aiw-mcp can run either on the host or inside the sandbox; both flows use the same OLLAMA_HOST env var.

For browser-driven smokes (python scripts/functional-tests.py), the sandbox adds a selenium/standalone-chrome sibling service and _selenium_helpers.py switches to a remote WebDriver path when CS300_SELENIUM_REMOTE=1 is set (automatically wired in docker-compose.yml).

Audio narration

Chapter narration is pre-generated TTS audio (OpenAI gpt-4o-mini-tts, voice fable) paired with a sentence-timestamp JSON file for in-player sentence highlighting.
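
To illustrate how sentence timestamps can drive highlighting, here is a small Python sketch of the lookup a player might perform on each time update. The JSON schema is an assumption: the start, end, and text field names are hypothetical, entries are assumed to be sorted by start time, and the real player runs in the browser rather than in Python.

```python
from bisect import bisect_right
from typing import List, Optional, TypedDict


class Sentence(TypedDict):
    start: float  # seconds from the start of the MP3 (assumed field name)
    end: float    # seconds; exclusive upper bound (assumed field name)
    text: str


def active_sentence(sentences: List[Sentence], t: float) -> Optional[int]:
    """Index of the sentence playing at time t, or None during a gap."""
    starts = [s["start"] for s in sentences]
    # Last sentence whose start is <= t, if any.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < sentences[i]["end"]:
        return i
    return None
```

Binary search keeps each lookup logarithmic in the number of sentences, which matters little for one chapter but keeps the time-update handler cheap.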

Mixed storage strategy (per ADR-0006):

  • ch_3 (demo chapter): public/audio/ch_3.mp3 + ch_3.timestamps.json are committed to the repo so GH Pages serves them statically.
  • All other chapters are gitignored. They are generated locally and played from the local Astro dev server only — they never reach the public deploy.

Setup: copy .env.example to .env and set your OpenAI key:

cp .env.example .env
# Edit .env and replace sk-replace-with-your-key with your real key.

Before generating, set a monthly hard cap on your OpenAI org dashboard to avoid runaway charges: https://platform.openai.com/settings/organization/limits

Generate audio for a chapter:

node scripts/generate-audio.mjs ch_3

This writes public/audio/ch_3.mp3, public/audio/ch_3.timestamps.json, and public/audio/ch_3.hash (idempotency sidecar). Re-running on unchanged source skips the API call (hash matches). Pass --force to regenerate regardless.
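
The hash-sidecar idea can be sketched in a few lines of Python. This is an illustration of the idempotency check only, not a port of scripts/generate-audio.mjs: the function names are hypothetical, and both the SHA-256 digest and the choice to hash the chapter's source text are assumptions about what the .hash file holds.

```python
import hashlib
from pathlib import Path


def source_digest(source_text: str) -> str:
    """Hex digest of the chapter source (SHA-256 is an assumption)."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


def needs_regeneration(source_text: str, hash_file: Path, force: bool = False) -> bool:
    """True when the audio should be regenerated for this source text."""
    if force or not hash_file.exists():
        return True
    # Unchanged source means the stored digest still matches: skip the API call.
    return hash_file.read_text().strip() != source_digest(source_text)


def record_hash(source_text: str, hash_file: Path) -> None:
    """Write the sidecar after a successful generation."""
    hash_file.write_text(source_digest(source_text) + "\n")
```

Writing the sidecar only after generation succeeds means a failed run is retried on the next invocation rather than being skipped.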


Pre-merge-to-main checklist

Before merging design_branch → main (which triggers the GH Pages deploy):

Verification

Run npm run pre-merge — this executes check + build + smoke:host and is the single source of truth for the verification recipe. All steps must exit 0.
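
The "all steps must exit 0" rule amounts to a sequential gate. A minimal Python sketch of that pattern follows; the run_gate helper is hypothetical, stopping at the first failure is an assumption, and the real recipe is whatever the pre-merge script in package.json defines.

```python
import subprocess
import sys
from typing import List, Sequence


def run_gate(steps: Sequence[List[str]]) -> int:
    """Run commands in order; return 0 only if every step exits 0."""
    for cmd in steps:
        code = subprocess.run(cmd).returncode
        if code != 0:
            print(f"step failed ({code}): {' '.join(cmd)}", file=sys.stderr)
            return code  # ASSUMPTION: stop at the first failing step
    return 0
```

Assuming check, build, and smoke:host are npm script names, a call such as run_gate([["npm", "run", "check"], ["npm", "run", "build"], ["npm", "run", "smoke:host"]]) would mirror the chain described above.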

Merge ceremony

Follow .claude/skills/ship/SKILL.md (operator-only). The skill owns the dist secret-scan, user approval, and the actual merge to main.


How to use it (today)

Read the rendered PDFs (one per chapter under chapters/ch_N/), or browse the GitHub Pages site as a static viewer. The public site has no interactive features; those run only in local development.

Build a chapter's PDFs:

cd chapters/ch_4
pdflatex -interaction=nonstopmode -halt-on-error lectures.tex
pdflatex -interaction=nonstopmode -halt-on-error notes.tex

Required TeX Live packages: geometry, parskip, xcolor, pagecolor, hyperref, titlesec, enumitem, listings, tcolorbox (with its most library), titling, multicol, array. Optional but recommended: libertine, inconsolata (the preamble falls back gracefully if missing).


Repository layout

cs-300/
├── notes-style.tex                  # shared LaTeX preamble (callout boxes, listings styles)
│
├── chapters/                        # source of truth — one folder per chapter
│   ├── ch_1/  (required)            # programming basics, arrays, vectors
│   ├── ch_2/                        # algorithms, recursion, greedy, DP
│   ├── ch_3/                        # ADTs, Big-O, sorting
│   ├── ch_4/                        # lists, stacks, queues, deques
│   ├── ch_5/                        # hash tables
│   ├── ch_6/                        # trees and BSTs
│   ├── ch_7/  (optional)            # heaps and priority queues
│   ├── ch_9/                        # AVL and red-black trees
│   ├── ch_10/                       # graphs
│   ├── ch_11/                       # B-trees
│   ├── ch_12/                       # sets
│   └── ch_13/                       # extra sorts and list idioms
│
├── coding_practice/                 # prompt corpus consumed by cs-300 workflow modules
│   ├── cplusplus/
│   ├── psuedo/
│   └── python/
│
├── cs300/                           # Python package — cs-300 workflow modules (M4)
│   └── workflows/                   # question_gen, assess, grade — registered into aiw-mcp
│
├── design_docs/
│   ├── architecture.md              # architecture of record
│   ├── roadmap_addenda.md           # canonical operational roadmap
│   ├── nice_to_have.md              # deferred parking lot
│   ├── adr/                         # architecture decision records
│   ├── chapter_reviews/             # per-chapter Step-1 inventories + Step-2 gap reports
│   └── milestones/                  # milestone specs + task breakouts + issue logs
│
├── agent_docs/                      # long-running pattern docs for Claude Code workflows
│
├── drizzle/                         # Drizzle ORM migration SQL files (M3)
│
├── runs/                            # long-running task artifacts (plan.md / progress.md / cycle summaries)
│
├── tmp/                             # session-scoped scratch: analysis files, remediation plans
│
├── src/                             # Astro app (M2)
│   ├── pages/                       # routes: index + dynamic [id].astro per collection
│   │   └── api/                     # local-only state service routes (not in dist/)
│   ├── content/                     # generated MDX (gitignored; regenerated by prebuild)
│   ├── content.config.ts            # collection schemas (lectures, notes, practice)
│   ├── layouts/                     # Base.astro shell
│   └── components/callouts/         # 5 LaTeX-env components + CodeBlock
│
├── public/                          # Astro static assets
│   └── audio/                       # M7 audio drop site (currently empty + .gitkeep)
│
├── scripts/
│   ├── pandoc-filter.lua            # MDX-friendly LaTeX → markdown filter
│   ├── build-content.mjs            # prebuild: pandoc each chapter → src/content/*.mdx
│   ├── chapters.json                # per-chapter metadata (title, subtitle, n, required)
│   └── *-smoke.mjs                  # node smoke harnesses (db, annotations, read-status, seed, …)
│
├── .github/workflows/deploy.yml     # Astro build + deploy-pages
├── .claude/                         # subagent prompts, slash commands, skills (committed)
├── astro.config.mjs                 # site, base, MDX integration, math plugins
├── tsconfig.json
├── package.json
├── pyproject.toml / uv.lock         # Python (cs300 workflow modules)
├── Dockerfile / docker-compose.yml  # filesystem-scoped agent sandbox
├── .nvmrc                           # Node 22
└── .pandoc-version                  # pandoc 3.1.3

What each chapter folder contains

Every chapters/ch_N/ holds:

  • lectures.tex / lectures.pdf — the full chapter. Opens with a chapter map (where this sits in the course, what you'll add to your toolkit, 7-item mastery checklist) and closes with a cross-reference box pointing forward.
  • notes.tex / notes.pdf — compact two-page reference: cost tables, common patterns, gotchas. Pre-exam / pre-assignment review, not a substitute for the lectures.
  • practice.md — twelve self-contained coach prompts. Paste any one into a fresh LLM session; it makes you work through problem → pseudocode → C++ → critique, withholding answers until you submit. Each file ends with a meta-drill for timed practice.

Architecture

The system is static-by-default. Public deploy is just static HTML. Two local-only sibling processes light up the interactive features when present; if they're absent, the UI degrades cleanly to read-only:

  • aiw-mcp (Python) — the MCP server shipped by jmdl-ai-workflows, orchestrating cs-300's workflow modules from ./workflows/ over the local Ollama (Qwen) tier.
  • State service (Node) — Astro API routes under src/pages/api/ owning local SQLite via Drizzle.

Full design in design_docs/architecture.md. Operational plan (milestones M1–M7, each with its own README and task breakouts) in design_docs/milestones/. The phased roadmap lives in Google Drive (interactive_notes_roadmap.md); local addenda (sequencing, deferred decisions, Phase 1 acceptance criteria) in design_docs/roadmap_addenda.md.

Settled tech worth flagging up front:

  • Site: Astro (post-Phase-2), static output for GitHub Pages.
  • Content build: pandoc + a Lua filter, chapters/*.tex → src/content/*.mdx.
  • State: SQLite (Drizzle ORM), local-only.
  • Bridge: aiw-mcp (jmdl-ai-workflows ≥0.2.0, Python ≥3.12) over the streamable-HTTP transport on port 8080, browser ↔ cs-300 workflow modules. v0.2.0 (2026-04-24) shipped the AIW_EXTRA_WORKFLOW_MODULES external-workflow-module discovery feature cs-300 was waiting on; M4 unblocked 2026-04-25.
  • Scheduling: FSRS via ts-fsrs.
  • Audio: pre-generated TTS MP3s + sentence-timestamp JSON.
  • Question generation: local Ollama; no cloud LLM APIs at runtime.

Conventions

  • Style preamble (notes-style.tex) owns all colors and callout boxes. Don't redefine them per chapter.
  • Callout boxes:
    • defnbox — definitions
    • ideabox — key ideas, intuitions
    • warnbox — gotchas / things to watch for
    • examplebox — worked examples
    • notebox — asides, tangents, further reading
  • Code listings go in lstlisting with language=C++ when applicable.
  • Chapter bookends: chapter-opening map and end cross-reference box; see chapters/ch_1/lectures.tex for the template.
  • Chapters 7, 9, 10, 11, 12, 13 are deliberately outside the course's required sequence — extra depth, not part of the graded path.

Notes-writing principle

These notes are a rewrite, not a copy. Source material is read, restructured around first principles, and supplemented with C++ context, real-library notes (std::sort, std::unordered_map, etc.), and the gotchas worth remembering on a second pass. Scrape-and-paste from source defeats the point.


License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) for everything in this repo — content and code alike. See LICENSE for the full canonical legal text and the scope statement.

This is personal course material that augments the SNHU CS 300 syllabus with concepts referenced from MIT OpenCourseWare 6.006 and CLRS. Non-commercial use only. No multi-user deployment. No cloud LLM APIs at runtime — question generation runs against local Ollama.


Changelog

See CHANGELOG.md for everything that's changed, including small ops.
