Sonai124/career-intelligence-postgres-api

Career Intelligence Platform

A production-style SQL portfolio project that upgrades a resume analyzer into a PostgreSQL-backed job intelligence platform for technical and scientific careers.

This version is aligned with my other projects in the following domains:

  • quantum networking and scheduling
  • experiment automation and benchmarking
  • QAOA / QEC research prototypes
  • FastAPI, Docker, Linux, CI/CD
  • ML/NLP-assisted application tooling

Instead of stopping at notebook-style similarity scoring, this repo models a complete data product:

  • PostgreSQL as the system of record
  • Alembic migrations for schema evolution
  • repeatable ingestion scripts for jobs and resume versions
  • SQL analytics views/materialized views
  • FastAPI endpoints for ingestion + reporting
  • Streamlit dashboard for quick demos
  • pytest for core scoring logic
  • Docker Compose for local startup

Why this project exists

The platform is built around the following stages:

  1. raw documents are ingested
  2. data is normalized into relational tables
  3. skills are extracted into reusable feature tables
  4. matching runs are versioned
  5. reports are exposed through SQL views and an API
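As an illustration of stage 3, here is a minimal sketch of alias-based skill extraction. It is not the repo's actual extractor: the `SKILL_ALIASES` dictionary and the `extract_skills` function are hypothetical names that stand in for lookups against the `skills` and `skill_aliases` tables.

```python
import re

# Hypothetical alias map: alias text -> canonical skill name.
# In the repo this role is presumably played by the skills and
# skill_aliases tables; the entries here are illustrative only.
SKILL_ALIASES = {
    "postgres": "postgresql",
    "postgresql": "postgresql",
    "fastapi": "fastapi",
    "docker": "docker",
    "ci/cd": "ci-cd",
}


def extract_skills(text: str, aliases: dict[str, str] = SKILL_ALIASES) -> set[str]:
    """Return canonical skills whose aliases appear as whole tokens in text."""
    lowered = text.lower()
    found = set()
    for alias, canonical in aliases.items():
        # Lookarounds instead of \b so aliases containing punctuation
        # (for example "ci/cd") still match as whole tokens.
        pattern = r"(?<![a-z0-9])" + re.escape(alias) + r"(?![a-z0-9])"
        if re.search(pattern, lowered):
            found.add(canonical)
    return found


skills = extract_skills("Built a FastAPI service on Postgres with CI/CD.")
```

Mapping every alias to one canonical name is what would let later stages count a skill once regardless of how a resume or job posting spells it.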

This end-to-end structure makes the project a more credible portfolio piece for backend, data, analytics engineering, platform, or technical product roles than a standalone scoring script.

Product concept

Career Intelligence Platform answers questions like:

  • Which jobs fit the current CV best?
  • Which missing skills appear most often by role family?
  • How does fit change after a new resume version?
  • Which roles best match a background in quantum software, experiment automation, scientific ML, and backend systems?

Architecture

flowchart LR
    A[data/sample_resume.md or uploaded resume] --> B[scripts/ingest_resume.py]
    C[data/sample_jobs.csv or external jobs CSV] --> D[scripts/ingest_jobs.py]
    B --> E[(PostgreSQL)]
    D --> E
    E --> F[match_runs + job_matches + match_skill_gaps]
    F --> G[analytics views / materialized views]
    E --> H[FastAPI]
    H --> I[Streamlit dashboard]

Schema highlights

Core tables:

  • candidate_profiles
  • resume_versions
  • skills
  • skill_aliases
  • resume_skill_mentions
  • companies
  • role_families
  • job_postings
  • job_skill_requirements
  • match_runs
  • job_matches
  • match_skill_gaps
  • application_events

Professional touches included:

  • primary / foreign keys
  • uniqueness constraints
  • indexed timestamps
  • role family normalization
  • weighted skill requirements
  • analytics views in a dedicated schema

See docs/schema.md for the detailed model.

Repo layout

career-intelligence-postgres-api/
  app/
  alembic/
  dashboard/
  data/
  docs/
  scripts/
  sql/
  tests/
  docker-compose.yml
  README.md
  requirements.txt

Quick start

1. Start services

cp .env.example .env
docker compose up --build

2. Run database migration

docker compose exec api alembic upgrade head

3. Seed sample data

docker compose exec api python scripts/seed_data.py
docker compose exec api python scripts/create_views.py

4. Open the apps

  • FastAPI: http://localhost:8000/docs
  • Streamlit: http://localhost:8501

Local development without Docker

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
alembic upgrade head
python scripts/seed_data.py
python scripts/create_views.py
uvicorn app.main:app --reload

Example API calls

Ingest a new resume version

curl -X POST http://localhost:8000/api/v1/resumes/ingest \
  -H "Content-Type: application/json" \
  -d '{
        "file_path": "data/sample_resume.md",
        "version_label": "industry-v2"
      }'

Ingest jobs from CSV

curl -X POST http://localhost:8000/api/v1/jobs/ingest \
  -H "Content-Type: application/json" \
  -d '{"csv_path": "data/sample_jobs.csv"}'

Run matching for a resume version

curl -X POST http://localhost:8000/api/v1/match/run \
  -H "Content-Type: application/json" \
  -d '{"resume_version_id": 1}'

Get latest matches

curl "http://localhost:8000/api/v1/reports/latest-matches?limit=5"

Get top missing skills for quantum software roles

curl "http://localhost:8000/api/v1/reports/top-missing-skills?role_family=quantum-software&limit=10"
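The report endpoints above can also be called from Python. The sketch below uses only the standard library. The helper names `report_url` and `fetch_report` are hypothetical, and it assumes the endpoints return JSON, which is FastAPI's default but is not confirmed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the curl examples above.
BASE_URL = "http://localhost:8000/api/v1"


def report_url(report: str, **params) -> str:
    """Build a report URL, skipping query parameters that are None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/reports/{report}"
    return f"{url}?{query}" if query else url


def fetch_report(report: str, **params):
    """Fetch a report and parse the body as JSON (assumed response format)."""
    with urlopen(report_url(report, **params), timeout=10) as resp:
        return json.load(resp)


url = report_url("top-missing-skills", role_family="quantum-software", limit=10)
```

With the API running locally, `fetch_report("latest-matches", limit=5)` should issue the same request as the corresponding curl call.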

Example SQL

Latest job matches

SELECT *
FROM analytics.v_latest_job_matches
ORDER BY score DESC
LIMIT 10;

Top missing skills by role family

SELECT role_family_slug, skill_name, missing_count
FROM analytics.v_top_missing_skills
WHERE role_family_slug = 'quantum-software'
ORDER BY missing_count DESC, total_missing_weight DESC
LIMIT 10;
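To show the kind of aggregation a view such as `analytics.v_top_missing_skills` might perform, here is a self-contained sketch using an in-memory SQLite database. The flat `match_skill_gaps` layout, the `weight` column, and the sample rows are all assumptions made for illustration. The real table is most likely normalized with foreign keys and lives in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, denormalized stand-in for match_skill_gaps.
# The rows are invented sample data, not results from the project.
conn.executescript("""
CREATE TABLE match_skill_gaps (
    role_family_slug TEXT NOT NULL,
    skill_name       TEXT NOT NULL,
    weight           REAL NOT NULL
);
INSERT INTO match_skill_gaps VALUES
    ('quantum-software', 'qiskit', 3.0),
    ('quantum-software', 'qiskit', 2.0),
    ('quantum-software', 'rust', 1.0),
    ('backend', 'kubernetes', 2.0);
""")

# Count how often each skill is missing per role family and sum its weight,
# using the same ordering as the query above.
rows = conn.execute(
    """
    SELECT role_family_slug,
           skill_name,
           COUNT(*)    AS missing_count,
           SUM(weight) AS total_missing_weight
    FROM match_skill_gaps
    WHERE role_family_slug = ?
    GROUP BY role_family_slug, skill_name
    ORDER BY missing_count DESC, total_missing_weight DESC
    LIMIT 10
    """,
    ("quantum-software",),
).fetchall()

for row in rows:
    print(row)
```

Ordering by `missing_count` first and `total_missing_weight` second breaks ties in favour of the skills that job postings weight more heavily.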

Average fit by role family

REFRESH MATERIALIZED VIEW analytics.mv_role_family_fit;

SELECT role_family_slug, avg_score, jobs_evaluated
FROM analytics.mv_role_family_fit
ORDER BY avg_score DESC;

Notes

  • The scoring logic is explainable and intentionally transparent.
  • The SQL model is stronger than a single-script ML demo because it preserves history and supports reporting.
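One way a transparent, weighted score could work is sketched below. This is an assumed formula (matched requirement weight divided by total requirement weight), not necessarily the one the repo implements, and `score_job` is a hypothetical name.

```python
def score_job(
    resume_skills: set[str], requirements: dict[str, float]
) -> tuple[float, dict[str, float]]:
    """Return (score in [0, 1], missing skills with their weights).

    Assumed formula: the weight of required skills found in the resume
    divided by the total weight of all required skills.
    """
    total = sum(requirements.values())
    if total == 0:
        # A posting with no weighted requirements cannot be scored.
        return 0.0, {}
    gaps = {
        skill: weight
        for skill, weight in requirements.items()
        if skill not in resume_skills
    }
    matched = total - sum(gaps.values())
    return matched / total, gaps


score, gaps = score_job(
    {"python", "postgresql"},
    {"python": 3.0, "postgresql": 2.0, "kubernetes": 1.0},
)
```

Returning the gaps alongside the score keeps each result explainable: every point lost traces back to a named missing skill and its weight, which is the sort of data `match_skill_gaps` could store.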
