Skip to content

alezenonos/jobscy

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

63 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

JobsCY — Cyprus Job Market Visualiser

CI Python 3.10+ Ruff Coverage 46% Licence MIT Deployed on Vercel

An interactive treemap visualisation of the Cyprus labour market, with AI exposure scoring for occupations. Built on data from EU/Cyprus sources including HRDA/AnAD, Eurostat, and EURES Cyprus.

Created and maintained by Alexandros Zenonos, adapted from karpathy/jobs for the Cyprus labour market using EU data sources and international classifications (ISCO-08, NACE, ISCED).

What's here

An interactive treemap of 39 ISCO-08 occupation groups covering the Cyprus labour market. Each rectangle's area is proportional to employment and colour shows the selected metric — toggle between growth outlook, median pay (EUR), education level, and AI exposure.

Data is sourced from the Eurostat SDMX REST API (employment by ISCO-08 2-digit, earnings by ISCO-08 1-digit, filtered to geo=CY) and enriched with HRDA 2022-2032 occupation forecasts.

Data sources

Source Dataset What it provides Link
Eurostat LFSA_EGAI2D Employment by ISCO-08 2-digit, filtered to Cyprus (2024) SDMX API docs
Eurostat earn_ses_hourly Mean gross hourly earnings by ISCO-08 1-digit, Cyprus (SES 2022) Dataset explorer
HRDA/AnAD 309 occupation forecasts (2022-2032), expansion + replacement demand anad.org.cy
EURES Shortage/surplus occupations, vacancy statistics by ESCO occupation eures.europa.eu
CEDEFOP Skills forecasts to 2035 by sector and occupation group cedefop.europa.eu
CYSTAT Aggregate employment, unemployment, labour costs cystat.gov.cy

Earnings methodology

Salary figures are derived from the Eurostat Structure of Earnings Survey (earn_ses_hourly), the official EU-wide enterprise survey on wages:

  • Granularity: SES provides earnings at ISCO 1-digit (major group) level only — all 2-digit occupations under the same group share the same figure (e.g. all Professionals = €36,878/year)
  • Conversion: Hourly earnings × 2,080 hours (40 h/week × 52 weeks) = annual pay
  • Coverage: Enterprises with 10+ employees — excludes self-employed, micro-firms, and the informal economy
  • Survey year: 2022 (SES is published every 4 years; next wave: 2026)
  • Metric: Mean gross hourly earnings in EUR (MEAN_E_EUR)

These are statistical averages, not market salaries. Actual compensation for specific roles in Cyprus may differ significantly due to sector, firm size, seniority, and private-sector premiums not captured by SES.

Architecture

Cyprus pipeline:
  ISCO-08 data ──► generate_cy_occupations.py ──► occupations_cy.json
  Eurostat API ──► make_cy_csv.py ─────────────► occupations_cy.csv
                                                        │
  LLM (OpenRouter) ──► score.py ──► scores.json ◄──────┘
                                         │
                     build_site_data.py ──┘──► site/data.json ──► site/index.html

LLM-powered colouring

The repo includes a pipeline for scoring occupations using LLMs via OpenRouter. You write a prompt, the LLM scores each occupation, and the treemap colours accordingly. The "Digital AI Exposure" layer estimates how much current AI will reshape each occupation within the Cyprus/EU labour market context (considering tourism, shipping, financial services, EU Digital Decade targets).

Fork score.py to write your own scoring criteria — e.g. green economy relevance, remote work potential, EU Digital Decade alignment.

What "AI Exposure" is NOT:

  • It does not predict that a job will disappear
  • It does not account for demand elasticity, regulatory barriers, or social preferences
  • The scores are LLM estimates, not rigorous predictions

Key files

File Description
eurostat.py Eurostat SDMX 2.1 REST API client (employment + earnings data)
generate_cy_occupations.py Generate occupations_cy.json from ISCO-08 classification
make_cy_csv.py Build occupations_cy.csv from Eurostat data (EUR, ISCO-08)
score.py LLM-based AI exposure scoring via OpenRouter (ISCO-08 / Cyprus context)
build_site_data.py Merge CSV + scores → site/data.json
make_prompt.py Generate single-file LLM prompt from all data
site/index.html Interactive treemap visualisation (EUR, ISCO-08, Cyprus)
occupations_cy.json Master list of 39 ISCO-08 occupation groups
occupations_cy.csv Summary stats: pay (EUR), employment, education, ISCO codes
scores.json AI exposure scores (0-10) with rationales

Setup

uv sync                  # install dependencies
uv sync --extra dev      # includes pytest, ruff for development

Requires an OpenRouter API key in .env (for LLM scoring only):

OPENROUTER_API_KEY=your_key_here

Usage

# Generate occupation list from ISCO-08
uv run python generate_cy_occupations.py

# Fetch Eurostat data and build CSV
uv run python make_cy_csv.py

# Score AI exposure (uses OpenRouter API)
uv run python score.py

# Build website data
uv run python build_site_data.py

# Generate LLM analysis prompt
uv run python make_prompt.py

# Serve the site locally
cd site && python -m http.server 8000

Development

uv run pytest -v           # run tests (110+ tests)
uv run ruff check .        # lint
uv run ruff format .       # auto-format

CI runs automatically on push via GitHub Actions (lint + test + site validation). See CONTRIBUTING.md for development guidelines.

Deployment

The site is deployed as a static site on Vercel. Configuration is in vercel.json:

  • Output directory: site/
  • No build step — the site is pre-built and committed to the repository
  • Security headers: HSTS, CSP, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy
  • Caching: data.json is cached for 1 hour with 24-hour stale-while-revalidate

Currency and classifications

  • Currency: EUR (€) — all monetary values
  • Occupations: ISCO-08 (International Standard Classification of Occupations)
  • Sectors: NACE Rev. 2 (Statistical Classification of Economic Activities)
  • Education: ISCED 2011 (International Standard Classification of Education)
  • Data period: Eurostat latest available year, HRDA 2022-2032 forecasts

Acknowledgements

This project adapts the visualisation approach from karpathy/jobs (a US BLS occupational treemap) for the Cyprus labour market using EU data sources and international classifications.

About

A research tool for visually exploring Cyprus labour market data from Eurostat and HRDA/AnAD. This is not a report, a paper, or a serious economic publication — it is a development tool for exploring Cyprus employment data visually.

Topics

Resources

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

  • Python 72.5%
  • HTML 27.5%