Future of Work Data

This repository contains data and analysis tools for exploring job market data from ESCO (European Skills, Competences, Qualifications and Occupations) and O*NET (Occupational Information Network).

Data Sources

  • ESCO Dataset: European Skills, Competences, Qualifications and Occupations taxonomy, version 1.2.0 (website)
  • O*NET Dataset: Occupational Information Network database, version 29.2 (February 2025 Release) (website)

Project Structure

future-of-work-data/
│
├── data/
│   ├── raw/                  # Original CSV and Excel files
│   │   ├── esco/             # ESCO dataset files
│   │   │   └── 1.2.0/        # ESCO version 1.2.0 
│   │   └── onet/             # O*NET dataset files
│   │       └── 29.2/         # O*NET version 29.2
│   ├── duckdb/               # DuckDB databases
│   │   ├── esco_dataset_1.2.0.duckdb
│   │   └── onet_dataset_29.2.duckdb
│   └── derived/              # Derived datasets from SQL queries
│
├── src/                      # Source code
│   ├── etl/                  # ETL scripts for data processing
│   └── utils/                # Utility functions
│
└── sql/                      # SQL queries
    ├── esco/                 # ESCO-specific queries
    ├── onet/                 # O*NET-specific queries
    ├── crosswalk/            # Queries linking ESCO and O*NET
    └── views/                # Python scripts with SQL queries

Install for Development

Clone the repository:

git clone https://github.com/victoriano/future-of-work-data.git
cd future-of-work-data

Setup environment with uv:

uv sync --all-groups --all-extras

The --all-groups option installs the development and docs dependency groups (linters and similar tooling), and the --all-extras option installs optional dependencies such as notebook support.

Optional: register the uv environment as a notebook kernel:

uv run ipython kernel install --user --env VIRTUAL_ENV $(pwd)/.venv --name=fow

This will let you select the kernel fow associated with this environment in Jupyter or VS Code notebooks. You can replace "fow" with a kernel name of your choice.

Data Processing

Converting to DuckDB

The raw data is converted to DuckDB databases for efficient querying:

# Convert ESCO dataset to DuckDB
python -m src.etl.convert_esco_to_duckdb

# Convert O*NET dataset to DuckDB
python -m src.etl.convert_onet_to_duckdb

Usage Examples

Query the data with SQL

import duckdb

# Connect to the databases
esco_con = duckdb.connect('data/duckdb/esco_dataset_1.2.0.duckdb')
onet_con = duckdb.connect('data/duckdb/onet_dataset_29.2.duckdb')

# Example ESCO query
esco_occupations = esco_con.execute("SELECT * FROM occupations_en LIMIT 10").fetchdf()

# Example O*NET query 
onet_occupations = onet_con.execute("SELECT * FROM occupation_data LIMIT 10").fetchdf()

INE DIRCE Viewer

This repository now includes a minimal React viewer for:

data/processed/ine_dirce/ine_dirce_workflows_enriched_top20.csv

The viewer lives in the viewer/ directory and uses TanStack Table to render every column, with horizontal scrolling, semantic column ordering, and a modal that shows the full detail of the selected row.

Run it locally

Export the CSV to JSON for the frontend:

uv run python scripts/export_ine_dirce_for_viewer.py
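
The export script is not shown here. A CSV-to-JSON export of this kind can be sketched with the standard library alone, as below. The function name csv_to_json and the choice of a JSON array of row objects are assumptions for illustration; the real script may shape its output differently.

```python
import csv
import json
from pathlib import Path


def csv_to_json(csv_path: Path, json_path: Path) -> int:
    """Write the CSV rows as a JSON array of objects and return the row count.

    Hypothetical helper: every value stays a string, because csv.DictReader
    does not infer types.
    """
    with csv_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    # Create the output folder if it does not exist yet.
    json_path.parent.mkdir(parents=True, exist_ok=True)
    with json_path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps accented characters readable in the output.
        json.dump(rows, f, ensure_ascii=False, indent=2)
    return len(rows)
```

Keeping each row as an object keyed by column name fits a table component that looks up cell values by column id.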

Start the frontend:

cd viewer
bun install
bun run dev

Open the local URL printed by Vite, usually http://127.0.0.1:4173/.

During development you do not need to rebuild after every change: bun run dev starts Vite with hot module replacement (HMR), so changes to the React code appear in the browser immediately. Reserve bun run build for production bundles.

License

This project uses data from ESCO and O*NET (see Data Sources above).
