AI4OPT/TS4PS-RTE7000

TS4PS

Time Series reconstruction for Power Systems

  1. Installation instructions
    1. Python setup
    2. RTE OpenAPI
    3. Download RTE7000 dataset
  2. External software
  3. Data sources
  4. Running the code

Installation instructions

Python setup

The recommended workflow is to use uv.

  1. Install uv:
    • For HPC users where uv is already installed, load uv with
      module load uv
    • To install your own uv distribution, see the uv installation instructions
  2. Install dependencies
    uv sync

Download RTE7000 dataset

⚠️ warning ⚠️

The D-GITT-RTE7000 dataset takes ~500GB of disk space and may take several hours to download. If running in an HPC environment, it is recommended to store the dataset on fast scratch storage and point a symlink at it, to avoid straining network drives.
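On an HPC system, the symlink approach might look like the following sketch. The scratch path is an assumption: substitute your cluster's fast storage location (many clusters expose it via $SCRATCH).

```shell
# Hypothetical scratch location -- adjust for your cluster.
SCRATCH_DIR="${SCRATCH:-$HOME/scratch}/D-GITT-RTE7000"

mkdir -p "$SCRATCH_DIR"
mkdir -p data
# Point the expected dataset path at scratch storage instead of a network drive.
ln -sfn "$SCRATCH_DIR" data/D-GITT-RTE7000
```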

  1. Install git lfs
    git lfs install
  2. Create directory structure and download RTE7000 dataset
    mkdir -p data/D-GITT-RTE7000
    cd data/D-GITT-RTE7000
    git clone https://huggingface.co/datasets/OpenSynth/D-GITT-RTE7000-2021
    git clone https://huggingface.co/datasets/OpenSynth/D-GITT-RTE7000-2022
    git clone https://huggingface.co/datasets/OpenSynth/D-GITT-RTE7000-2023

Data sources

| Data | Source | License |
|---|---|---|
| Network topology snapshots | D-GITT-RTE7000 (split across 3 datasets) | CC-BY-SA 4.0 |
| Regional load & generation time series | eco2mix | Open License v2 |
| Actual generation per unit | ENTSOE Transparency Platform | CC-BY 4.0 (see terms and conditions) |
| Shapefiles of France's regions | France GeoJSON | Open License v2 |
| List of generators in France | Open Data Reseaux Energies (in French) | Open License v2 |
| RTE substation geodata | Open Data Reseaux Energies (in French) | Open License v2 |

For convenience, geodata is included in this repository (see data/geodata directory). Substation geodata only includes the name and code of the region where each substation is located.

Running the code

Process RTE7000 data

The first step of the data procedure is to process the approximately 300,000 XIIDM files into an I/O-efficient, tabular format.

  1. Create output directory structure

    # bash brace expansion creates all 14 directories
    mkdir -p data/TS4PS-RTE7000/{CSV,parquet}/{branch,bus,gen,load,sub,switch,vol}
  2. Extract network components from XIIDM to CSV files

    • If you are on a SLURM cluster, update the charge account in slurm/extract_components.sbatch as appropriate and run
      mkdir slurm/logs/RTE7k-extract
      sbatch slurm/extract_components.sbatch
    • If you are not running on a SLURM cluster, or only want to process a small batch of files, you can run the extraction script directly. To view the code's documentation, run
      uv run src/ts4ps/datatools/rte7000/extract_components.py --help
      For instance, to process only the XIIDM files on Jan 1st, 2021, run
      uv run src/ts4ps/datatools/rte7000/extract_components.py data/D-GITT-RTE7000 data/TS4PS-RTE7000 --datetime-begin 2021-01-01 --datetime-end 2021-01-02
  3. Convert from CSV to Parquet files, which can be read with pyarrow

    If you are on a SLURM cluster, update slurm/csv2parquet.sbatch as appropriate and run

    mkdir slurm/logs/RTE7k-csv2pq
    sbatch slurm/csv2parquet.sbatch

Download and process public time series data

  1. Download and process regional time series

    bash scripts/download_eco2mix_timeseries.sh
    uv run src/ts4ps/datatools/eco2mix/process_regional_time_series.py data/TS4PS-RTE7000/eco2mix/regional_cons
  2. Download and process generator information

    bash scripts/download_generator_register.sh
    uv run src/ts4ps/datatools/eco2mix/generator_register.py data/TS4PS-RTE7000/eco2mix/generators
  3. Download and process actual generation for large units (>100MW)

    1. Download ENTSOE's Actual Generation per Unit data. This yields 36 CSV files (one per month), which should be placed in data/TS4PS-RTE7000/entsoe/raw

    2. Extract France-only data

      mkdir data/TS4PS-RTE7000/entsoe/fr
      chmod u+x scripts/clean_entsoe.sh  # ensure script is executable
      bash scripts/clean_entsoe.sh

      ⚠️ Some of the raw CSV lines have corrupted entries, which trigger errors in pandas' CSV reader. Fortunately, no data corresponding to RTE generators is corrupted ⚠️

    3. Consolidate France-only data

      uv run src/ts4ps/datatools/entsoe.py data/TS4PS-RTE7000/entsoe
    4. [Optional] Verify data integrity

      md5sum data/TS4PS-RTE7000/entsoe/actual_generation_unit_output.csv

      should output ac7df11c9cff145e4c11f3b453a96244.

      ⚠️ The above signature may not be valid if changes were made to raw data. ⚠️
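For reference, the pandas failure mode mentioned in the warning above can be worked around by skipping malformed rows. This is a generic sketch with made-up data, not the actual logic of scripts/clean_entsoe.sh.

```python
import io

import pandas as pd

# Made-up CSV where one row has a stray extra field, mimicking a corrupted entry.
raw = "unit,mw\nU1,100\nU2,200,EXTRA\nU3,300\n"

# on_bad_lines="skip" drops rows whose field count does not match the header.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df["unit"].tolist())  # the corrupted U2 row is dropped
```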

Geodata reconstruction

Network geodata information is reconstructed approximately by matching substations in the RTE7000 dataset to substations listed in the ODRE data.

Execute the geodata reconstruction script as follows:

uv run src/ts4ps/reconstruction/match_substations.py data/geodata data/TS4PS-RTE7000

which will export the reconstructed data into a CSV file located at data/TS4PS-RTE7000/substations_matching.csv.

⚠️⚠️ This reconstruction is heuristic, and its output should not be expected to be exact ⚠️⚠️

The exported CSV file has the following structure

| id_rte7000 | vnom_rte7000 | id_eco2mix | vnom_eco2mix | how_matched | region_code | generators_rte7000 |
|---|---|---|---|---|---|---|
| .CTLH | 63.0 | COTEL | 90.0 | score-based | 44 | [[".CTLO3GROUP.1", "HYDRO"], [".CTLO3GROUP.2", "HYDRO"]] |
| .CTLO | 63.0 | Q.CRO | 63.0 | score-based | 24 | [] |
| .G.RO | 225.0 | G.ROU | 225.0 | score-based | 52 | [] |
| .NAVA | 63.0 | NAVA6 | 380.0 | score-based | 28 | [] |

where

  • id_rte7000 and id_eco2mix are the substation ID in the RTE7000 dataset and eco2mix data, respectively
  • vnom_rte7000 and vnom_eco2mix are the substation's maximum nominal voltage levels in the RTE7000 and eco2mix data, respectively, in kilovolts (kV).
  • how_matched describes the rationale for the matching
  • region_code is the INSEE code of the region where the substation is located. Missing values are replaced with 0.
  • generators_rte7000 denotes the generators (ID and fuel type) that are attached to the substation in the RTE7000 dataset. This information is stored in JSON format.
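The JSON-encoded generators_rte7000 column can be decoded with the standard library. The sketch below parses an inline sample mirroring the rows shown above (note the doubled quotes, which are standard CSV escaping).

```python
import csv
import io
import json

# Inline sample mirroring the exported file's structure (rows shown above).
sample = (
    'id_rte7000,vnom_rte7000,id_eco2mix,vnom_eco2mix,how_matched,region_code,generators_rte7000\n'
    '.CTLH,63.0,COTEL,90.0,score-based,44,"[["".CTLO3GROUP.1"", ""HYDRO""], ["".CTLO3GROUP.2"", ""HYDRO""]]"\n'
    '.CTLO,63.0,Q.CRO,63.0,score-based,24,[]\n'
)

rows = list(csv.DictReader(io.StringIO(sample)))
# generators_rte7000 is JSON: a list of [generator_id, fuel_type] pairs.
generators = json.loads(rows[0]["generators_rte7000"])
hydro_units = [gen_id for gen_id, fuel in generators if fuel == "HYDRO"]
print(hydro_units)
```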

Online capacity vs actual regional totals

Run

uv run src/ts4ps/viz/capacity.py

and follow the instructions in the terminal to open the Dash app in your browser.

Load and generation disaggregation

  • Generation dispatches are disaggregated by region and fuel type, proportional to online capacity

    sbatch slurm/disaggregate_generation.sbatch

    This script will replace the "target_p" column in the generator parquet files.

  • Individual loads are disaggregated using pre-populated disaggregation weights

    sbatch slurm/disaggregate_load.sbatch

    This script will populate the "p0" column in the load parquet files.
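Conceptually, the proportional-to-capacity rule used for generation splits a regional total across units in proportion to each unit's online capacity. A minimal sketch, with made-up unit names and numbers:

```python
# Made-up regional total and per-unit online capacities (MW).
regional_total_mw = 900.0
online_capacity_mw = {"UNIT_A": 300.0, "UNIT_B": 600.0}

# Each unit receives its capacity share of the regional total.
total_capacity = sum(online_capacity_mw.values())
target_p = {
    unit: regional_total_mw * capacity / total_capacity
    for unit, capacity in online_capacity_mw.items()
}
print(target_p)  # shares sum back to the regional total
```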

About

Code for producing time series for the RTE7000 dataset
