This project automates the conversion of modeling results (CSV/Excel) from various energy system models into the pyam format. The goal is to standardize and efficiently prepare data for a data explorer or further analysis.
amigdala-variable-converter/
│
├── konverter/
│ ├── 1_lookup_files.py # Conversion logic for the PRISM model (example)
│ └── 1b_convert_regions_to_iso.py # Other model converters
│ └── 2_mapping_utils.py # Other model converters
│ └── 3_convert_to_yaml # convert variable name, description and unit from CSV to yaml format
├── input/ # Input files (CSV/Excel, structured by model)
│ └── ... # currently containing files from Amigdala Poc1 and Poc2 runs
│ └── Variableninfo #contains infos as CSV that should be converted to yaml file
│
├── output/ # Output files in pyam format
│ └── error_log.txt # file, which will always be created after running the mapping-utils
│ └── ... # all the converted pyam & yaml files
│
├── dictionary_dataexplorer_variables_translation.xlsm # Central mapping file/dictionary for variable names, regions, scenarios, models etc. Can be found on TNO sharepoint
├── overview_files_variables.txt # List of files and variables which are to be converted
└── README.md # This file
-
clone repository, install python, create virtual environment
git clone https://github.com/ddt-dechema/amigdala-variable-converter.git
useful links: https://code.visualstudio.com/docs/python/python-tutorial
python -m venv venv # Activate (Windows) venv\Scripts\activate # Activate (Mac/Linux) source venv/bin/activate
-
Install dependencies
pip install -r requirements.txt # or manually: pip install pandas openpyxl pip install ...Do not forget to fetch:
- model results file from TNO sharepoint
- The central dictionary or mapping file (
dictionary_dataexplorer_variables_translation.xlsm). This contains the mapping of original variables to pyam variables, as well as harmonized names for regions, scenarios etc. for all models. - please note that within this script, the file is renamed to "...-local.xlsm" to distinguish it from the one on the sharepoint. - The latest version of this file can be found on the TNO sharepoint
-
Edit the config.yaml
- Update the information in the config.yaml file:
- Where are the model results stored?
- What files should be post-processed? Please use the exemplary file as a basis.
You can usethe 1_lookup_files.pyto get an overview of files relevant for processing - Where is the dictionary saved at?
Here you should download and use the latest version on the sharepoint.
- Update the information in the config.yaml file:
-
Check for files to convert
-
First, the python script should be run which searches for all files to be converted.
python konverter\1_lookup_files.py`
-
This script looks in every sub-folder defined in the
config-file for.xslxand.csv-files -
if you would only like to include the files from a specific PoC run, indicate the corresponding folder
-
Each model runs should be saved in a subfolder.
-
The output of this python scripts lists the following information in the resulting file
overview_files_unsorted.xlsx.
They are almost already in the format necessary for the variables mapping-Excel file and contain these information:- model name
- folder name
- file name
- sheet name
- headers of the columns.
For the following processing, the files must contain information about:
variable names,region,year,unit,value
Please note, that some model output files contain multiple sheets, although not all of them are relevant.
UPDATE
- This script now looks for most of the required information automatically, such as
scenario,region,year,valueandunit. - Common names are stored within the code directly (lines 20-24)
- These information should then be put into the
overview_files.xlsxfile. - Please note that in case of the variable names, it might be possible that they are distributed over multiple columns, each with different names.
In that case the column names containing the variables must be concatenated with "|", e.g.:
Variable|Sector|Subsector|Carrier.
-
-
Run the conversion script
- Run the conversion script with:
python konverter/2_mapping_utils.py
- The script reads the input file(s), uses the dictionary (mapping file), and generates a pyam-compatible Excel file in the
output/folder for each listed excel/csv-file. - The first time this script runs, it might find some
variableswhich are not listed in the dictionary yet, possibly alsoregions,scenarioormodel names.
These are listed in the terminal and theerror_log.txtin theoutput/folder. - These information should be discussed bilaterally with the model owners and then updated in the dictionary file.
- Re-run the conversion with the updated dictionary file until there are no more errors.
- All converted files are stored in the
/output-folder namedpyam_MODELNAME_original-filename.xlsx.
NOTE : If there are multiple files for a model, they are aggregated into a single file, because this is required for uploading to the data explorer.
- Run the conversion script with:
-
Run the script to harmonize update region naming
- If necessary and some regions are named differently, e.g. with 2 or 3 letter codes or with other abbreviatons.
Use the following script to harmonize them:python konverter/1b_convert_regions_to_iso.py
- This script uses the
country_converterlibrary to convert the regions to full names. Custom region names are stored directly in a pythondict. - This script will convert the regions in the
dictionary_dataexplorer_variables_translation.xlsmfile the sheetregionsand create a new filedictionary_dataexplorer_variables_translation-local_regions_fullname.xlsxfile. - The information can then be used to update the region list in the dictionary file.
- If necessary and some regions are named differently, e.g. with 2 or 3 letter codes or with other abbreviatons.
-
convert new variables in yaml
- If necessary run conversion of infos on new variables, including name, description and unit, from CSV to yaml format, which is ready to implement in IIASA workflow
- save in the CSV
yaml_update, which is in the input folder invariable_infothe infos variable names, descriptions and units which should be converted to the yaml file with IAMC naming conventions. Use the fileyaml_update_templateas a template. - Run the conversion script with:
python konverter/3_import_csv.py
- The script reads the input folder from folder
variable_infothe CSVyaml_updateand it is converted to yaml file for the upload in the IIASA workflow. - the created yaml is saved as
outfile
- The mapping file is the central place for all variable, unit, and metadata harmonization. Changes are made here and immediately reflected in the conversion.
- The script is designed to handle both cases:
- Variables as columns (wide format)
- Variables as values in a column (long format, e.g., "sortingstream")
- Each model can have its own conversion script, or you can generalize the logic for batch processing.
- The project structure is designed to be easily extendable (add more models, more mappings, etc.).
- Check the variable lists and update the amigdala-workflow repo and inform IIASA via Pull request
- upload the data to the data explorer
The resulting Excel/CSV file will have the following structure:
Each row represents a unique combination of model, scenario, region, variable, and unit, with yearly data as columns:
| model | scenario | region | variable | unit | 2020 | 2025 | 2030 | ... |
|---|---|---|---|---|---|---|---|---|
| PRISM | 1. W2.4-EU net0 | AUT | Recycling | t | 247411.99 | 54031.48 | 55913.79 | ... |
This script was written by DDT (DECHEMA) for the AMIGDALA project. The AMIGDALA project is funded by the European Union under the grant agreement 101138534.
