
OSEMOSYS to NEMO converter v0.1 #11

Open: IainDM wants to merge 22 commits into sei-international:master from IainDM:main

IainDM commented Apr 27, 2026

@jwveysey First draft of a basic OSeMOSYS to NEMO converter. It takes SQLite or otoole-format OSeMOSYS files and converts them to NEMO. Tests of basic functionality are included, among them a round trip for a standard NEMO dataset.

Note that this also adjusts the dependencies so that the test suite runs on a machine without the Gurobi/GLPK solvers. Tests for those solvers should still run if the solvers are installed.

claude and others added 22 commits April 1, 2026 15:28
Add convert_osemosys() function that converts an OSeMOSYS SQLite scenario
database to NemoMod format. Handles dimension tables, time slice structure
mapping (SEASON/DAYTYPE/DAILYTIMEBRACKET -> TSGROUP1/TSGROUP2/LTsGroup),
compatible parameter copying, and schema transformations for parameters
where NemoMod adds dimensions (AvailabilityFactor, ReserveMargin,
REMinProductionTarget).

https://claude.ai/code/session_013zj3e8pUcko4g4LbtHvTHp
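
As an illustration of the "NemoMod adds dimensions" case described above, the sketch below expands a region/year parameter over every fuel with a cross join. It is written in Python with the standard `sqlite3` module, not the PR's Julia code. The table `ReserveMarginOut`, the column names, and the choice to expand over all fuels are assumptions made for the example; the commit message does not spell out the exact rule.

```python
import sqlite3


def expand_over_fuels(conn):
    """Copy ReserveMargin(r, y, val) into ReserveMarginOut(r, f, y, val),
    one row per fuel. Returns False (and writes nothing) if no fuels exist."""
    fuel_count = conn.execute("SELECT COUNT(*) FROM FUEL").fetchone()[0]
    if fuel_count == 0:
        return False  # nothing to expand over, so skip the transformation
    conn.execute(
        "INSERT INTO ReserveMarginOut (r, f, y, val) "
        "SELECT rm.r, fu.val, rm.y, rm.val "
        "FROM ReserveMargin AS rm CROSS JOIN FUEL AS fu"
    )
    return True


# Hypothetical miniature source data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE FUEL (val TEXT);
    CREATE TABLE ReserveMargin (r TEXT, y INTEGER, val REAL);
    CREATE TABLE ReserveMarginOut (r TEXT, f TEXT, y INTEGER, val REAL);
    INSERT INTO FUEL VALUES ('F1'), ('F2');
    INSERT INTO ReserveMargin VALUES ('R1', 2030, 1.15);
    """
)
expanded = expand_over_fuels(conn)
rows = conn.execute(
    "SELECT r, f, y, val FROM ReserveMarginOut ORDER BY f"
).fetchall()
```

With two fuels and one source row, `rows` holds two rows that differ only in the fuel column.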
The function now accepts either an OSeMOSYS SQLite database or a directory
of otoole CSV files. When given a CSV directory, it runs
`otoole convert csv sqlite` to create a temporary SQLite database, then
proceeds with the conversion. Requires otoole on the system PATH and a
config_path argument pointing to otoole's config.yaml.

https://claude.ai/code/session_013zj3e8pUcko4g4LbtHvTHp
- Remove stale test/Manifest.toml (was generated for Julia 1.9.3,
  incompatible with the project's Julia 1.12 requirement)
- Add GLPK and HiGHS open-source solvers as dependencies to enable
  running the full open-source test suite

All 357 tests pass with Julia 1.12.5 (Cbc, GLPK, HiGHS solvers).

https://claude.ai/code/session_01F4MZzeRVPUCmwtzMbn4ZEf
- Add test/osemosys_converter_tests.jl with 109 tests covering:
  - Set conversion (7 dimension tables + STORAGE)
  - Time slicing (TSGROUP1/TSGROUP2/LTsGroup from SEASON/DAYTYPE)
  - Compatible parameter copying (37 tables with column renaming)
  - TradeRoute with flexible region column naming
  - AvailabilityFactor transformation (CapacityFactor + 3D expansion)
  - ReserveMargin transformation (fuel dimension addition)
  - RE target transformation (fuel dimension addition)
  - Defaults parameter passthrough
  - Edge cases: missing optional tables, case-insensitive names, empty tables
  - Error handling: invalid paths, CSV without config_path
- Include new tests from test/runtests.jl
- Fix UndefVarError bug in _convert_osemosys_timeslicing! where sort
  lambdas referenced loop variable `idx` before it was defined

https://claude.ai/code/session_01F4MZzeRVPUCmwtzMbn4ZEf
Critical correctness fixes:
- Add temporary unique indexes on NemoMod parameter tables so INSERT OR
  IGNORE actually deduplicates (tables only had auto-increment PK)
- Fix DaysInDayType multiplier for TSGROUP1/TSGROUP2 that summed across
  all years instead of averaging per-year values
- Remove hardcoded "ELC" fallback fuel; skip transformation when no
  fuels are defined instead of inserting a nonexistent fuel
- Remove fragile weekday/weekend heuristic for TSGROUP2 that assumed
  sort order semantics
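
The first fix above rests on a general SQLite behaviour: `INSERT OR IGNORE` only skips a row when it would violate a uniqueness constraint, so a table whose sole key is an auto-increment id accepts every duplicate. A minimal Python sketch of that behaviour follows; the table, columns, and index name are hypothetical and are not taken from the PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CapitalCost ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, r TEXT, t TEXT, y INTEGER, val REAL)"
)
insert = "INSERT OR IGNORE INTO CapitalCost (r, t, y, val) VALUES (?, ?, ?, ?)"
row = ("R1", "T1", 2030, 100.0)

# Without a unique index, the duplicate gets a fresh id and is kept.
conn.execute(insert, row)
conn.execute(insert, row)
count_without_index = conn.execute("SELECT COUNT(*) FROM CapitalCost").fetchone()[0]

# With a temporary unique index on the key columns, the duplicate is ignored.
conn.execute("DELETE FROM CapitalCost")
conn.execute("CREATE UNIQUE INDEX tmp_capitalcost_key ON CapitalCost (r, t, y)")
conn.execute(insert, row)
conn.execute(insert, row)
count_with_index = conn.execute("SELECT COUNT(*) FROM CapitalCost").fetchone()[0]

# Drop the index once loading is done so the schema is left as it was.
conn.execute("DROP INDEX tmp_capitalcost_key")
```

Here `count_without_index` is 2 and `count_with_index` is 1.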

Resource leak and error handling fixes:
- Always close source database (was only closed for temp CSV path)
- Clean up destination file on conversion error (ROLLBACK + delete)
- Guard against findfirst returning nothing in TradeRoute column
  detection

Bad practice fixes:
- Quote all SQL identifiers with _quote_id() to prevent injection and
  handle special characters in table/column names
- Resolve column names case-insensitively via _resolve_column() so
  otoole databases with lowercase columns work correctly
- Use prepared statements in compatible parameter copy loop
- Remove dead code branch in _read_osemosys_set
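
The identifier quoting and case-insensitive column lookup described above could look roughly like the following Python sketch. `quote_id` and `resolve_column` here are stand-ins written for illustration; they only mirror the names of the PR's `_quote_id()` and `_resolve_column()` helpers, whose actual Julia implementations are not shown in this conversation.

```python
import sqlite3


def quote_id(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def resolve_column(conn, table, wanted):
    """Return the column's actual spelling in `table`, matched
    case-insensitively, or None if the table has no such column."""
    info = conn.execute(f"PRAGMA table_info({quote_id(table)})")
    for row in info:
        actual = row[1]  # second field of table_info is the column name
        if actual.lower() == wanted.lower():
            return actual
    return None


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Capital Cost" (region TEXT, value REAL)')
conn.execute('INSERT INTO "Capital Cost" VALUES (?, ?)', ("R1", 100.0))

col = resolve_column(conn, "Capital Cost", "REGION")  # finds 'region'
sql = (
    f"SELECT {quote_id(col)} FROM {quote_id('Capital Cost')} "
    f"WHERE {quote_id(col)} = ?"  # values still go through placeholders
)
result = conn.execute(sql, ("R1",)).fetchall()
```

Identifiers cannot be bound as statement parameters, which is why they are quoted and interpolated while data values are passed separately.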

Robustness improvements:
- Validate required source tables exist (warn if missing)
- Check otoole is on PATH before shelling out
- Warn when overwriting existing destination file
- Log unrecognized source tables that were not converted

Test updates:
- Fix TSGROUP1 multiplier assertion (182.5/7 not 547.5/7)
- Tighten TSGROUP2 multiplier assertion to expected 7.0
- Tighten AvailabilityFactor to assert exactly 3 rows (not >= 3)
- Add test for TradeRoute with missing columns (graceful skip)
- Add test for lowercase column names in parameter tables
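
The corrected TSGROUP1 assertion (182.5/7 rather than 547.5/7) is consistent with summing a per-year value over three years instead of averaging it. The Python sketch below reproduces only that arithmetic. The three model years and their identical values are assumptions chosen to match the numbers in the commit message, and the divisor 7 is copied from the assertion without its meaning being stated there.

```python
from statistics import mean

# Hypothetical per-year values; three years at 182.5 each reproduce the
# 547.5 that the old assertion expected.
value_by_year = {2020: 182.5, 2021: 182.5, 2022: 182.5}

summed = sum(value_by_year.values())     # old behaviour: grows with year count
averaged = mean(value_by_year.values())  # fixed behaviour: per-year value

old_multiplier = summed / 7
new_multiplier = averaged / 7
```

Under these assumptions the summed figure is exactly three times the averaged one, so the old multiplier would have scaled with the number of years in the model.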

https://claude.ai/code/session_01No3LeaQ2Xxe9uLMCJLGYqD
Fix critical bugs and bad practices in OSeMOSYS converter
Hand-authored otoole-format CSV translation of test/storage_test.sqlite
under test/storage_test_otoole/, with a config.yaml and README documenting
the (small) schema differences. The README explains the three deliberate
adaptations: MODE_OF_OPERATION integerization (generate->1, store->2),
StorageFullLoadHours dropped (no OSeMOSYS equivalent), and the time-slicing
rewrite from NemoMod's TSGROUP1/TSGROUP2/LTsGroup hierarchy to the standard
SEASON/DAYTYPE/DAILYTIMEBRACKET + Conversion*/DaysInDayType/DaySplit form.

Also add a roundtrip @testset that loads the new CSV fixture into a
temporary OSeMOSYS-format SQLite database and runs convert_osemosys()
against it, asserting the resulting NemoMod database matches the original
storage_test.sqlite (modulo the documented adaptations). The CSVs are
loaded directly with Julia rather than via 'otoole convert csv sqlite' so
the roundtrip exercises the converter's data handling regardless of which
otoole version is installed (otoole 1.0+ removed the SQLite backend).
The CSV-input branch of `convert_osemosys` shelled out to
`otoole convert csv sqlite ...`, but no installable otoole release ever
exposed `sqlite` as a `convert` target — verified empirically against the
v0.8.6 and v0.11.0 git tags, and against the otoole stable docs which list
only ReadCsv/ReadDatafile/ReadExcel as read strategies. Every user trying
to convert a CSV directory hit a hard failure.

Replace the shellout with two private helpers in `src/osemosys_converter.jl`:

- `_parse_otoole_config` — parses config.yaml with YAML.jl into a
  normalized table -> schema map (set vs param, dtype, indices). Validates
  required keys and rejects unknown dtypes.
- `_load_csv_directory_to_sqlite!` — iterates over the parsed config,
  creates a SQLite table per entry with proper column types
  (TEXT/INTEGER/REAL derived from the otoole dtypes), and loads CSV rows
  with CSV.jl for full RFC 4180 parsing (quoted fields, embedded commas,
  escaping, BOMs). Strict validation: errors on orphaned CSV files,
  header mismatches (case-insensitive), and ragged rows. Cleans up the
  partial SQLite on error. Warns on empty/missing CSV files.
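
To make the loader's validation rules concrete, here is a minimal Python sketch of loading one CSV table into SQLite with typed columns, a case-insensitive header check, and a ragged-row error that reports the row number. It is not the PR's `_load_csv_directory_to_sqlite!`; the function name, the `(name, dtype)` column list, and the `str`/`int`/`float` to `TEXT`/`INTEGER`/`REAL` mapping are assumptions for the example.

```python
import csv
import io
import sqlite3

# Assumed mapping from config dtypes to SQLite column types.
SQL_TYPES = {"str": "TEXT", "int": "INTEGER", "float": "REAL"}


def load_csv_table(conn, table, columns, text):
    """Create `table` from `columns` (a list of (name, dtype) pairs) and
    load CSV `text` into it. Returns the number of data rows inserted.
    Assumes table and column names contain no double quotes."""
    for name, dtype in columns:
        if dtype not in SQL_TYPES:
            raise ValueError(f"{table}.{name}: unknown dtype {dtype!r}")
    names = [name for name, _ in columns]
    col_defs = ", ".join(f'"{n}" {SQL_TYPES[d]}' for n, d in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')

    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return 0  # empty file: the table exists but holds no rows
    if [h.lower() for h in header] != [n.lower() for n in names]:
        raise ValueError(f"{table}: header {header} does not match {names}")

    placeholders = ", ".join("?" for _ in names)
    inserted = 0
    for row_no, row in enumerate(reader, start=2):  # header is row 1
        if len(row) != len(names):
            raise ValueError(
                f"{table}: row {row_no} has {len(row)} fields, expected {len(names)}"
            )
        conn.execute(f'INSERT INTO "{table}" VALUES ({placeholders})', row)
        inserted += 1
    return inserted


conn = sqlite3.connect(":memory:")
columns = [("REGION", "str"), ("TECHNOLOGY", "str"), ("YEAR", "int"), ("VALUE", "float")]
text = 'region,technology,year,value\nR1,"Gas, combined cycle",2030,100.5\n'
inserted = load_csv_table(conn, "CapitalCost", columns, text)
loaded = conn.execute('SELECT * FROM "CapitalCost"').fetchall()
```

The quoted field with an embedded comma arrives as a single value, and SQLite's column affinity stores the year and value as an integer and a real number even though the CSV reader yields strings.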

Add CSV (=0.10.16) and YAML (=0.4.16) as dependencies of NemoMod.

Test changes:

- Repurpose the existing storage_test_otoole roundtrip testset to call
  the production code path directly, instead of using a parallel test-only
  loader. Any future regression in the loader is now caught by the same
  58-assertion fixture test.
- Delete the now-obsolete `_load_otoole_csv_fixture` helper.
- Add a new `CSV directory loader edge cases` testset (4 subtests, 10 @test
  calls) covering: quoted fields with embedded commas, ragged rows (errors
  with row number and cleans up the partial SQLite), empty and header-only
  files (warned but tables created from config), and subdirectories plus
  non-CSV files in the input directory (silently ignored).

Verified: 184/184 assertions pass in the OSeMOSYS converter testset
(was 175 before; +10 new edge-case tests, -1 obsolete src-file cleanup
assertion). The "otoole CSV is empty" warning fires correctly during
the empty-file edge case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous commit (5234815) hard-required CPLEX, Gurobi, Mosek,
MosekTools and Xpress as test deps and dropped HiGHS, so `Pkg.test`
aborted at precompile time on any machine without those proprietary
binaries -- before the per-solver `try using X / catch` skip guards
could run.

- test/Project.toml: drop the five commercial solvers, add HiGHS back.
  Devs with licenses can `Pkg.add` them into the test env to enable the
  matching testset.
- Project.toml: drop HiGHS from the library's runtime deps -- it isn't
  used in src/, only by test/highs_tests.jl and the JuMP direct-mode
  testset, so it belongs only in test deps.
- test/Manifest.toml: delete (libraries shouldn't pin transitive deps
  for every consumer) and add to .gitignore.
- Manifest.toml: regenerated by Pkg.resolve after the Project changes.
- test/runtests.jl: update the file-description docstring to reflect
  the new "free solvers default, commercial added manually" model.

Verified: Pkg.test() exits 0 locally on Julia 1.12.5 with no
commercial solvers installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Make test suite runnable without commercial solver licenses
Pkg.test() was intermittently aborting with `glp_free: memory allocation
error` / SIGABRT after all GLPK tests passed. The crash originated in the
"Writing optimization problem for a scenario" test (Cbc): NemoMod builds
constraints in @async tasks, and when one of those tasks triggered GC on a
non-main thread, Julia's multi-threaded GC ran a still-pending GLPK
Optimizer finalizer from the earlier GLPK testset, corrupting GLPK's
internal memory pool.

Flush pending solver finalizers on the main thread with two full-GC passes
after the GLPK tests and again at the end of the "Solving a scenario"
testset. Two passes are needed because finalizers scheduled by the first
collection don't run until the second.
Fix flaky GLPK finalizer crash during Cbc test
- Validate required OSeMOSYS dimension tables (REGION, TECHNOLOGY, FUEL,
  YEAR, TIMESLICE, MODE_OF_OPERATION) before creating the destination
  NemoMod database, and raise an ErrorException instead of logging a
  warning. A malformed source no longer leaves a half-baked NemoMod DB
  on disk. Resolves the two existing #TODOs in convert_osemosys.
- Remove the unreachable two-REGION-column branch in
  _copy_osemosys_traderoute!: SQLite rejects duplicate column names
  case-insensitively at CREATE TABLE, so only the REGION/rr and
  REGION/REGION2 conventions are reachable.
- Wrap _load_csv_directory_to_sqlite! in a single BEGIN/COMMIT
  transaction. SQLite's per-statement autocommit was making bulk CSV
  loads orders of magnitude slower than necessary; this matches the
  pattern used by convert_osemosys and createnemodb.
- Add a regression test asserting that a source DB missing TECHNOLOGY
  throws and leaves no destination file behind.
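
The validate-before-create ordering in the first bullet can be sketched as follows, in Python rather than the PR's Julia. The required table list is the one given above; the function names and the use of `ValueError` are assumptions for the example.

```python
import os
import sqlite3
import tempfile

REQUIRED = ("REGION", "TECHNOLOGY", "FUEL", "YEAR", "TIMESLICE", "MODE_OF_OPERATION")


def missing_tables(src):
    """Return the required tables absent from `src`, compared case-insensitively."""
    present = {
        row[0].upper()
        for row in src.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    return [t for t in REQUIRED if t not in present]


def convert(src, dest_path):
    """Validate the source first; touch the destination only if it passes."""
    missing = missing_tables(src)
    if missing:
        raise ValueError("source is missing required tables: " + ", ".join(missing))
    dest = sqlite3.connect(dest_path)  # reached only after validation succeeds
    dest.close()


# A source database that lacks TECHNOLOGY.
src = sqlite3.connect(":memory:")
for table in REQUIRED:
    if table != "TECHNOLOGY":
        src.execute(f'CREATE TABLE "{table}" (val TEXT)')

dest_path = os.path.join(tempfile.mkdtemp(), "nemo.sqlite")
try:
    convert(src, dest_path)
    raised = False
except ValueError:
    raised = True
dest_exists = os.path.exists(dest_path)
```

Because the check runs before the destination is opened, a failed conversion has nothing to clean up.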

https://claude.ai/code/session_01GXYYPX8SibbfoMYiU1TsqA
Improve OSeMOSYS converter validation and performance
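
The single-transaction bulk load mentioned in this commit follows a common SQLite pattern, sketched below in Python. The connection is opened with `isolation_level=None` so that the `sqlite3` module does not open transactions implicitly and the explicit `BEGIN`/`COMMIT` is the only transaction control. The table and the `bulk_load` helper are hypothetical.

```python
import sqlite3


def bulk_load(conn, rows):
    """Insert all rows inside one transaction; roll back if any row fails."""
    conn.execute("BEGIN")
    try:
        conn.executemany("INSERT INTO YearSplit (l, y, val) VALUES (?, ?, ?)", rows)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None: statements autocommit unless wrapped in BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE YearSplit (l TEXT, y INTEGER, val REAL NOT NULL)")

bulk_load(conn, [(f"L{i}", 2030, 0.001) for i in range(1000)])
loaded = conn.execute("SELECT COUNT(*) FROM YearSplit").fetchone()[0]

# A batch with one bad row is rolled back as a whole.
try:
    bulk_load(conn, [("X", 2030, 0.5), ("Y", 2030, None)])
except sqlite3.IntegrityError:
    pass
after_failure = conn.execute("SELECT COUNT(*) FROM YearSplit").fetchone()[0]
```

Besides avoiding a commit per statement, the single transaction means a failed load leaves the table as it was before the batch started.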
@IainDM IainDM marked this pull request as ready for review April 27, 2026 16:21
