OSEMOSYS to NEMO converter v0.1 #11
Open
IainDM wants to merge 22 commits into
Add convert_osemosys() function that converts an OSeMOSYS SQLite scenario database to NemoMod format. Handles dimension tables, time slice structure mapping (SEASON/DAYTYPE/DAILYTIMEBRACKET -> TSGROUP1/TSGROUP2/LTsGroup), compatible parameter copying, and schema transformations for parameters where NemoMod adds dimensions (AvailabilityFactor, ReserveMargin, REMinProductionTarget). https://claude.ai/code/session_013zj3e8pUcko4g4LbtHvTHp
The function now accepts either an OSeMOSYS SQLite database or a directory of otoole CSV files. When given a CSV directory, it runs `otoole convert csv sqlite` to create a temporary SQLite database, then proceeds with the conversion. Requires otoole on the system PATH and a config_path argument pointing to otoole's config.yaml. https://claude.ai/code/session_013zj3e8pUcko4g4LbtHvTHp
Claude/analyze codebase 4l r9j
- Remove stale test/Manifest.toml (was generated for Julia 1.9.3, incompatible with the project's Julia 1.12 requirement)
- Add GLPK and HiGHS open-source solvers as dependencies to enable running the full open-source test suite

All 357 tests pass with Julia 1.12.5 (Cbc, GLPK, HiGHS solvers).

https://claude.ai/code/session_01F4MZzeRVPUCmwtzMbn4ZEf
- Add test/osemosys_converter_tests.jl with 109 tests covering:
  - Set conversion (7 dimension tables + STORAGE)
  - Time slicing (TSGROUP1/TSGROUP2/LTsGroup from SEASON/DAYTYPE)
  - Compatible parameter copying (37 tables with column renaming)
  - TradeRoute with flexible region column naming
  - AvailabilityFactor transformation (CapacityFactor + 3D expansion)
  - ReserveMargin transformation (fuel dimension addition)
  - RE target transformation (fuel dimension addition)
  - Defaults parameter passthrough
  - Edge cases: missing optional tables, case-insensitive names, empty tables
  - Error handling: invalid paths, CSV without config_path
- Include new tests from test/runtests.jl
- Fix UndefVarError bug in _convert_osemosys_timeslicing! where sort lambdas referenced loop variable `idx` before it was defined

https://claude.ai/code/session_01F4MZzeRVPUCmwtzMbn4ZEf
Claude/julia debug setup ygq7o
Critical correctness fixes:
- Add temporary unique indexes on NemoMod parameter tables so INSERT OR IGNORE actually deduplicates (tables only had auto-increment PK)
- Fix DaysInDayType multiplier for TSGROUP1/TSGROUP2 that summed across all years instead of averaging per-year values
- Remove hardcoded "ELC" fallback fuel; skip transformation when no fuels are defined instead of inserting a nonexistent fuel
- Remove fragile weekday/weekend heuristic for TSGROUP2 that assumed sort order semantics

Resource leak and error handling fixes:
- Always close source database (was only closed for temp CSV path)
- Clean up destination file on conversion error (ROLLBACK + delete)
- Guard against findfirst returning nothing in TradeRoute column detection

Bad practice fixes:
- Quote all SQL identifiers with _quote_id() to prevent injection and handle special characters in table/column names
- Resolve column names case-insensitively via _resolve_column() so otoole databases with lowercase columns work correctly
- Use prepared statements in compatible parameter copy loop
- Remove dead code branch in _read_osemosys_set

Robustness improvements:
- Validate required source tables exist (warn if missing)
- Check otoole is on PATH before shelling out
- Warn when overwriting existing destination file
- Log unrecognized source tables that were not converted

Test updates:
- Fix TSGROUP1 multiplier assertion (182.5/7 not 547.5/7)
- Tighten TSGROUP2 multiplier assertion to expected 7.0
- Tighten AvailabilityFactor to assert exactly 3 rows (not >= 3)
- Add test for TradeRoute with missing columns (graceful skip)
- Add test for lowercase column names in parameter tables

https://claude.ai/code/session_01No3LeaQ2Xxe9uLMCJLGYqD
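The change in the TSGROUP1 assertion (182.5/7 rather than 547.5/7) is consistent with a fixture that has three model years. The sketch below illustrates the difference between summing and averaging, assuming hypothetical per-year totals of 182.5; the years and values are made up for illustration, and the converter's actual query is not shown in this PR text.

```julia
using Statistics: mean

# Hypothetical per-year day totals for one season (assumed values, three model years).
days_by_year = Dict(2020 => 182.5, 2021 => 182.5, 2022 => 182.5)

# Buggy form: summing over every year inflates the total by the number of years.
summed = sum(values(days_by_year))

# Fixed form: average the per-year values first.
averaged = mean(values(days_by_year))

# Divide by 7, as the updated test assertion does.
multiplier = averaged / 7
```

With three identical years the summed figure is three times the averaged one, which matches the 547.5 versus 182.5 discrepancy named in the test update.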
Fix critical bugs and bad practices in OSeMOSYS converter
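For reviewers unfamiliar with the identifier-handling fixes, here is a minimal sketch of what helpers like `_quote_id()` and `_resolve_column()` might do. The function names `quote_id` and `resolve_column`, their signatures, and the sample column list are hypothetical stand-ins, not the code in this PR.

```julia
# Hypothetical stand-in for _quote_id: wrap an identifier in double quotes,
# doubling any embedded double quote (standard SQL identifier quoting).
quote_id(name::AbstractString) = "\"" * replace(name, "\"" => "\"\"") * "\""

# Hypothetical stand-in for _resolve_column: return the actual column name that
# matches `wanted` ignoring case, or `nothing` when no column matches.
function resolve_column(columns::Vector{String}, wanted::AbstractString)
    idx = findfirst(c -> lowercase(c) == lowercase(wanted), columns)
    return idx === nothing ? nothing : columns[idx]
end

# Lowercase column names, as an otoole-produced database may contain.
cols = ["region", "technology", "year", "value"]
col = resolve_column(cols, "TECHNOLOGY")
tbl = "CapacityFactor"

# Build a query from quoted identifiers only.
sql = "SELECT " * quote_id(col) * " FROM " * quote_id(tbl)
```

Checking the `findfirst` result against `nothing` before indexing mirrors the guard added for TradeRoute column detection.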
Hand-authored otoole-format CSV translation of test/storage_test.sqlite under test/storage_test_otoole/, with a config.yaml and README documenting the (small) schema differences. The README explains the three deliberate adaptations: MODE_OF_OPERATION integerization (generate->1, store->2), StorageFullLoadHours dropped (no OSeMOSYS equivalent), and the time-slicing rewrite from NemoMod's TSGROUP1/TSGROUP2/LTsGroup hierarchy to the standard SEASON/DAYTYPE/DAILYTIMEBRACKET + Conversion*/DaysInDayType/DaySplit form.

Also add a roundtrip @testset that loads the new CSV fixture into a temporary OSeMOSYS-format SQLite database and runs convert_osemosys() against it, asserting the resulting NemoMod database matches the original storage_test.sqlite (modulo the documented adaptations). The CSVs are loaded directly with Julia rather than via 'otoole convert csv sqlite' so the roundtrip exercises the converter's data handling regardless of which otoole version is installed (otoole 1.0+ removed the SQLite backend).
The CSV-input branch of `convert_osemosys` shelled out to `otoole convert csv sqlite ...`, but no installable otoole release ever exposed `sqlite` as a `convert` target; this was verified empirically against the v0.8.6 and v0.11.0 git tags, and against the otoole stable docs, which list only ReadCsv/ReadDatafile/ReadExcel as read strategies. Every user trying to convert a CSV directory hit a hard failure.

Replace the shellout with two private helpers in `src/osemosys_converter.jl`:
- `_parse_otoole_config`: parses config.yaml with YAML.jl into a normalized table -> schema map (set vs param, dtype, indices). Validates required keys and rejects unknown dtypes.
- `_load_csv_directory_to_sqlite!`: iterates over the parsed config, creates a SQLite table per entry with proper column types (TEXT/INTEGER/REAL derived from the otoole dtypes), and loads CSV rows with CSV.jl for full RFC 4180 parsing (quoted fields, embedded commas, escaping, BOMs). Strict validation: errors on orphaned CSV files, header mismatches (case-insensitive), and ragged rows. Cleans up the partial SQLite on error. Warns on empty/missing CSV files.

Add CSV (=0.10.16) and YAML (=0.4.16) as dependencies of NemoMod.

Test changes:
- Repurpose the existing storage_test_otoole roundtrip testset to call the production code path directly, instead of using a parallel test-only loader. Any future regression in the loader is now caught by the same 58-assertion fixture test.
- Delete the now-obsolete `_load_otoole_csv_fixture` helper.
- Add a new `CSV directory loader edge cases` testset (4 subtests, 10 @test calls) covering: quoted fields with embedded commas, ragged rows (errors with row number and cleans up the partial SQLite), empty and header-only files (warned but tables created from config), and subdirectories plus non-CSV files in the input directory (silently ignored).
Verified: 184/184 assertions pass in the OSeMOSYS converter testset (was 175 before; +10 new edge-case tests, -1 obsolete src-file cleanup assertion). The "otoole CSV is empty" warning fires correctly during the empty-file edge case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
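The strict validation described for the CSV loader can be sketched in plain Julia as follows. This is an illustration only: the helper names are hypothetical, the dtype strings "str", "int" and "float" are assumed to be the ones otoole's config.yaml uses, and the header check here is order-sensitive, which may differ from the real loader.

```julia
# Assumed otoole dtype names mapped to SQLite column types.
sqlite_types = Dict("str" => "TEXT", "int" => "INTEGER", "float" => "REAL")

# Reject dtypes that are not in the mapping.
function sqlite_type(dtype::AbstractString)
    haskey(sqlite_types, dtype) || error("Unknown otoole dtype: $dtype")
    return sqlite_types[dtype]
end

# Compare a CSV header with the expected columns, ignoring case.
function check_header(header::Vector{String}, expected::Vector{String})
    lowercase.(header) == lowercase.(expected) ||
        error("Header mismatch: got $header, expected $expected")
    return nothing
end

# Reject ragged rows, reporting the 1-based data row number.
function check_rows(rows::Vector{Vector{String}}, ncols::Int)
    for (i, row) in enumerate(rows)
        length(row) == ncols ||
            error("Row $i has $(length(row)) fields, expected $ncols")
    end
    return nothing
end

# A lowercase header is accepted against the uppercase names from the config.
check_header(["region", "fuel", "year", "value"], ["REGION", "FUEL", "YEAR", "VALUE"])
```

Failing fast on the first bad row keeps the error message specific, and in the real loader the partial SQLite file is removed before the error propagates.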
Claude/osemosys storage dataset 0 rw gb
The previous commit (5234815) hard-required CPLEX, Gurobi, Mosek, MosekTools and Xpress as test deps and dropped HiGHS, so `Pkg.test` aborted at precompile time on any machine without those proprietary binaries -- before the per-solver `try using X / catch` skip guards could run.

- test/Project.toml: drop the five commercial solvers, add HiGHS back. Devs with licenses can `Pkg.add` them into the test env to enable the matching testset.
- Project.toml: drop HiGHS from the library's runtime deps -- it isn't used in src/, only by test/highs_tests.jl and the JuMP direct-mode testset, so it belongs only in test deps.
- test/Manifest.toml: delete (libraries shouldn't pin transitive deps for every consumer) and add to .gitignore.
- Manifest.toml: regenerated by Pkg.resolve after the Project changes.
- test/runtests.jl: update the file-description docstring to reflect the new "free solvers default, commercial added manually" model.

Verified: Pkg.test() exits 0 locally on Julia 1.12.5 with no commercial solvers installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Make test suite runnable without commercial solver licenses
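The per-solver skip guard mentioned in the commit message can be sketched as a small helper. The function name `try_load` and the package name `SomeCommercialSolver` are hypothetical; the test suite itself uses inline `try using X / catch` blocks, so treat this as an equivalent illustration rather than the code in test/runtests.jl.

```julia
# Try to load a package by name; return true on success and false if it is
# not installed or fails to initialize.
function try_load(pkg::Symbol)
    try
        # Equivalent to running `using <pkg>` at the top level of Main.
        Core.eval(Main, Expr(:using, Expr(:., pkg)))
        return true
    catch err
        @info "Skipping $pkg tests: $(sprint(showerror, err))"
        return false
    end
end

# Hypothetical package name, used only to show the skip path.
if try_load(:SomeCommercialSolver)
    # include("some_commercial_solver_tests.jl")   # hypothetical test file
end
```

A guard like this only helps if the package is absent from test/Project.toml; once a solver is a declared test dependency, resolution fails before any guard runs, which is the failure this commit removes.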
Pkg.test() was intermittently aborting with `glp_free: memory allocation error` / SIGABRT after all GLPK tests passed. The crash originated in the "Writing optimization problem for a scenario" test (Cbc): NemoMod builds constraints in @async tasks, and when one of those tasks triggered GC on a non-main thread, Julia's multi-threaded GC ran a still-pending GLPK Optimizer finalizer from the earlier GLPK testset, corrupting GLPK's internal memory pool.

Flush pending solver finalizers on the main thread with two full-GC passes after the GLPK tests and again at the end of the "Solving a scenario" testset. Two passes are needed because finalizers scheduled by the first collection don't run until the second.
Fix flaky GLPK finalizer crash during Cbc test
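A minimal sketch of the two-pass flush described above, assuming a hypothetical helper name `flush_finalizers` and a stand-in `FakeOptimizer` type in place of a real GLPK optimizer. Whether and when a given finalizer runs is up to the garbage collector, so the sketch makes no claim about the counter's final value.

```julia
# Run two full collections so finalizers queued by the first pass get a chance
# to run before later tests start (following the commit message's reasoning).
function flush_finalizers()
    GC.gc(true)
    GC.gc(true)
    return nothing
end

# Stand-in for a solver object that owns native memory (hypothetical type).
mutable struct FakeOptimizer
    id::Int
end

# Counter incremented by the finalizer; used only for illustration.
finalized = Ref(0)

let
    opt = FakeOptimizer(1)
    finalizer(_ -> (finalized[] += 1), opt)
end

# Called from top-level test code, i.e. from the main task.
flush_finalizers()
```

Calling the helper at the end of a solver's testset keeps native cleanup away from the `@async` tasks that run in later testsets.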
- Validate required OSeMOSYS dimension tables (REGION, TECHNOLOGY, FUEL, YEAR, TIMESLICE, MODE_OF_OPERATION) before creating the destination NemoMod database, and raise an ErrorException instead of logging a warning. A malformed source no longer leaves a half-baked NemoMod DB on disk. Resolves the two existing #TODOs in convert_osemosys.
- Remove the unreachable two-REGION-column branch in _copy_osemosys_traderoute!: SQLite rejects duplicate column names case-insensitively at CREATE TABLE, so only the REGION/rr and REGION/REGION2 conventions are reachable.
- Wrap _load_csv_directory_to_sqlite! in a single BEGIN/COMMIT transaction. SQLite's per-statement autocommit was making bulk CSV loads orders of magnitude slower than necessary; this matches the pattern used by convert_osemosys and createnemodb.
- Add a regression test asserting that a source DB missing TECHNOLOGY throws and leaves no destination file behind.

https://claude.ai/code/session_01GXYYPX8SibbfoMYiU1TsqA
Improve OSeMOSYS converter validation and performance
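The up-front validation can be sketched as below. The helper name `check_required_tables` is hypothetical, and the case-insensitive match on table names is an assumption; the list of required tables is taken from the commit message.

```julia
# Dimension tables named in the commit message as required.
required_tables = ["REGION", "TECHNOLOGY", "FUEL", "YEAR", "TIMESLICE", "MODE_OF_OPERATION"]

# Throw an ErrorException if any required table is absent from the source.
# Intended to run before the destination database file is created.
function check_required_tables(present::Vector{String})
    have = Set(uppercase.(present))
    missing_tables = [t for t in required_tables if !(t in have)]
    isempty(missing_tables) ||
        error("Source database is missing required tables: " * join(missing_tables, ", "))
    return nothing
end

# A complete source passes, regardless of the case of its table names.
check_required_tables(["region", "technology", "fuel", "year", "timeslice", "mode_of_operation"])
```

Running the check before any destination file exists means a failure needs no cleanup, which is what the new regression test for a missing TECHNOLOGY table asserts.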
@jwveysey first draft of a basic OSEMOSYS to NEMO converter. It takes SQLite or otoole OSEMOSYS files and converts them to NEMO. Tests of basic functionality are included, including a round-trip test for a standard NEMO dataset.
Note that this includes additional adjustments to the deps, which allowed it to run on a machine without Gurobi/GLPK solvers. Tests of those solvers should still run if the solvers are installed.