Use raw SQLite executemany for bulk database writes by cmutel · Pull Request #268 · brightway-lca/brightway2-data

cmutel · 2026-05-13T17:57:32Z

Summary

Extracted from #255 by Raphael Jolivet — isolating just the executemany bulk-insert optimization without the schema normalization, pickle→JSON, or drop_metadata changes from that PR.

Adds insert_many_activities() and insert_many_exchanges() to schema.py, each using cursor.executemany() on the raw SQLite connection to bypass per-row Peewee ORM overhead.
_efficient_write_dataset() no longer flushes batches of 125 mid-loop. That batch size existed because Peewee's insert_many builds one large INSERT ... VALUES (?, ?...), (?, ?...) statement, which can hit SQLite's 999-variable limit. executemany re-executes a single prepared statement once per row, so only one row's parameters are bound at a time and the limit no longer constrains batch size.
_efficient_write_many_data() collects all datasets in one pass, then calls the two new functions once.
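
To illustrate the difference between the two insertion strategies described above, here is a minimal sketch using only the standard-library sqlite3 module. The table, its columns, and the helper names insert_batched and insert_executemany are hypothetical stand-ins, not the actual brightway2-data schema or the functions added in this PR.

```python
import sqlite3

# Hypothetical stand-in table; the real brightway2-data schema differs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (code TEXT, db_name TEXT, name TEXT)")

rows = [(f"code-{i}", "example_db", f"Activity {i}") for i in range(1000)]


def insert_batched(conn, rows, batch_size=125):
    """Multi-row INSERT: one statement per batch, binding len(batch) * 3 variables."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        placeholders = ", ".join(["(?, ?, ?)"] * len(batch))
        sql = f"INSERT INTO activity (code, db_name, name) VALUES {placeholders}"
        # Flatten the batch because every "?" in the statement needs its own value.
        flat = [value for row in batch for value in row]
        conn.execute(sql, flat)


def insert_executemany(conn, rows):
    """One prepared statement re-executed per row, binding 3 variables each time."""
    cursor = conn.cursor()
    cursor.executemany(
        "INSERT INTO activity (code, db_name, name) VALUES (?, ?, ?)", rows
    )


insert_batched(conn, rows)
insert_executemany(conn, rows)
conn.commit()

total = conn.execute("SELECT COUNT(*) FROM activity").fetchone()[0]
print(total)
```

In the batched version the number of bound variables grows with the batch size, which is why a cap is needed. In the executemany version it stays fixed at the number of columns, so all rows can be passed in a single call.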

Co-authored-by: Raphael Jolivet <contact@raphael-jolivet.name>

Test plan

All 178 existing tests pass
Manual benchmark against ecoinvent import to verify speedup

Replace Peewee ORM insert_many() calls in _efficient_write_many_data with raw cursor.executemany(), bypassing per-row ORM overhead. The old approach batched in groups of 125 to stay under SQLite's 999-variable limit; executemany uses a single prepared statement executed repeatedly, so no such limit applies and the full dataset can be flushed in one call. New functions insert_many_activities() and insert_many_exchanges() in schema.py handle the raw insertion. _efficient_write_dataset() no longer returns lists or flushes mid-loop; _efficient_write_many_data() calls the new functions once after iterating all datasets.

Co-authored-by: Raphael Jolivet <contact@raphael-jolivet.name>

cmutel mentioned this pull request May 13, 2026

Improve import time and diskspace #255

Open

cmutel added 2 commits May 13, 2026 20:10

Prefix insert helpers with _ and comment on pickle protocol

f888a4a

Fix tuple location binding in _insert_many_activities

a53eddd

SQLite's raw executemany cannot bind Python tuples to TEXT columns, but locations like ("foo", "bar") are valid in brightway. Coerce location to str() before binding, matching Peewee's TextField behavior.
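
The binding failure and the str() coercion can be reproduced with the standard-library sqlite3 module alone. This is a sketch under assumptions: the two-column table is hypothetical, and leaving None untouched is an illustrative choice, not necessarily what _insert_many_activities does.

```python
import sqlite3

# Hypothetical two-column table, only for demonstrating parameter binding.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (code TEXT, location TEXT)")

rows = [("a", "GLO"), ("b", ("foo", "bar"))]
sql = "INSERT INTO activity (code, location) VALUES (?, ?)"

binding_failed = False
try:
    conn.executemany(sql, rows)
except sqlite3.Error as exc:
    # sqlite3 has no default adapter for tuples, so the second row is rejected.
    binding_failed = True
    print(f"binding failed: {exc}")
    conn.rollback()  # discard the first row, which may already be inserted

# Coerce non-string locations to str before binding.
# Assumption: None is passed through so it is stored as NULL.
coerced = [
    (code, location if location is None else str(location))
    for code, location in rows
]
conn.executemany(sql, coerced)
conn.commit()

stored = conn.execute(
    "SELECT location FROM activity WHERE code = 'b'"
).fetchone()[0]
print(stored)
```

The stored value is the tuple's string representation, so code reading the column back gets a string rather than a tuple.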

cmutel merged commit 106d627 into main May 13, 2026
9 checks passed

cmutel deleted the feature/executemany-writes branch May 13, 2026 21:10

cmutel added a commit that referenced this pull request May 13, 2026

Add #268 to 4.7 changelog

a2ca52d

cmutel mentioned this pull request May 13, 2026

Release 4.7 #267

Merged

2 tasks

cmutel added a commit that referenced this pull request May 14, 2026

Add #268 to 4.7 changelog

724b182

cmutel added a commit that referenced this pull request May 14, 2026

Release 4.7 (#267)

f699943

* Release 4.7 Add changelog entry for 4.7 release. * Add #268 to 4.7 changelog * Add #270 to 4.7 changelog

cmutel self-assigned this May 15, 2026

cmutel merged 3 commits into main from feature/executemany-writes