Standardize chunking policy for bulk APIs and IN-clause lookups #132

@pesap

Problem

Chunking is currently handled in two different ways:

  1. User-facing chunksize params for write batching (for example in bulk insert APIs).
  2. Method-local internal constants such as CHUNK = 900, which keep the number of parameters bound in a single IN (...) query within a safe limit.

Both are valid, but behavior and naming are not standardized, which makes reviews and maintenance harder.

Why this matters

  • Inconsistent patterns across methods increase review friction.
  • Safe bind limits can be missed in new bulk paths.
  • It is not obvious whether a given knob is a user-facing tuning parameter or an internal safety limit.

Proposal

Define and document a single policy:

  • Keep chunksize as a per-method user-facing write batching parameter.
  • Keep bind-safety chunking internal (not user-facing).
  • Introduce a shared internal constant/helper for IN (...) chunking (instead of ad-hoc local constants).
  • Update affected methods to follow the same pattern.

Candidate code paths

  • src/plexosdb/db.py (get_memberships_system and other IN (...) lookups)
  • src/plexosdb/utils.py (insert_property_values membership-name lookup)
  • New/ongoing bulk APIs (for example add_attributes_from_records)

Acceptance criteria

  • Policy documented in developer docs/contributing notes
  • Shared internal helper or constant introduced
  • Existing bulk/query methods aligned
  • Tests cover large inputs across chunk boundaries
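
The last criterion could be checked roughly as follows. This is a sketch, not the project's test suite: iter_chunks is a stand-in for whatever shared helper is eventually introduced, and boundary_sizes and check_chunking are hypothetical names. It exercises input sizes on, just below, and just above multiples of the chunk size.

```python
import math

CHUNK = 900  # value quoted in this issue; the shared constant's final name is undecided


def iter_chunks(items, size):
    """Stand-in for the proposed shared helper (hypothetical)."""
    for start in range(0, len(items), size):
        yield items[start : start + size]


def boundary_sizes(limit):
    """Input sizes that sit on, just below, and just above chunk boundaries."""
    return [0, 1, limit - 1, limit, limit + 1, 2 * limit, 2 * limit + 1]


def check_chunking(n, limit=CHUNK):
    items = list(range(n))
    chunks = list(iter_chunks(items, limit))
    # No chunk is empty or larger than the bind limit.
    assert all(0 < len(c) <= limit for c in chunks)
    # The number of chunks (and therefore queries) is ceil(n / limit).
    assert len(chunks) == math.ceil(n / limit)
    # Nothing is lost, duplicated, or reordered.
    assert [x for c in chunks for x in c] == items


for n in boundary_sizes(CHUNK):
    check_chunking(n)
```

In a real suite the same sizes could be fed to the affected lookup methods against a temporary database, so that the test fails if a new bulk path bypasses the shared helper.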
