Om/offline improvement#65
Conversation
…sure loop
- buildChildSchemaDDL / buildParentSchemaDDL emitted bare column names
('$name $sqlType'), so a Frappe child like `User Document Type`
(istable=1, fields `create` / `delete` / `cancel` / `submit`) produced
`CREATE TABLE ... create INTEGER ...` which SQLite rejects with
`near "create": syntax error`. Quote the column ref (`"$name"`),
including __is_local / __norm derivatives and the column ref inside
`CREATE INDEX ... ON $tableName("$col")`.
- OfflineRepository.ensureSchemaForClosure iterated `metas.entries` in
a bare for-loop; any single throw escaped to frappe_sdk.dart:1195 and
stranded every doctype queued after the failure without offline
schema. Wrap each iteration body in try/catch — failed doctype is
logged and kept out of `_ensuredTables` / `_metaCache` so a later
retry can re-attempt.
Tests:
- test/database/schema/reserved_word_columns_test.dart — User Document
Type-shaped meta against both DDL builders; asserts the string contract
and that SQLite actually executes the DDL.
- test/services/offline_repository_closure_resilience_test.dart —
end-to-end: reserved-word child first in iteration order, asserts
Member + Member Document Item tables still get created.
- Existing schema tests updated to the now-correct quoted form.
…n push
Two independent push-path bugs that left dependent ("tier-2") docs failing
with incorrect server_names and same-series INSERTs dying on deadlock.
1. Flag-only local-Link detection. The push path detected local Link refs
only via the `<field>__is_local` companion, which is set solely by the
interactive picker. Links populated by fetch_from / defaults /
programmatic prefill / back-references never got the flag, so a
UUID-valued Link was (a) invisible to the dependency scan -> raced its
parent's INSERT, and (b) sent raw to the server -> rejected as an unknown
name. UuidRewriter and PushEngine._collectLocalLinkValues now also treat
a value as local when it `looksLikeMobileUuid` (server names are never
v4-UUID-shaped), matching the existing read-path guard. Non-UUID names
are untouched.
2. Deadlock marked terminal. Concurrent same-naming-series INSERTs race on
`tabSeries` -> MySQL 1213 (QueryDeadlockError, HTTP 500), which surfaced
as a bare ApiException, escaped the retry loop, and was markFailed(UNKNOWN)
with no retry. Added a retryable DeadlockError (-> NETWORK), detected in
the send wrapper via isDeadlockApiException and retried with jittered
backoff. PushEngine._dispatchUnits now serializes same-doctype INSERTs
within a tier (cross-doctype + non-INSERT work stays parallel) to prevent
the storm; retry is the correctness floor, serialization the convergence.
Tests: UuidRewriter UUID-without-flag resolve/block; end-to-end no-flag
ordering + rewrite; deadlock retry-then-succeed and exhaust->NETWORK;
same-doctype serialization witness (max in-flight == 1); isDeadlockApiException
detection. Full suite green.
…ixes
Addresses the real findings from docs/pr36_sr_review_4.md, verified against
code. False positives and the harmful B4 LIKE-escape proposal are excluded;
H6 (migration write-lock) is intentionally left atomic.
Sync / data integrity:
- B1 push_engine: reentrancy guard on runOnce() (_running + _rerunRequested)
so concurrent triggers don't double-dispatch the outbox.
- B2 write_queue: buffer per-task outcomes, resolve completers only after the
outer transaction commits — callers no longer see pre-durable success.
- B3 cursor: serialize a completed-but-empty cursor as {complete:true} so
zero-row doctypes graduate to incremental instead of re-pulling every boot.
- H2 response_writeback: capture the positional child-fallback rows-affected
and warn on a double-miss instead of silently dropping the server_name
writeback (row would otherwise re-push next sync).
- H3 attachment_pipeline: persist upload result before markDone and skip
re-upload on retry — no duplicate File records.
- H5 frappe_sdk: coalesce _initialMetaAndDataSync via _metaSyncInFlight so
login()/initialize() cannot run concurrent meta syncs.
- H8 offline_repository: child-cascade and parent delete now rethrow real
errors (only no-such-table/column swallowed) — no partial delete reported
as success.
UI:
- B5 form_screen/offline_repository: hardDeleteLocalMirror after online delete
so the local docs__ row isn't orphaned.
- B6 base_field: restore FieldStyle showLabel/showDescription defaults to true.
Hygiene:
- M1 connectivity_watcher onError handler; M6 print -> debugPrint;
M7 drop RunFn from the public barrel; M8 bound the meta dependency.
Docs (release-2.0): unifiedResolver -> resolver, createDocument ->
saveDocument, remove MigrationBlockedScreen, single-store rewrite.
Full suite: 1311 tests pass; dart analyze and dart format clean.
PR #36 Round-4 Review — ReplyEvery one of the 28 findings was verified against Disposition tally (28):
✅ Fixed (17) — committed in
|
| ID | File | What was done |
|---|---|---|
| B1 | push_engine.dart |
runOnce() now guarded by _running + _rerunRequested (body moved to _drainOnce()). A bare _running flag would drop rows enqueued mid-drain, so the rerun flag re-drains once after a concurrent caller arrives. Test: push_engine_test "concurrent runOnce". |
| B2 | write_queue.dart |
Per-task outcomes are buffered in-txn; completers resolve after db.transaction(...) commits. On outer-txn failure, queued callers are failed — no pre-durable success signal. Test: write_queue_test "caller does not observe success when outer txn fails". |
| B3 | cursor.dart |
toJson() is now if (isNull && !complete) return null, so a completed-but-empty cursor serializes {complete:true}; pull_engine.dart persists it and the zero-row doctype graduates to incremental. Test: cursor_test "completed empty cursor serializes". |
| B5 | form_screen.dart, offline_repository.dart |
Added hardDeleteLocalMirror (best-effort, child cascade) called after the successful online API delete in both the form_screen online branch and the repo's !offlineMode.enabled branch, so the local docs__ row isn't orphaned. Provenance note: no hardDeleteDocument was ever removed from form_screen, and repository.hardDeleteDocument does not exist — the new method is hardDeleteLocalMirror. Tests: offline_repository_delete_cascade_test. |
| B6 | base_field.dart |
FieldStyle (the class is FieldStyle, not BaseFieldStyle) showLabel/showDescription reverted to default true; the internal FrappeFormBuilder passes explicit false where it intends to hide them. Note: a null style already rendered labels via the != false guard, so the "silent breaking change for all consumers" framing only applied to an explicitly-constructed style. Test: field_style_test. |
High
| ID | File | What was done |
|---|---|---|
| H2 | response_writeback.dart:152-167 |
The positional child-row fallback now assigns its rows-affected to updated, and a 0 double-miss emits a debugPrint naming the orphaned child instead of silently dropping its server_name writeback. We initially read this as benign; that was wrong — the fallback exists because mobile_uuid isn't always echoed and Frappe renumbers idx 0→1 (hence 0-based pos matching, comment 128-135), so a local row can miss both keys and re-push next sync. Low severity, but a real silent gap. Test: response_writeback_test "double-miss". |
| H3 | attachment_pipeline.dart, pending_attachment_dao.dart |
Upload result (file_url/name) is persisted via recordUpload before markDone; on retry, if a result is already recorded the upload is skipped → no duplicate File doctype record. (The reviewer's "separate try/catch" alone was insufficient — markFailed leaves the row pending and the next drain re-uploads; idempotency is the actual fix.) Test: attachment_pipeline_test "failed done-state write does not re-upload". |
| H5 | frappe_sdk.dart |
_initialMetaAndDataSync() now coalesces on a _metaSyncInFlight completer inside the method itself (body → _runInitialMetaAndDataSync()), so login() / verifyLoginOtp() / initialize() can't run concurrent meta syncs regardless of entry point. |
| H8 | offline_repository.dart |
The child-cascade loop and the outer parent delete now rethrow real errors and swallow only no such table / no such column (_isBenignSchemaAbsence). A genuine failure rolls back the whole transaction — no partial delete reported as success. Tests: offline_repository_delete_cascade_test (2). |
Medium / hygiene
| ID | File | What was done |
|---|---|---|
| M1 | connectivity_watcher.dart |
onError handler added to the stream subscription (logs + swallows plugin stream errors, e.g. Android-13 permission revocation). Test: connectivity_watcher_test. |
| M6 | frappe_sdk.dart |
Bare print(...) → debugPrint(...) on the production paths. |
| M7 | frappe_mobile_sdk.dart |
RunFn dropped from the public barrel show clause. |
| M8 | pubspec.yaml |
meta: any → meta: ">=1.9.0 <3.0.0". |
Docs (release-2.0)
| ID | File | What was done |
|---|---|---|
| D1 | breaking-changes / whats-new / migrating-from-1.x | sdk.unifiedResolver → sdk.resolver (the real getter). Note: there were 5 occurrences, not 4 — migrating-from-1.x.md had one the review missed. |
| D2 | whats-new.md | offlineRepository.createDocument(...) → saveDocument(...). |
| D4 | OFFLINE_FIRST.md | Removed the MigrationBlockedScreen section — the class doesn't exist in lib/ and isn't exported. |
| D5 | OFFLINE_FIRST.md | Rewrote the obsolete two-store narrative to the single-store (docs__<doctype>) design; the legacy documents table was dropped in the v2→v3 migration (app_database.dart:151). |
⚠️ B4 — real fact, but do NOT apply the proposed fix
filter_parser.dart:188 passes the LIKE value unescaped — but the %/_ wildcards are caller-supplied, not user input. The one construction site wraps the query itself: link_option_service.dart:347 builds '%$query%'. The proposed "escape %/_ + ESCAPE '\'" would escape those wrapper % and break every Link-picker search. Frappe's server-side LIKE doesn't escape these either, so behavior already matches the server. No change. (If literal _/% typed inside a search term ever causes loose matches, we'd escape only the inner term while preserving the wrapper — speculative, and not what was proposed.)
⏸️ H6 — real, accepted by design (no code change)
meta_migration.dart — the __norm backfill is already row-chunked (chunkSize = 500), but the whole migration (ALTERs + backfill) is intentionally one atomic transaction, which is why the write lock is held for its duration. Two tests pin that rollback guarantee (meta_migration_test.dart "partial __norm updates do not persist" and "wrapped in a transaction"). The reviewer's literal "batch in chunks of 200-500" is already implemented and does not by itself release the lock; only per-chunk commits would, at the cost of all-or-nothing atomicity. Given typical doctype row counts on the boot/migration path, we keep it atomic. (Option to split — schema ALTERs atomic, idempotent __norm backfill in separate batched commits afterward — is documented in the action plan if row counts ever grow.)
❌ False positives — no change (with evidence)
| ID | Claim | Why it doesn't hold |
|---|---|---|
| H1 | _eq(null,null) returns false → null fields overwrite server |
three_way_merge.dart:71 runs if (identical(a,b)) return true; before the null-bail (72). identical(null,null) == true, so _eq(null,null) is already true. |
| H4 | syncFromApi() removed no-throw contract → callers crash on result==null |
permission_service.dart:45 propagates by documented design (the timeout/maxRetries:0 boot fast-fail). The sole internal caller frappe_sdk.dart try/catches it and ignores the return value — no result==null caller exists. (If external consumers relied on the v1 null-return, that's a public-API note worth a changelog line, but the described crash does not occur.) |
| H7 | resolveConflict deletes outbox before applying snapshot |
Wrong file and reversed: it lives in sync_controller.dart:205 (not offline_repository), and applies the snapshot (applySingleDoc, 242) before markDone (243). Already correct. |
| H9 | nullable columns hard-cast → crash on NULL | outbox_row.dart:114-137 uses bare as String/as int only on the NOT-NULL columns (id, doctype, mobile_uuid, operation, state, created_at); every nullable column uses as String?/as int?. No cast can throw on a schema-valid row. |
| M2 | _drain() in finally masks the original error |
concurrency_pool.dart:52 delivers the error via completeError in the catch, before the finally/_drain (55). The error is already on the completer; _drain can't mask it. |
| M3 | dataForDepends evaluates a stale snapshot |
form_builder.dart:1287-1289 evaluates visibility against the live formValues (state.value + _formData); dataForDepends is derived from that live data, not a stale snapshot. |
| M4 | _forceExited set after c.complete() |
offline_transition_service.dart:185 sets _forceExited = true before c.complete() (187). The finding describes the reverse. |
| M5 | stream emits before DB write returns | session_user_service.dart:35-40 await _db.update(...) completes before _controller.add(...). Emit is correctly after persistence. |
| D3 | cancelInitialSync() documented but not implemented |
It exists at sync_controller.dart:150. The doc is correct. |
deepak-dhwani
left a comment
There was a problem hiding this comment.
Minor Observation (non-blocking)
sync_controller.dart (pre-existing) and offline_repository.hardDeleteLocalMirror both use print() + // ignore: avoid_print in best-effort catch blocks.
The pattern is intentional (visibility in release builds) but worth a project-wide sweep — a thin debugPrint-based wrapper that strips in release is cleaner
than accumulating lint suppressors.
Approved. Every round-4 finding is resolved, correctly rejected, or confirmed as a false positive, all verified in code at 4a52249. Ready to merge.
No description provided.