feat(offline-first): SDK 2.0 — pull, push, write-path, and UX surface #36
Conversation
Complete design for offline-first data layer covering per-doctype columnar schema, cursor-based pull engine, topologically-ordered push with mobile_uuid idempotency and UUID->server-name rewrite, unified read resolver with DB-is-source-of-truth semantics, Frappe-parity filter parser, attachment sub-pipeline, pull/push coordination gate, atomic logout wipe, and phased implementation plan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop unused outbox.idempotency_key (mobile_uuid dedup sufficient) - Drop unused outbox.depends_on (tiers computed in-memory per run) - Add v1.1.0 -> v2 data migration algorithm (§12.1), covers docstatus transitions, deleted-row replay, orphaned-doctype handling, dataJson corruption fallback, preserves mobile_uuids - Add sync_status <-> outbox.state correspondence table (§5.2) - Define public types: QueryResult, Cursor, ConflictAction, OutboxRow, DeleteCascadePlan, SyncError, RowOrigin (§9.4) - Tighten retry-all priority matrix to key on (state, error_code) - Rename PERMISSION -> PERMISSION_DENIED, add TIMEOUT + UNKNOWN to error_code enum - Cascade-delete ordering: dependents get earlier created_at so tiers dispatch them before the parent's DELETE is retried - Document idempotency and dep-ordering choices in decisions log Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Remove hard dependency on a proprietary mobile_control Frappe app.
- SDK now targets stock Frappe v15+ using standard endpoints
(login, /api/resource/*, /api/method/frappe.client.*, upload_file,
GET DocType/<x> for meta-change detection).
- Idempotency rewritten as three mitigation layers (§5.7):
L1 autoname=field:mobile_uuid -> natural idempotency
L2 consumer before_insert hook -> zero-retry duplication
L3 client pre-retry GET check -> works against stock Frappe
SDK auto-selects per DocType; emits InitWarning if none apply.
- §10 rewritten: required stock-Frappe endpoints vs. optional
consumer server-app capabilities (advertised via
X-Mobile-Essentials-Version header).
- SDKConfig: drop mobileControlMinVersion; add
IdempotencyStrategy override.
- P4 throughput gates revised (§11.4) to reflect SQLite realities:
500 queued saves drained via max_inflight_network (2/4/8 tier) +
per-doctype serialized writes + 20-50 batch size,
0 duplicates + 0 lock errors + peak mem delta < 50 MB, plus a
mid-drain kill-relaunch resilience check. Old "100 concurrent"
framing was unrealistic for low-end Android.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Implement tests for response writeback handling in `response_writeback_test.dart` - Add tests for sync state management in `sync_state_test.dart` - Create tests for three-way merge logic in `three_way_merge_test.dart` - Introduce tier computation tests in `tier_computer_test.dart` - Add UUID rewriting tests in `uuid_rewriter_test.dart` - Implement UI tests for delete cascade prompts in `delete_cascade_prompt_test.dart` - Add tests for document list filter chips in `document_list_filter_chip_test.dart` - Create tests for force logout confirmation dialog in `force_logout_confirm_test.dart` - Implement tests for logout guard dialog in `logout_guard_dialog_test.dart` - Add tests for sync errors screen in `sync_errors_screen_test.dart` - Create tests for sync progress screen in `sync_progress_screen_test.dart` - Implement tests for sync status bar in `sync_status_bar_test.dart`
…folding - LocalWriter mirrors saves into per-doctype docs__<doctype> tables; OfflineRepository delegates parent + child writes through it - SyncService.pullSync gains DoctypePullPhase, look-ahead pagination, tie-skip on resume, and bounded-concurrency pullSyncMany - DoctypeService.bulkGetWithChildren replaces N+1 per-name GETs with a single mobile_sync.get_docs_with_children call; falls back per-name on 404 (older mobile_control), bounded at 20 concurrent GETs - UnifiedResolver constructed and wired into LinkOptionService; closure builder fans out meta fetch level-by-level in parallel - pull_apply: server-advanced check before flagging conflict, so normal offline edits don't surface as conflicts on the next pull - system_tables: IF NOT EXISTS / OR IGNORE so onUpgrade re-entry after partial failure no longer raises duplicate-table errors - FrappeSDK: single connectivity probe at boot; offline launch skips bootstrap to avoid the ~4-minute HTTP timeout cascade - FormScreen: connectivity gate before save; treat serverId==null as INSERT (offline-created docs going online for the first time) - Tests: LocalWriter, OfflineRepository LocalWriter wiring, per-doctype repository, closure builder, link-option offline path, pull-apply server-advanced behaviour
Tripped flutter analyze --no-fatal-infos in CI. LocalWriter writes through raw sqflite transactions and never touched SchemaApplier.
4bf7f01 to
92977d4
Compare
Frappe v16's File controller (frappe/core/doctype/file/file.py:151)
rejects File insert with attached_to_doctype set but attached_to_name
empty/NULL. The pre-existing 'new-<doctype>' sentinel docname produced
orphaned File rows that attach_files_to_document could never relink
— regardless of v16's __temporary_name 60-min window, which is too
short for offline retries anyway.
SDK now uploads each pending attachment fully unattached (no dt, no
dn). Once the parent INSERT/UPDATE commits:
- Parent docs: Frappe's stock attach_files_to_document (registered
on *.on_update) walks the doc's Attach/Attach Image fields and
relinks unattached File rows by file_url.
- Child rows: in v16 child rows save via raw db_update() and skip
lifecycle hooks (frappe/model/document.py:613-648), so stock never
reaches them. The mobile_control hook 'relink_mobile_files' fills
that gap.
Schema delta on pending_attachments: adds top_parent_uuid +
top_parent_doctype columns (DAO-level non-null, nullable in schema
because SQLite ALTER ADD COLUMN can't add NOT NULL without DEFAULT)
and ix_attach_top_parent index, so the push pipeline can discover
all attachments for one outbox row (parent + children) in a single
query. Migration consolidated into a single v3 -> v4 step combining
the SIG-12 is_parent_with_children column work and the attachment
columns.
API:
- PendingAttachmentDao.enqueue() now requires topParentUuid +
topParentDoctype (set to parent's mobile_uuid for top-level
attaches, or to the parent's mobile_uuid even when the field
lives on a child row).
- findPendingForParent -> findPendingForTopParent.
- AttachmentPipeline.uploadPendingFor -> uploadPendingForTopParent.
- Upload call drops the doctype + docname args.
Tests: 495/495 pass. New tests cover the migration (forward +
idempotent re-run), DAO discovery across parent + children, and the
upload shape (asserts no doctype, no docname).
Live verified on Frappe v16.13: child-row relink smoke against
Website Slideshow -> Website Slideshow Item.image PASS.
Docs: doc/OFFLINE_FIRST.md gains an Attachments section; spec
2026-04-24-offline-first-sdk-design.md §5.3 + §4 schema updated;
implementation plan added under docs/superpowers/plans.
Frappe v16's File controller (frappe/core/doctype/file/file.py:151)
rejects File insert with attached_to_doctype set but attached_to_name
empty/NULL. The pre-existing 'new-<doctype>' sentinel docname produced
orphaned File rows that attach_files_to_document could never relink
— regardless of v16's __temporary_name 60-min window, which is too
short for offline retries anyway.
SDK now uploads each pending attachment fully unattached (no dt, no
dn). Once the parent INSERT/UPDATE commits:
- Parent docs: Frappe's stock attach_files_to_document (registered
on *.on_update) walks the doc's Attach/Attach Image fields and
relinks unattached File rows by file_url.
- Child rows: in v16 child rows save via raw db_update() and skip
lifecycle hooks (frappe/model/document.py:613-648), so stock never
reaches them. The mobile_control hook 'relink_mobile_files' fills
that gap.
Schema delta on pending_attachments: adds top_parent_uuid +
top_parent_doctype columns (DAO-level non-null, nullable in schema
because SQLite ALTER ADD COLUMN can't add NOT NULL without DEFAULT)
and ix_attach_top_parent index, so the push pipeline can discover
all attachments for one outbox row (parent + children) in a single
query. Migration consolidated into a single v3 -> v4 step combining
the SIG-12 is_parent_with_children column work and the attachment
columns.
API:
- PendingAttachmentDao.enqueue() now requires topParentUuid +
topParentDoctype (set to parent's mobile_uuid for top-level
attaches, or to the parent's mobile_uuid even when the field
lives on a child row).
- findPendingForParent -> findPendingForTopParent.
- AttachmentPipeline.uploadPendingFor -> uploadPendingForTopParent.
- Upload call drops the doctype + docname args.
Tests: 495/495 pass. New tests cover the migration (forward +
idempotent re-run), DAO discovery across parent + children, and the
upload shape (asserts no doctype, no docname).
Live verified on Frappe v16.13: child-row relink smoke against
Website Slideshow -> Website Slideshow Item.image PASS.
Docs: doc/OFFLINE_FIRST.md gains an Attachments section; spec
2026-04-24-offline-first-sdk-design.md §5.3 + §4 schema updated;
implementation plan added under docs/superpowers/plans.
…-1/9/12 fixes
Concurrency / coordination
- Replace SyncService's boolean _isSyncing flag with a SyncMutex
(single-occupant, try-only). Concurrent push/pull callers get
'Sync already in progress' instead of racing each other.
- SIG-2: PullEngine.run() returns the set of doctypes it deferred
because a push was active. SyncController re-runs a targeted pull
(RunPullForFn) over that subset after push completes — closing the
'pull skipped while pushing → stale local state' gap.
- SIG-1: per-task isolation via SQLite savepoints in the writeback
path so a single failing row doesn't roll back the whole batch.
- Add FetchSingleDocFn / ApplySingleDocFn typedefs so a single
server snapshot can flow through the same PullApply pipeline used
by full pulls (used by SyncController).
Schema / migrations
- SIG-12: persist is_parent_with_children on doctype_meta so
OfflineRepository.doctypesWithChildren survives cold start.
- SIG-9: tolerate conflicting cursor formats from older SyncService
cursor persistence.
- Idempotency tests for v2 -> v3 and v3 -> v4 migrations.
Tests
- sync_mutex_test, frappe_sdk_dispose_test (resource cleanup),
offline_repository_child_registry_test (SIG-12 cold-start path).
- Existing sync/pull/push/cursor/payload tests brought in line with
the new SyncController type signatures (RunPullFn now returns the
deferred set; new Fetch/Apply single-doc fns).
Tooling
- analysis_options.yaml: drop the '**/*.g.dart' exclude block (no
codegen remains in the SDK).
- frappe_sdk.dart: replace lambda-bound MetaService.getMeta wrappers
with tear-offs; collapse double-underscore unused params.
…offline_improvement
Adds the persistence layer for the offline-mode toggle: an OfflineMode value object, a single-row SdkMetaDao on sdk_meta, and a _persistOfflineFlagFromLogin helper wired into login, verifyLoginOtp, and _fetchUserInfoAndApply. Schema bumps v4 -> v5 with two new columns on sdk_meta (offline_enabled, offline_enabled_set_at). Zero behavior change in this phase — services do not yet branch on the flag. P2 wires the flag into UnifiedResolver / OfflineRepository / SyncService / LinkOptionService. See spec: docs/superpowers/specs/2026-05-01-offline-mode-toggle-design.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the persisted OfflineMode (added in P1) into the SDK's read/write paths. When offline_enabled = false at boot: - UnifiedResolver.resolve() short-circuits to a REST passthrough via client.doctype.list (extended to accept or_filters). - OfflineRepository.create/update/delete route to FrappeClient instead of writing the outbox; getDirtyDocuments returns empty. - SyncService.pushSync/pullSync/pullSyncMany/syncDoctype/getSyncStats return SyncResult.empty() — sync engine is fully no-op. - _initialMetaAndDataSync skips the closure pull (meta + permissions + translations still run; doc tables stay absent). FrappeSDK.initialize() reads sdk_meta.offline_enabled and resolves the session-bound mode through _resolveBootMode, which carries a conservative residue guard: if persisted=false but local docs__* / outbox / pending_attachments exist, the session stays offline anyway and prints a warning. P3 replaces this guard with a real drain/wipe transition. forTesting accepts an offlineMode parameter (default: offline) so existing tests keep their behaviour. See spec: docs/superpowers/specs/2026-05-01-offline-mode-toggle-design.md
Replaces P2's conservative residue guard with the real offline → online transition. When initialize() observes persisted offline_enabled=false AND on-disk residue exists (docs__* tables, outbox, or pending attachments), it invokes OfflineTransitionService.runDrainAndWipe() before _initialMetaAndDataSync. OfflineTransitionService: - Sealed state hierarchy: Idle / Draining / DrainFailed / WipingTables / Completed, emitted on a broadcast stream. - runDrainAndWipe() loops drain → check residue → either wipe + complete or park in DrainFailed and await retry()/forceExit(). - forceExit() drops tables unconditionally (A1 fallback) and exits the loop. - Drain runs against a transient SyncService constructed with OfflineMode(enabled: true) so its public methods actually push regardless of the session-bound mode. OfflineTransitionScreen widget exposed for consumer apps to mount above their router. PopScope(canPop: false) blocks the OS back button while the transition is active. AppDatabase.wipeOfflineDocumentTables() drops every docs__<doctype> and clears outbox / pending_attachments / link_options. Preserves doctype_meta, auth_tokens, doctype_permission, sdk_meta. FrappeSDK.dispose() now also disposes the OfflineTransitionService's broadcast stream controller. See spec: docs/superpowers/specs/2026-05-01-offline-mode-toggle-design.md
P3 awaited runDrainAndWipe inside initialize() before runApp mounted any UI. If drain hit DrainFailed, the user_action completer would never be completed (no UI to call retry/forceExit) and init blocked indefinitely on a frozen splash. Force-killing the app retried the same loop. Fix: kick off the transition with `unawaited(...)` from initialize(). Init returns; runApp mounts the widget tree; OfflineTransitionGuard overlays OfflineTransitionScreen for as long as the service is in any non-idle state. The user can now interact with retry/forceExit. Adds: - OfflineTransitionGuard widget — wraps a child and shows the screen whenever the SDK's offlineTransition stream emits Draining / WipingTables / DrainFailed. - runOfflineTransitionIfPending() — public method consumers can call from their own boot widget if they prefer foreground orchestration (Spec §10.4(a) path) instead of the unawaited init kick-off. - Top-level exports for OfflineMode, OfflineTransitionService and state classes, OfflineTransitionScreen, OfflineTransitionGuard. This addresses the design's §10.4(b) trade-off without forcing consumers into §10.4(a) — the guard pattern keeps the simple "block init" UX for the success path but unblocks the failure path.
Public-facing documentation for the server-driven offline_enabled toggle introduced in this release: - doc/OFFLINE_MODE_TOGGLE.md — feature overview, server-side setup, consumer integration via OfflineTransitionGuard, transition flow (offline → online drain/wipe and online → offline bootstrap), what stays local in each mode, schema migration notes, testing seam, known limitations. - doc/OFFLINE_FIRST.md — note + cross-reference at the top of the existing offline-first architecture doc, since most of it does not apply when the toggle is off. - README.md — one-line mention of the toggle in the Overview feature list and a link in the Further Documentation section. - CHANGELOG.md — 2.1.0 entry covering the new exports (OfflineMode / OfflineTransitionService + states / OfflineTransitionScreen / OfflineTransitionGuard), the schema bump v4 → v5, the additive constructor parameters on OfflineRepository / SyncService / UnifiedResolver / LinkOptionService, the new DoctypeService.list orFilters parameter, and upgrade notes. - pubspec.yaml version 1.1.0 → 2.1.0.
…parens
Move home screen, document list, and doctype list reads off the legacy
`documents` table onto UnifiedResolver, so both online (`offline_enabled
= false`) and offline modes display records. Wire `frappe.client.get_count`
for cheap chip badges, fix the dependency evaluator's handling of
parenthesised expressions, and absorb the user-staged auth + repo
follow-ups.
Resolver-driven reads
- `DoctypeService.count(doctype, {filters})` wraps `frappe.client.get_count`.
- `UnifiedResolver.count(doctype, {dirtyOnly})` branches on offlineMode:
online → server count (returns 0 for `dirtyOnly` since there is no
outbox in online mode); offline → `SELECT COUNT(*)` against
`docs__<doctype>`, dirtyOnly filters to dirty/sync_error/sync_blocked.
- `_onlinePassthrough` now requests `fields: ['*']` so list rows carry
every top-level column (matches the offline `docs__*` shape).
- `Document.fromResolverRow(doctype, row)` adapts both shapes to a
Document, falling back to `name` when `mobile_uuid` / `server_name`
are absent (online rows).
- `LinkOptionService._rowsToEntities` falls back to `row['name']` so
online passthrough rows surface as link options instead of being
silently filtered out.
- `FrappeSDK.resolver` getter exposes the existing instance.
- `MobileHomeScreen`, `DocumentListScreen`, `DoctypeListScreen` consume
the resolver; example app updated for the new constructor params.
depends_on parens fix
- `DependsOnEvaluator._splitOutsideBrackets` now tracks `( )` depth
alongside `[ ]`, and a new `_stripOuterParens` removes balanced
wrapping parens before evaluation. Without this, expressions like
`(A && B) || (C && D)` were split on the inner `&&` first, leaking
unmatched parens into the comparators and short-circuiting to false —
hiding entire sections (the SNF Household Survey "Presence Check"
block was a casualty).
Auth + repository follow-ups
- Drop the dead `AuthService.loginWithCredentials` path and route
`logout` through `mobile_auth.logout`.
- `AuthService.loadConfiguration` unwraps the `/api/v2/method/*`
`{data: ...}` envelope so role/permission hydration works on v2.
- `OfflineRepository._reconcileParentTableSchema` heals already-created
`docs__<doctype>` parent tables whose schema has drifted from the
current meta (e.g. a new `title_field` added server-side after first
pull, leaving the `<field>__norm` column missing).
Tests
- New paren-handling cases in `depends_on_evaluator_test.dart`,
including the verbatim Frappe `section_break_presence` regression.
- New `count()` group in `unified_resolver_test.dart` covering offline
+ dirtyOnly + missing-table + online short-circuit + missing-client
paths.
- New `test/models/document_test.dart` covering both row shapes and
`modified` parsing.
- `mobile_home_screen_test.dart` updated to await `resolver.count`
instead of the removed `repository.getDocumentsByDoctype` path.
…plete\$ stream, system_columns/sqlite_utils DRY helpers - Replace immutable _offlineMode field with OfflineModeNotifier threaded through OfflineRepository, SyncService, and UnifiedResolver so a post-login offline-enabled change is visible to all services without rebuilding them - Add syncComplete\$ broadcast stream on FrappeSDK; home screen subscribes to refresh document counts after each batch pull - Extract system_columns.dart as single source of truth for parent/child system column names (parent_schema, local_writer, pull_apply all import from here to prevent drift) - Extract sqliteTableExists() into sqlite_utils.dart so the sqlite_master query lives in one place - Add flipOfflineModeForTesting() seam on FrappeSDK for unit tests - New tests: OfflineModeNotifier sharing, offline flag propagation, upgrade pull trigger, SyncService/OfflineRepository notifier wiring
Initial sync drained doctypes serially: each pullSync call grabbed a global SyncMutex, ran one doctype to completion, released, then the next iteration started. Picker fetches scheduled while the loop ran silently no-opped because tryProtect saw the mutex held. Result: helper masters (State, District, Block, Village, ...) landed late and Link pickers opened during sync stayed empty forever. Three coordinated changes: 1. _runUpgradeClosurePull and forcePullAll now use SyncService. pullSyncMany(concurrency: 4) — one mutex hold for the whole closure instead of one-per-doctype. forcePullAll keeps a sequential cursor- clear pre-pass before the parallel batch. ~4x faster cold start. 2. LinkOptionService accepts a syncComplete$ stream wired from the SDK. _LinkFieldDropdownState subscribes; when its options are empty and a sync completes, it retries _loadOptions. The coordinator no longer caches empty results, so retries reach the resolver instead of returning a stale []. 3. SyncMutex grows a blocking protect() variant alongside tryProtect. SyncService.pullSyncWaiting uses it; the resolver's backgroundFetch now calls pullSyncWaiting so background-driven Link picker refreshes queue behind the batch instead of dropping silently. User-initiated pullSync callers (manual refresh, navigation, list-screen pull-to-refresh) keep the bail-fast tryProtect path so users still see "Sync already in progress" instead of hanging. Tradeoff: the mutex is held for the entire pullSyncMany batch rather than per-doctype, so user-triggered pushSync / pullSync during cold start now consistently wait the whole window instead of slipping in between doctypes. Push isn't lost — the outbox holds it. Cold start finishing 4x faster more than offsets the longer single hold. Tests: 6 new (4 SyncMutex.protect, 2 pullSyncMany short-circuits). Total suite 564 passing.
…efactor Database migration: - AppDatabase._version 6 -> 3. Single _migrateV2ToV3 step replaces the v3-v4-v5-v6 ladder; no device in the wild was at any of those intermediate versions. - Defensive duplicate-column guards via _safeAddColumn — applied only to ALTER TABLE ADD COLUMN. CREATE TABLE IF NOT EXISTS and DROP IF EXISTS unwrapped (already idempotent). - Singleton sdk_meta upsert (INSERT OR REPLACE) recovers from missing or corrupted rows. - AppDatabaseTestSeam exposes _onUpgrade / _version for migration tests. - Tolerant login-response parsing: missing field, null, string "true", and integer 1 all leave offline_enabled at 0. Sync engine refactor: - SyncEngineBuilder + PayloadSerializer + UuidPattern split out of SyncService for testability. - Outbox slimmed (drops payload / server_name / retry_count); per-doctype docs__* tables retain payload, error, and retry context. - Tombstone PullApply support; delegate-style SyncService surface. UI: - New SyncErrorBanner widget. - OfflineRepository reconcile + save split into focused methods. - LinkField gating uses uuid_pattern for local-vs-server identity check. Retire legacy: - documents table + DocumentDao + Document model removed. - migration_blocked_screen, v1_to_v2 + v3_to_v4 + v5 isolated tests deleted (their schema deltas are now part of _migrateV2ToV3). Tests: - 617 passing. - Adds migration_v2_to_v3, migration_reentrant, fresh-vs-upgraded PRAGMA equivalence, post-migration gate, end-to-end push engine, payload serializer, uuid pattern, sync_error_banner, and 4 login- response tolerance cases. pubspec: - 1.1.0 -> 2.0.0 (semver-major: public API broke). Spec: docs/superpowers/specs/2026-05-09-sdk-v2-v3-migration-design.md Plan: docs/superpowers/plans/2026-05-09-sdk-v2-v3-migration.md
…uctor API cleanup Add doc/release-2.0 suite (README, architecture, breaking-changes, migration, schema-migration, limitations, what's-new). Consolidate three duplicate ChildInfo structs into ChildTableInfo. Tighten constructor signatures for OfflineRepository, UnifiedResolver, and SyncService to named parameters with positional database. Extend test coverage across sync, services, concurrency, and UI layers.
…ety fix Resolved merge conflict in frappe_mobile_sdk.dart: kept link_option_entity.dart export (new in develop); dropped document_entity.dart export (deleted in 2.0 refactor). Updated link_field_test.dart to use LinkOptionService.withoutResolver() instead of the removed FrappeClient constructor.
PullEngine._runDoctype used `modified >= cursor.modified` (PullPageFetcher's only filter) with a while-true loop that breaks only on an empty page. When all rows in a page share the same modified timestamp — common for reference doctypes bulk-imported on the server — the advanced cursor equals the input cursor and the next request returns the same page forever, creating an infinite pull loop observed in device logs (Block, Volunteer Hiring Stage polled with limit_start=0, no delay, same cursor each time). Fix: capture (priorModified, priorName) before advancing the cursor; break when the new cursor matches, preventing the re-fetch loop. PullApply's UPSERT-by-server_name idempotency keeps applied rows correct. Also ships: - Updated PullPageFetcher doc comment: replaces the misleading "bounded re-fetch" claim with an explicit stall-hazard warning and pointer to PullEngine's guard. - Regression test: fetcher that always returns the same non-empty page; asserts exactly 2 fetches, rows applied, cursor persisted with complete:true. - AC4 follow-up: runUpgradeClosurePullForTesting now returns Future<Set<String>>; two new wiring tests verify the no-op path and that an active outbox push propagates the deferred set through _runUpgradeClosurePull → PullEngine.run(). Added injectPullEngineForTesting and overrideIsOnlineForTesting seams.
DocumentListScreen loads via _pullDocuments() in initState; the initialDocuments pre-population was the only caller of the now-removed method, so drop both the fetch and the parameter.
…ing throw - ResponseWriteback: early-exit on OutboxOperation.delete — hard-deletes parent row, child rows, and outbox entry so confirmed deletes don't linger as ghost records - FrappeSDK.logout: wrap clearAllData inside SyncMutex.protect() so a concurrent push cannot write to a table being dropped - FrappeSDK.dispose: drain _pendingDrain before closing syncStateNotifier / syncCompleteController to prevent StateError on late notifier writes - OfflineRepository.applyServerDocument: throw StateError on missing meta instead of silently returning; callers now correctly surface the failure - OfflineTransitionService.runDrainAndWipe: add drainTimeout param (default 5 min) and apply it to pushSync() so a hung drain cannot block forever - MetaService: forward checkAndSyncDoctypes exceptions to onMetaSyncFailure callback instead of only printing - QueryResult: add empty const for convenience - Tests updated for StateError and DELETE writeback behaviour
ankitDhwani
left a comment
There was a problem hiding this comment.
PR Review — Final Closure Report
om/offline_improvement → develop
Reviewer: Ankit + Claude Code (claude-sonnet-4-6)
Branch: om/offline_improvement
Base: develop
Final HEAD: c62a865
Review rounds: 3 (original · R2 after f029b7c · R3 after 33803c0 · patch c62a865)
Date closed: 2026-05-12
Test suite: 791 tests · all passing
Merge Verdict
APPROVED — unconditional. All tracked issues resolved, dismissed with evidence, or confirmed N/A.
No open items remain. The three issues contested by the dev team (L4, M8, M10) were independently re-verified against the current source and confirmed correctly dismissed.
Legend
| Symbol | Meaning |
|---|---|
| ✅ FIXED | Verified in code; test coverage confirmed where required |
| ✅ BY DESIGN | Intentional behaviour; documented in limitations.md or inline |
| ⏭️ DISMISSED | Re-verified against source — original concern was either a misread or has no practical impact against a real Frappe server. Evidence cited. |
| N/A | Was never a real issue; confirmed before R1 |
Critical (5 / 5 resolved)
| ID | Status | Verified at |
|---|---|---|
| C1 | ✅ FIXED (R3) | link_option_service.dart:140–170 — _normalizeFiltersForDoctype handles 3-tuples; length == 3 guard prepends doctype. 4 regression tests in test/services/link_option_service_offline_test.dart. |
| C2 | ✅ FIXED (R2) | write_queue.dart — outer-tx failure drains remaining completers with completeError; verified in write_queue_test.dart. |
| C3 | ✅ FIXED (design) | pull_apply.dart — _locallyDirtyStatuses gate: ['dirty','failed','conflict','blocked']. Parent-shields-children invariant. Covered by AC3 test phase 5. |
| C4 | ✅ FIXED (R2) | documents table, DocumentDao, Document model deleted. Single docs__* per-field storage. _pullOneInternal → applyServerDocument. |
| C5 | ✅ FIXED (R2) | frappe_sdk.dart — _initInFlight: Completer<void>? coalesces concurrent initialize() calls. |
High (14 original / 1 post-review — all resolved)
| ID | Status | Verified at |
|---|---|---|
| H1 | ✅ FIXED (R2) | _rewritePayload / _resolveLinkedUuids deleted; UuidRewriter.rewrite() via PayloadAssembler. |
| H2 | ✅ FIXED (R2) | meta_service.dart — fetchAndStoreInDb calls clearDocTypeCache(doctype) after store. |
| H3 | ✅ FIXED (R3) | sdk_meta_dao.dart:30 — writeOfflineMode uses INSERT OR REPLACE INTO sdk_meta (true upsert). |
| H4 | ✅ FIXED (R2) | sync_service.dart — syncNow() try/catch per phase; exceptions logged, execution continues. |
| H5 | ✅ FIXED (R2) | frappe_sdk.dart — dispose() nulls 24 fields. |
| H6 | ✅ FIXED (R3) | sync_service.dart:75 — isOnline() first line: if (!offlineMode.enabled) return false; |
| H7 | ✅ FIXED (R2) | offline_transition_service.dart — Timer.periodic emits live drainedRecords count. |
| H8 | ✅ FIXED (R2) | response_writeback.dart — mobile_uuid-first match, position-index fallback. |
| H9 | ✅ FIXED (R2) | filter_parser.dart — _validateOrderBy splits on , and validates each segment individually. |
| H10 | ✅ BY DESIGN | Children inherit parent status. Documented in limitations.md. |
| H11 | ✅ FIXED (R2) | frappe_sdk.dart — _pendingDrain tracks in-flight upgrade pull; awaited before null-out. |
| H12 | ✅ FIXED (R2) | documents table eliminated; single write path via LocalWriter. |
| H13 | ✅ FIXED (R2) | Both pull paths (PullApply.applyPage + applyServerDocument) write exclusively to docs__*. |
| H14 | ✅ FIXED (R2) | Single storage backend; dual-write path fully eliminated. |
| POST-4 | ✅ FIXED (c62a865) | response_writeback.dart:50–65 — DELETE early-exit hard-deletes parent + cascades children + retires outbox row before the name/docname check. Covered by test/sync/response_writeback_test.dart. |
Medium (22 M-series + 2 N-series + 2 AC-series + 3 POST-REVIEW — all resolved)
| ID | Status | Verified at |
|---|---|---|
| M1 | ✅ FIXED (R2) | ChildTableInfo consolidated; _ChildInfo alias deleted; typedefs in child_table_info.dart. |
| M2 | ✅ FIXED (R2) | closure_builder.dart — ConcurrencyPool used in BFS level; pullSyncMany uses pool. |
| M3 | ✅ BY DESIGN | Parent-shields-children on conflict — same as C3; documented. |
| M4 | ✅ FIXED (R3) | link_field_coordinator.dart:220 — unawaited(_processQueue()); recursive finally call removed; while loop drains. |
| M5 | ✅ FIXED (c62a865) | meta_service.dart:260 — outer catch now calls onMetaSyncFailure?.call('checkAndSyncDoctypes', e). Silent outer failure path closed. |
| M6 | ✅ FIXED (R2) | _uuidToServerName and duplicate UUID helpers deleted. |
| M7 | ✅ FIXED (R2) | pullSyncMany uses await DeviceTier.detect() for concurrency sizing. |
| M8 | ⏭️ DISMISSED | client.doctype.list() (doctype_service.dart) returns [] for any response missing a message key — Frappe HTTP error bodies fall into this path. HTTP-level errors throw at the client layer before reaching the parser. The one remaining edge case (message: null cast error) is caught by PullEngine._runDoctype's generic catch, cursor is not advanced, and the doctype retries next cycle. No data loss, no stall. Against a real Frappe server this never fires in production. |
| M9 | ✅ FIXED (R2) | meta_service.dart — LRU cleared in fetchAndStoreInDb. |
| M10 | ⏭️ DISMISSED | DependencyGraphBuilder is a 49-line pure static helper: buildOutgoing(DocTypeMeta) → DepGraph. No async, no BFS, no metaFetcher. It is called by ClosureBuilder inside the BFS loop as an edge extractor for one node. ClosureBuilder owns the traversal. These are complementary classes with distinct responsibilities, not duplicated BFS implementations. Original review was a misread. |
| M11 | ✅ FIXED (R2) | sync_state.dart — SyncState implements operator ==. |
| M12 | ✅ FIXED (R2) | outbox_dao.dart — markFailed + BlockedByUpstream on retry exhaustion. |
| M13 | N/A | LinkDecorator uses server_name/mobile_uuid; no dynamic SQL interpolation risk. |
| M14 | ✅ FIXED (c62a865) | offline_repository.dart:607 — missing meta now throws StateError instead of silent print + return. Callers see a proper failure; outbox row is marked failed rather than incorrectly marked done. |
| M15 | ✅ FIXED (c62a865) | bulk_watermark_probe.dart — class-level docstring added: "Best-effort contract: endpoint absence is not an error. detect catches all exceptions and marks the probe unavailable — callers must handle available: false." |
| M16 | ✅ FIXED (R2) | meta_migration.dart — MetaMigration.apply wrapped in db.transaction(). |
| M17 | ✅ FIXED (R2) | sdk_meta_dao.dart — readOfflineMode returns OfflineMode.fallback when set_at IS NULL. |
| M18 | ✅ FIXED (R2) | connectivity_watcher.dart — subscription keeps _cachedOnline live across reconnects. |
| M19 | ✅ FIXED (R2) | Both pull paths write the same cursor format; SIG-9 drift eliminated. |
| M20 | ✅ FIXED (R2) | tier_computer.dart — TierComputer.compute + dependency scan enforces parent-before-child ordering. |
| M21 | ✅ FIXED (R2) | idempotency_strategy.dart — IdempotencyStrategy L1/L2/L3 covers lost-ACK on INSERT. |
| M22 | ✅ FIXED (R2) | app_database.dart — wipeOfflineDocumentTables deletes pending_attachments. |
| N1 | ✅ FIXED (R3) | frappe_sdk.dart:1081 — _runUpgradeClosurePull calls _pullEngine!.run(closure, allowedDoctypes: pullable) inside _syncService!.protect(). |
| N2 | ✅ FIXED (R3) | frappe_sdk.dart:1094 — _runPullForController delegates to _runUpgradeClosurePull; real Set<String> of deferred doctypes returned for SIG-2 re-pull. |
| AC3 | ✅ FIXED (R3) | test/offline_cycle_integration_test.dart (425 lines) — 5-phase end-to-end: write → push → pull → local edit → conflict. Real SQLite + typed fake HTTP. |
| AC4 | ✅ FIXED (R3) | Same as N1 — PullEngine wired into production path; dead-code label removed. |
| POST-1 | ✅ FIXED (c62a865) | frappe_sdk.dart:1205–1223 — _pendingDrain now awaited before _syncStateNotifier?.close(). Comment added explaining producer/consumer ordering. SyncStateNotifier.value= has no isClosed guard (verified), so ordering matters. |
| POST-2 | ✅ FIXED (c62a865) | offline_transition_service.dart:88,133 — drainTimeout = const Duration(minutes: 5) parameter added; pushSync().timeout(drainTimeout, onTimeout: ...) applied. |
| POST-3 | ✅ FIXED (c62a865) | frappe_sdk.dart:623 — logout() now wraps _authService!.logout() in _syncService!.protect(...). SyncMutex acquired before clearAllData() runs. |
Low (17 L-series — all resolved or dismissed)
| ID | Status | Verified at |
|---|---|---|
| L1 | ✅ FIXED (R2) | frappe_sdk.dart — _mobileUuid cached; _resolveMobileUuid() avoids repeat platform calls. |
| L2 | ✅ FIXED (R2) | sync_result.dart — SyncResult.empty(status: SyncStatus.offlineModeDisabled) is distinguishable from other empty results. |
| L3 | ✅ FIXED (R2) | connectivity_watcher.dart — _onEvent deduplicates same-state events. |
| L4 | ⏭️ DISMISSED | write_queue_test.dart verified: zero Timer or Future.delayed calls. Tests await sequential queue submissions only. fakeAsync controls fake clocks and zones — it cannot control real SQLite futures from sqflite_ffi's in-process execution. Real async is the correct pattern here and results are deterministic regardless of CI runner speed. |
| L5 | N/A | No hard-coded route literals found in codebase; already refactored before review. |
| L6 | ✅ FIXED (R3) | form_screen.dart:864,887 — ValueListenableBuilder<bool> scopes rebuilds to AppBar save button and spinner only; FrappeFormBuilder tree is not in the rebuild scope. |
| L7 | ✅ FIXED (R3) | doc_type_meta.dart:52 — per-field try/catch; failures log fieldname and fall back to DocField(fieldname:..., fieldtype:'Data'). |
| L8 | ✅ FIXED (R2) | field_type_mapping.dart — 35+ field types mapped; Duration, Rating, Geolocation covered. |
| L9 | ✅ FIXED (R2) | normalise_for_search.dart — normalizeForSearch strips U+200D Zero Width Joiner. |
| L10 | ✅ FIXED (R2) | doc/release-2.0/ replaces prior misleading docs. |
| L11 | ✅ FIXED (R2) | CHANGELOG.md updated with 2.0.0 storage migration entry. |
| L12 | ✅ FIXED (R2) | SyncErrorsScreen has empty-state UI. |
| L13 | ✅ FIXED (R2) | Drain screen renders '${drainedRecords} of ${totalRecords}'. |
| L14 | ✅ FIXED (R2) | TODOs reduced from 14 to 1 (dep_graph.dart:61, acceptable — explains deferred incoming-edge wiring). |
| L15 | ✅ FIXED (c62a865) | query_result.dart:26 — static const empty = QueryResult<Map<String, dynamic>>(...) added. |
| L16 | N/A | LinkDecorator checks name == null || name.isEmpty — no crash path. |
| L17 | ✅ FIXED (c62a865) | depends_on_evaluator.dart:7–11 — trust boundary docstring added: "expressions come from server-side DocType meta configured by Frappe admins… NOT sandboxed — do not pass user-supplied strings to evaluate." |
Summary
| Category | Total tracked | Fixed / BY DESIGN | Dismissed (evidence) | N/A |
|---|---|---|---|---|
| Critical | 5 | 5 | 0 | 0 |
| High (incl. POST-4) | 15 | 15 | 0 | 0 |
| Medium (incl. N, AC, POST) | 30 | 27 | 2 (M8, M10) | 1 (M13) |
| Low | 17 | 13 | 1 (L4) | 2 (L5, L16) |
| Total | 67 | 60 | 3 | 3 |
All 67 tracked IDs are resolved. Zero items remain open.
Dismissed Items — Evidence Summary
These were not fixed by the dev because they do not require fixing.
M8 — PullPageFetcher Non-List Response
doctype_service.dart:list() returns [] when the Frappe response has no message key (covers most error body shapes). HTTP-level errors (4xx/5xx) throw at the client before the parser runs — PullEngine._runDoctype catches, logs via debugPrint, skips cursor advance, retries next cycle. No data loss; no stall. Practically unreachable against a Frappe server that returns RFC-compliant HTTP error responses.
M10 — ClosureBuilder / DependencyGraphBuilder "Duplication"
DependencyGraphBuilder.buildOutgoing(DocTypeMeta) is a 49-line pure synchronous edge extractor called inside ClosureBuilder's BFS loop. It has no metaFetcher, no visited set, no traversal. The original review incorrectly described it as a parallel BFS implementation. The two classes are correctly separated: one orchestrates traversal, the other computes the edges for one node.
L4 — fakeAsync Inconsistency
write_queue_test.dart contains zero Timer or Future.delayed calls. Tests use await on queue submissions backed by sqflite_ffi (in-process, no platform channel). fakeAsync manipulates fake clocks and zones — it has no effect on real SQLite futures. Real async is appropriate and deterministic here.
New Issues Found This Review Cycle (None)
No new defects were found during the c62a865 verification pass. All changed files (10) addressed exactly the tracked items without introducing regressions.
Final Test Suite
| Round | Tests passing |
|---|---|
| R2 baseline | ~750 |
| R3 (HEAD 33803c0) | 790 |
| Patch (HEAD c62a865) | 791 |
791 tests, 0 failures. The 1 net new test covers the ResponseWriteback DELETE path (POST-4).
Verified against git diff develop...om/offline_improvement at HEAD c62a865. All code citations checked against current branch. No source files were modified during this review — findings only.
…ovement # Conflicts: # lib/src/services/link_option_service.dart # lib/src/ui/widgets/fields/link_field.dart # lib/src/ui/widgets/fields/table_multi_select_field.dart # test/services/link_option_service_resolve_filters_test.dart
deepak-dhwani
left a comment
There was a problem hiding this comment.
Great effort on the scope of this PR — the architecture, test coverage, and documentation are thorough. A few issues need addressing before this merges, ranging
from correctness bugs to doc gaps.
Critical
SyncMutex.protect() — TOCTOU race (lib/src/concurrency/sync_mutex.dart)
When two concurrent callers are both awaiting _busy!.future, they both resume in the same microtask turn. The while re-check runs but neither has yet set _busy,
so both see null and both proceed — the mutex provides no protection. Additionally, the second caller overwrites the first's Completer, so any third waiter on
the abandoned Completer will hang forever.
tryProtect() is safe. Only protect() is affected.
Suggested fix — promise-chaining pattern:
Future _chain = Future.value();
Future protect(Future Function() body) {
final result = chain.then(() => body());
chain = result.then(() {}, onError: (_) {});
return result;
}
PushEngine — unbounded recursive merge loop (lib/src/sync/push_engine.dart)
The guard if (row.retryCount < 1) is meant to allow three-way merge exactly once then fall through to markConflict. But row.retryCount (the outbox field) is
never incremented on the merge path — only docs__.sync_attempts is. Since row.retryCount is always 0 on a freshly-dequeued row, the merge fires unconditionally
on every TimestampMismatchError. The recursive call to _process(updated) at line 521 also carries retryCount == 0, so this recurses without bound for any
persistently conflicting row.
Fix: gate on sync_attempts from docs__ (consistent with what _dispatchOnce increments), or increment outbox.retry_count inside _autoMergeAndRetry
before re-queuing.
SdkMetaDao.writeOfflineMode — INSERT OR REPLACE zeroes schema_version and bootstrap_done (lib/src/database/daos/sdk_meta_dao.dart)
INSERT OR REPLACE in SQLite is DELETE + INSERT. Only the three named columns survive — schema_version, bootstrap_done, and session_user_json revert to their
column defaults (0, 0, NULL). The first time a user toggles offline mode, bootstrap_done becomes 0 and schema_version becomes 0, causing any code gating on
these fields to behave incorrectly.
Fix:
await _db.rawUpdate(
'UPDATE sdk_meta SET offline_enabled = ?, offline_enabled_set_at = ? WHERE id = 1',
[enabled ? 1 : 0, setAtMs],
);
Worth adding a test that calls writeOfflineMode then reads back schema_version and bootstrap_done to prevent regression.
High
Non-WriteQueue merge path — torn state on crash (push_engine.dart)
When writeQueueResolver is null, the merged doc write and the outbox state update happen in two separate db.update calls. A crash between them leaves
docs__ with merged values but the outbox row still in_flight. On restart, resetInFlightToPending re-queues it; _dispatchOnce then builds the payload
from the already-merged row, so the next ThreeWayMerge treats the post-merge state as the local delta — silently discarding the user's original edit. The
WriteQueue path wraps both operations in a single transaction correctly; the non-WriteQueue path should do the same.
WriteQueue — SAVEPOINT failure leaves caller permanently suspended (lib/src/concurrency/write_queue.dart)
If txn.execute('SAVEPOINT $sp') throws, the inner catch attempts ROLLBACK TO $sp (which will also fail since the savepoint was never created), logs the rollback
error, and returns — but p.completer is never settled. The caller's await wq.submit(...) hangs forever. The outer catch only drains items still in _queue;
items already dequeued but not yet completed are silently abandoned.
Fix: unconditionally call p.completer.completeError(e, st) in the SAVEPOINT catch block before attempting the rollback.
ThreeWayMerge — null clear treated as "unchanged"; empty base discards all local edits (lib/src/sync/three_way_merge.dart)
Two related issues:
- _eq(null, null) returns true. If a user explicitly clears a field that was already null in base, localChanged evaluates to false and the server value wins —
the intentional clear is silently lost. - When push_base_payload is null (any row written before this feature was introduced), base is {}. base[k] returns null for every key, so on a document where
many local fields are also null, localChanged = false across the board and all local edits are discarded in favour of server values.
Medium
OutboxDao.recordSave() — no transaction around multi-step query+mutate (lib/src/database/daos/outbox_dao.dart)
The SELECT (find collapsable rows) and the subsequent DELETE/INSERT/UPDATE are separate statements with no wrapping transaction. A concurrent recordSave for the
same (doctype, mobileUuid) can observe the same collapsable row between the first caller's read and delete, resulting in two live outbox rows for what should
be one collapsed update.
Fix: wrap the body of recordSave() in await (db as Database).transaction((txn) async { ... }).
resolveConflict(pullAndOverwriteLocal) — local mirror stuck in conflict permanently (lib/src/services/sync_controller.dart)
When serverName is empty (an INSERT that never reached the server), markDone clears the outbox row but the docs__ row remains with sync_status =
'conflict'. Since the outbox row is gone, no retry path will touch it — the document appears permanently broken with no recovery path.
Documentation gaps (will cause compile errors for consumers following the migration guide)
Wrong method names in breaking-changes.md and migrating-from-1.x.md:
- Both documents reference offlineRepository.createDocument(doctype:, data:) — the actual method is saveDocument(doctype:, data:).
- The migration table shows deleteDocument(localId: ...) — the actual signature is deleteDocument(doctype:, mobileUuid:).
- updateDocumentData(localId: ...) also does not match the actual API.
OfflineModeNotifier is not exported from lib/frappe_mobile_sdk.dart despite being referenced in four constructor signatures in breaking-changes.md §2. Consumers
who wire services manually cannot type this parameter.
DocumentEntity removal is undocumented. It was a public type in 1.x (return type of all DocumentDao methods). Any consumer who typed List will
get a compile error with no hint in the migration guide.
SyncController constructor uses internal types (OutboxDao, RunPullFn, etc.) that are not exported. The docs should explicitly state that SyncController must be
obtained via sdk.syncController and not constructed directly.
Minor
- pending_attachments.top_parent_uuid has no FOREIGN KEY ... ON DELETE CASCADE to outbox.mobile_uuid. Orphaned attachment rows accumulate when outbox entries
are hard-deleted. - SchemaApplier.apply uses CREATE TABLE without IF NOT EXISTS. A concurrent isolate calling apply for the same doctype will fail with "table already exists".
Adding IF NOT EXISTS costs nothing. - wipeOfflineDocumentTables drops all docs__* tables but does not reset bootstrap_done or offline_enabled in sdk_meta. After a wipe, bootstrap_done = 1 remains,
and the next sync startup may skip re-bootstrap against empty tables. - No onDowngrade handler — if a user rolls back to a 1.x build, the app will hard-crash rather than show a user-visible error. A graceful handler with a clear
message is safer for a publicly distributed SDK. - Both CI checks are green, but actions/checkout@v4 (Node.js 20) needs bumping before September 16 — same as PR #35.
The core feature design is solid and the test suite is extensive. Addressing the critical and high items (particularly the mutex race, the merge loop, and the
INSERT OR REPLACE bug) with targeted regression tests covering those specific paths will get this to merge-ready.
… UX polish Port of om/logic_correction fixes to om/offline_improvement. - DoctypeService.list: guard non-List Frappe message shape → treat as empty page - RestHelper.get/getPublic: expose per-call timeout and maxRetries overrides so boot/splash callers can fast-fail instead of burning the default 30s × 3-retry budget - PermissionService.syncFromApi: accept timeout param; remove silent catch so callers own error handling - DoctypeMetaDao: add tableNameFor() helper — getTableName() ?? normalize fallback - child_schema / parent_schema: IF NOT EXISTS on all CREATE TABLE/INDEX for idempotency - MobileHomeScreen: suppress full-screen spinner on background syncComplete$ reloads — only show it on first paint when the list is empty - FieldStyle: default showLabel/showDescription to false - FrappeSDK: isInitialSync flag brackets boot pull; fast-fail permissions probe (10s); schema pre-create before closure pull; retryInitialMetaAndDataSync() public API; clearLastError() on logout; schemaReconciler wired into SyncEngineBuilder - OfflineRepository: reconcileParentTableForMeta() public entrypoint - SyncEngineBuilder: wire schemaReconciler; bump default pageSize 100→500; add limitStart - Cursor: add start field for offset pagination; markComplete resets start=0 - PullEngine: SchemaReconcilerFn typedef; invoke reconciler per doctype; stall guard restricted to complete=true cursors only - PullPageFetcher: dual-mode pagination — initial sync uses limit_start offset, incremental uses modified >= filter with limit_start=0 - SyncStateNotifier: recordLastError() / clearLastError() for boot error surfacing - Tests: Cursor.start round-trip, initial-sync limit_start advance, incremental cursor assertions, stall guard pre-seeding
Push engine - Break the merge-retry recursion loop: the slim outbox no longer carries `retry_count`, so the `row.retryCount < 1` guard always saw 0 and `_autoMergeAndRetry` recursed unbounded on any persistently conflicting row. Replace with an explicit `mergeAttempted` flag passed through `_process`. - Wrap the non-WriteQueue merge writeback in a single `db.transaction` so a crash between the docs__ update and the outbox state flip can't tear state. The WriteQueue path was already transactional. - Guard `_autoMergeAndRetry` against an empty `push_base_payload`. Without a baseline, ThreeWayMerge can't distinguish "user left null" from "user cleared", and a row with many null fields would silently lose local edits. Bail out to `markConflict` instead. sdk_meta - `writeOfflineMode` was `INSERT OR REPLACE`, which is DELETE + INSERT in SQLite — wiping `schema_version`, `bootstrap_done`, and `session_user_json` to defaults. Switch to a single-statement `INSERT ... ON CONFLICT(id) DO UPDATE SET` so the offline-mode columns are surgically upserted and the rest is preserved. - `wipeOfflineDocumentTables` now resets `bootstrap_done = 0` in the same transaction (the per-doctype mirrors that bootstrap built are dropped by wipe). Other sdk_meta fields untouched. Conflict resolution - `resolveConflict(pullAndOverwriteLocal)` deleted the outbox row when there was no server snapshot to fetch but left `docs__.sync_status = 'conflict'` — the row was stuck in the errors UI with no recovery path. Add an optional `clearLocalConflict` callback on SyncController; wire it from SyncEngineBuilder to flip the docs__ row back to `dirty` (best- effort, logged on failure). Attachments - `deleteDocument` on a never-pushed doc cancels the INSERT and hard-deletes the docs__ row; pending_attachments with that top_parent_uuid would orphan and retry forever. Add a `deleteForTopParent` helper on PendingAttachmentDao and clear attachments inside the same delete transaction. Public surface - Export `OfflineModeNotifier` from the SDK barrel — referenced by four constructor signatures in the migration docs but previously not in `lib/frappe_mobile_sdk.dart`. Migration docs (doc/release-2.0) - breaking-changes.md / migrating-from-1.x.md: fix method-name references (`createDocument` / `updateDocumentData` / `deleteDocument(localId:)` -> `saveDocument` / `deleteDocument(doctype:, mobileUuid:)`). - breaking-changes.md §1.3: document `DocumentEntity` removal with a 1.x -> 2.0 before/after. - breaking-changes.md §2 and whats-new.md §6: note that SyncController is not user-constructable; obtain via `sdk.syncController` (constructor takes unexported internal types). Tests (+6 regression cases, 814 passing) - push_engine: persistent TimestampMismatch markConflict + empty base markConflict - sdk_meta_dao: writeOfflineMode preserves sibling columns - sync_controller: serverName=null branch calls clearLocalConflict - offline_repository: deleteDocument clears pending_attachments - wipe_offline_document_tables: resets bootstrap_done but keeps schema_version + session_user_json Push-backs (no change): - SyncMutex.protect() TOCTOU — Dart microtasks run serially; existing 3-waiter FIFO test passes. - WriteQueue savepoint completer leak — completer IS settled at line 81 in the catch block. - OutboxDao.recordSave no txn — all 4 call sites wrap with txn. - SchemaApplier IF NOT EXISTS — already present. - onDowngrade handler — downgrade not supported; hard-crash is intentional, not silent data loss. - actions/checkout@v4 bump — CI scope, separate PR. See docs/pr36_sr_review_1_summary.md for the full breakdown.
…o om/offline_improvement
At a glance
✅ Fixed (9 items + 2 partial)Critical / High / Medium#2 — PushEngine recursive merge loopFile: Bug: Fix: Replaced the broken counter guard with an explicit Test: #3 —
|
P0–P5 coverage uplift driven by docs/test_coverage_checklist.md. Baseline 814 → 1,274 tests. flutter analyze clean. Coverage highlights (measured via `flutter test --coverage`): - 50+ files at 100% (api/auth, api/client, api/document_service, api/attachment_service, api/exceptions, api/frappe_document, 3 new DAOs, permission_service, app_status_service, sync_state_notifier, plus text/password/read_only/rating/button widgets, etc.) - 30+ files at ≥80% (push_engine 88.79%, rest_helper 88.74%, link_option_service 88.79%, unified_resolver 80.77%, doctype_service 82.61%, workflow_service 83.33%, translation_service 96.77%, link_field_coordinator 99.43%) Layers covered: - api/*: rest_helper, doctype_service, document_service, auth, attachment_service, client, exceptions, frappe_document - services/*: link_field_coordinator, workflow, translation, permission, app_status, sync_controller (cascade), sync_engine_builder (clearLocalConflict), meta_service (cache), offline_repository (lookups), sync_service (pull phases) - database/daos/*: link_option, auth_token, doctype_permission - sync/*: push_engine (SUBMIT/CANCEL/DELETE/WriteQueue), sync_state_notifier - query/*: unified_resolver parent-filter translation - ui/widgets/fields/*: data, check, date, datetime, time, select, numeric, text, phone, password, read_only, rating, duration, html, button, attach, geolocation, image, link (static options), searchable_select, field_factory, child_table - ui/widgets/*: sync_error_banner (message parsing), form_builder (lifecycle) - ui/*: login_screen_style Remaining 0% (deferred to integration): full-screen UI widgets (login_screen, form_screen, sync_status_screen, etc.) and OAuth/secure-storage code that requires platform-channel mocks. Tracked in docs/test_coverage_checklist.md and covered by Appium/Playwright suites.
deepak-dhwani
left a comment
There was a problem hiding this comment.
PR #36 — SDK 2.0 offline-first — Round 2 Review
Round-1 fixes verified
All 10 round-1 items are correctly fixed. The test suite additions (+460 tests) are solid. Completer, WriteQueue savepoint, and OutboxDao call-site push-backs
are accepted — the code analysis confirms the dev's reasoning was correct there.
BLOCKER B1 — SyncMutex: concurrent waiters both acquire the lock
lib/src/concurrency/sync_mutex.dart:~237
When N callers are blocked on await _busy!.future, the holder sets _busy = null before calling c.complete(). All N continuations queue as microtasks, all see
_busy == null, and all enter the critical section simultaneously. Two concurrent pullSyncMany calls can run against the same SQLite table in parallel.
Fix: set _busy = newCompleter before completing the old one, so late-waking microtasks loop again.
BLOCKER B2 — _migrateV2ToV3 wipes sdk_meta sibling columns
lib/src/database/app_database.dart:~165
ConflictAlgorithm.replace = DELETE + INSERT. An upgrading device with offline_enabled=1 or session_user_json loses those values silently.
Fix: ON CONFLICT(id) DO UPDATE SET schema_version = excluded.schema_version (same pattern as writeOfflineMode).
BLOCKER B3 — AtomicWipe silently no-ops if primary .db deletion fails
lib/src/services/atomic_wipe.dart:~1473
The same try/catch covers all three file deletions. A deletion failure on the main file reopens the old database — no actual wipe occurs, no error returned.
Fix: rethrow for suffix == ''; swallow only for -wal/-shm.
BLOCKER B4 — deleteDocument attachment cleanup failure silently swallowed
lib/src/services/offline_repository.dart:~3004
Inner on DatabaseException eats the error and continues. Transaction commits with docs__ row deleted but pending_attachments rows intact — uploader retries
forever.
Fix: remove the inner catch or rethrow; let the outer transaction roll back.
BLOCKER B5 — form_screen.dart TOCTOU: network drop between isOnline check and HTTP call loses user edit
lib/src/ui/form_screen.dart:~568
The offline save path is only taken when isOnline() returns false at check-time. A drop during the await gap (payload assembly, UUID fetch) hits the server
branch, fails, and sets _errorMessage — edit is never persisted offline.
Fix: in the catch block of the server path, fall through to the saveDocument offline path.
BLOCKER B6 — form_screen.dart setState after dispose in _save() finally
lib/src/ui/form_screen.dart:~487, ~760
setState(() => _isSaving = false) in the finally block has no mounted guard — widget may be disposed before it runs.
Fix: add if (mounted) to both setState(() => _isSaving = ...) sites.
HIGH H1 — Cursor infinite loop when all rows on a non-final page are tie-skipped
lib/src/services/sync_service.dart:~4302–4335
If all rows on a non-final page share the same modified as the cursor and are tie-skipped, pageLastModified stays null and neither advance branch fires. The
cursor doesn't move; the next call re-fetches the same page. Triggered by any bulk import where many records get the same modified timestamp — symptom is a
doctype that never finishes pulling.
Fix: when all rows on a non-final page are skipped, advance cursor to (cursorModified, last_name_on_page).
HIGH H2 — clearLocalConflict sets sync_status='dirty' with no outbox row → stuck unsynced forever
lib/src/services/sync_controller.dart:~260
For pullAndOverwriteLocal + serverName == null branch: resolveConflict deletes the outbox row AND sets sync_status='dirty'. No outbox entry will ever drive a
push. Document appears unsynced forever.
Fix: set sync_status='synced' for this branch — the INSERT never reached the server, the local copy is ground truth with no pending work.
HIGH H3 — Pull engine stall guard is dead code from page 2 onwards
lib/src/sync/pull_engine.dart:~1164
scratch.complete is always false after page 1 (Cursor.advance() defaults complete=false). Any bulk-edit scenario where multiple rows share the same modified
will loop forever from page 2.
Fix: remove the scratch.complete condition from the stall guard.
HIGH H4 — clearAllData crash window: drops tables then rebuilds schema in two separate operations
lib/src/database/app_database.dart:~362
A process kill between the DROP transaction commit and _onCreate completing leaves an empty, unversioned database. The SDK hard-bricks on next open.
Fix: fold _onCreate into the same transaction as the drops, or add a schema_rebuilt sentinel checked in onOpen.
HIGH H5 — Table names interpolated into raw SQL without quoting
lib/src/query/filter_parser.dart:~790, lib/src/query/unified_resolver.dart:~1193, 1331, 1344
$tableName, $childTable, $parentTable are interpolated without double-quote wrapping. Any Frappe doctype whose normalized name is a SQLite reserved word
(values, table, group) produces a parse error. (wipeOfflineDocumentTables already quotes correctly — apply the same pattern here.)
Fix: "$tableName" everywhere; escape embedded " as "".
HIGH H6 — _retrySyncRow and pullSync errors silently swallowed, no user feedback
lib/src/ui/form_screen.dart:~1102, lib/src/ui/document_list_screen.dart:~719
Retry failures print to console only; pullSync failure leaves list empty/stale with no message.
Fix: SnackBar in both catch blocks.
HIGH H7 — ThreeWayMerge._eq uses identity equality for List values — silent field loss
lib/src/sync/three_way_merge.dart:~2152
List == List is identity, not deep equality. Any JSON-array or multi-select field always appears "locally changed" even when unchanged, so the server's version
is silently discarded on every auto-merge.
Fix: add a List branch in _eq using const DeepCollectionEquality().equals(a, b).
HIGH H8 — UuidRewriter: Dynamic Link reads sibling value before sibling is rewritten
lib/src/sync/uuid_rewriter.dart:~2315
If a sibling Link field appears after its Dynamic Link in payload.entries order, the sibling hasn't been rewritten yet when the Dynamic Link reads it.
resolveServerName(mobileUuid_as_doctype, value) finds no match and throws BlockedByUpstream.
Fix: two-pass rewrite — resolve all non-Dynamic-Link fields first, then Dynamic Links.
MEDIUM (grouped)
- M1 attachment_pipeline.dart:115 — backoff uses attempt+1; the 2s first-retry slot is never used (harmless, misleading)
- M2 push_engine.dart:~1719 — merged row set to pending before markInFlight; concurrent runOnce could double-dispatch
- M3 meta_migration.dart:~1942 — addedFields loop missing _columnExists guard (unlike addedIsLocalFor/addedNormFor)
- M4 meta_migration.dart:~1930 — DROP INDEX swallows all DatabaseException, not just "no such index"
- M5 meta_differ.dart + meta_migration.dart — typeChanged list computed but never acted upon; type mismatches silently ignored
- M6 outbox_dao.dart:~773 — recordSave lacks Transaction enforcement; safe today, race if called with bare Database
- M7 connectivity_watcher.dart:~112 — PlatformException from checkConnectivity() hard-fails initialize() instead of degrading to offline
- M8 sync_engine_builder.dart:~3510 — 4 unconditional [DIAG resolveServerName] print(...) calls will flood production logs
- M9 frappe_sdk.dart — _syncCompleteController leaks if _doInitialize throws after its creation
- M10 form_screen.dart — offline-save SnackBar text identical to server-save; user can't distinguish queued vs confirmed
- M11 logout_guard_dialog.dart — outbox count query failure silently bypasses the logout guard
- M12 sync_progress_screen.dart / sync_errors_screen.dart — onPause/onCancel/onRetry futures discarded by button onPressed; errors swallowed
- M13 link_field.dart — no debounce on syncComplete$ listener; N concurrent re-fetches on multi-doctype sync
- M14 form_screen.dart — didChangeAppLifecycleState fires for off-screen tabs in IndexedStack/PageView
6 blockers + 8 highs need to be fixed before merge. The most critical path is: B1 (SyncMutex concurrent access) → H1 (cursor stuck on bulk-import
timestamps) → B2 (v2→v3 migration wipes user prefs) → B5/B6 (form save crash + edit loss) → H2 (document stuck unsynced forever after conflict resolution).
The test suite (+460 tests, all layers covered) is excellent. Once the above are fixed this is close to ready.
…< 3.24 compat
Round-2 review items spanning correctness, durability and UX:
H4 app_database: clearAllData DROP loop + base rebuild now share one
transaction so a process kill cannot leave the DB with dropped
application tables but an unchanged user_version (next open would
skip both onCreate and onUpgrade and hard-brick the install).
H2 sync_engine_builder: empty-server-name conflict resolution flips
docs__ back to `synced` (not `dirty`) — the outbox row is about to
be deleted, so there is no pending push to drive.
H7 three_way_merge: deep equality for List/Map; default identity-`==`
was treating every decoded-JSON container as locally-changed and
discarding server updates.
M2 push_engine: scheduled retry writes `in_flight` directly to close
the gap where a concurrent drain could re-pick a `pending` row.
M4/M5 meta_migration: rethrow non-"no such index" DROP failures; log a
typeChanged warning so column-type drift is visible during audit.
M6 outbox_dao.recordSave: debug-assert it runs inside a Transaction so
the collapse rules can rely on read+write atomicity.
M7 connectivity_watcher: catch PlatformException from the initial
probe (older Android OEMs without network-info permission) and
default to offline instead of aborting SDK.initialize.
M8 sync_engine_builder: gate the resolveServerName DIAG dumps to
kDebugMode — release builds were paying for an extra `SELECT *`
per resolve and flooding production logs.
M9 frappe_sdk: close the orphan _syncCompleteController if
_doInitialize throws after creating it.
M12 sync_errors / sync_progress / document_list / form_screen: button
actions await the future and surface failures via SnackBar instead
of letting them become uncaught zone errors.
M14 form_screen: didChangeAppLifecycleState gates on ModalRoute.isCurrent
so a multi-tab IndexedStack host doesn't refetch outbox N× per
foreground.
Other:
- sdk_meta_dao.writeOfflineMode: replace `INSERT … ON CONFLICT … DO
UPDATE` (SQLite ≥ 3.24, fails to compile on Android < 9) with
UPDATE-then-INSERT-OR-IGNORE in a txn. Preserves schema_version /
bootstrap_done / session_user_json the same way the UPSERT did.
- offline_repository.deleteDocument: stop swallowing pending_attachments
cleanup failures during hard-delete; sweep queued attachments before
the soft-delete tombstone too, so an already-pushed DELETE doesn't
leave orphan uploads racing the server-side cascade.
- atomic_wipe: rethrow on primary `.db` delete failure so the next
open doesn't mask the real error as "table already exists"; -wal /
-shm deletes remain best-effort.
- attachment_pipeline: fix off-by-one in backoff index (`backoff[i+1]`
was skipping the first delay and risking a range overshoot at the
last attempt).
- form_screen: guard setState in catch/finally with `mounted` to avoid
the post-dispose assert if the user navigates away mid-save.
Tests updated for every behavior change above.
PR #36 round-2 — Final Summary✅ Fixed (19 items applied across 17 numbered fixes)
Tests added: B3, B4.1, B4.2, H2 (updated existing), H7 (×4), M1, M3, M4. Mechanical / structural fixes verified by static review + the full 1,284-test suite passing. ⏭ Pushed back (10 items — claim refuted by code/docs)
📊 Totals
|
deepak-dhwani
left a comment
There was a problem hiding this comment.
PR #36 — SDK 2.0 offline-first — Round 3 Review
Round-2 fixes verified
All 18 items addressed in this commit are correctly fixed. Highlights:
- _onCreateBody called inside the same transaction as the DROP loop — crash window closed
- clearLocalConflict correctly sets synced (not dirty) with clear reasoning in the comment
- ThreeWayMerge._eq deep equality for List and Map — clean recursive implementation
- _runAndSurface helper pattern for button futures is the right approach; context.mounted checked before SnackBar
- writeOfflineMode UPDATE-then-INSERT-OR-IGNORE is correct and covers SQLite < 3.24 (Android 8)
- AtomicWipe rethrow on primary .db failure, best-effort on wal/shm — correct split
BLOCKER B1 — SyncMutex: concurrent waiters all acquire the lock
lib/src/concurrency/sync_mutex.dart — not touched in this commit.
The holder sets _busy = null before calling c.complete(). All N waiting microtasks resume in the same flush and all see _busy == null, so all enter the critical
section simultaneously. Two concurrent pullSyncMany calls can write to the same SQLite table in parallel.
The fix is to claim the new lock before completing the old completer:
Future protect(Future Function() body) async {
while (_busy != null) await _busy!.future;
final c = Completer();
_busy = c; // claim BEFORE any await
try {
return await body();
} finally {
_busy = null;
c.complete();
}
}
BLOCKER B5 — form_screen.dart: TOCTOU race loses user edit on mid-save network drop
lib/src/ui/form_screen.dart — mounted guards were added (B6, correctly fixed), but the TOCTOU itself is not addressed.
_save() calls isOnline(), stores serverReachable, then assembles the payload across multiple awaits. If the network drops during that gap, the server branch
throws, the catch block sets _errorMessage — and the offline saveDocument path is never reached. The edit is lost.
Fix: in the catch block of the server-first path, fall through to the offline saveDocument call rather than only setting _errorMessage.
HIGH H1 — Cursor infinite loop when a full page is entirely tie-skipped
lib/src/services/sync_service.dart — not touched in this commit.
When all rows on a non-final page share the same modified timestamp as the cursor and are tie-skipped, pageLastModified stays null and neither advance branch
fires. The cursor does not move. The next call re-fetches the same page, skips all rows again, and repeats forever. Triggered by any bulk import where many
records land with the same modified value — symptom is a doctype that never finishes pulling with no error surfaced.
Fix: when all rows on a non-final page are skipped, advance the cursor to (cursorModified, last_name_on_page) to guarantee forward progress through the tie
group.
HIGH H3 — Pull engine stall guard is dead code from page 2 onwards
lib/src/sync/pull_engine.dart — not touched in this commit.
scratch.complete is always false after the first page (Cursor.advance() defaults complete=false). The stall guard only fires on page 1. Any scenario where page
2+ contains rows all sharing the cursor's modified value loops forever.
Fix: remove the scratch.complete condition from the stall guard, or check the stall condition unconditionally for incremental pulls.
HIGH H5 — Table names interpolated into raw SQL without quoting
lib/src/query/filter_parser.dart, lib/src/query/unified_resolver.dart — not touched in this commit.
$tableName, $childTable, and $parentTable are interpolated directly into raw SQL strings without double-quote wrapping. Any Frappe doctype whose normalized name
is a SQLite reserved keyword (table, values, group, order, index) will cause a parse error. wipeOfflineDocumentTables already quotes table names correctly —
apply the same pattern here.
Fix: wrap all interpolated table names as "$tableName"; escape any embedded " as "".
HIGH H8 — UuidRewriter: Dynamic Link reads sibling value before sibling is rewritten
lib/src/sync/uuid_rewriter.dart — not touched in this commit.
For a Dynamic Link field, the rewriter reads payload[siblingFieldname] to get the target doctype. If the sibling Link field appears after the Dynamic Link in
payload.entries iteration order, the sibling has not been rewritten yet. resolveServerName is called with a mobile_uuid string as the doctype, finds no match,
and throws BlockedByUpstream.
Fix: two-pass rewrite — resolve all non-Dynamic-Link fields first, then resolve Dynamic Links using the already-rewritten sibling values.
Verdict
Two blockers (B1, B5) and four highs (H1, H3, H5, H8) remain unaddressed. The fixes in this commit are clean and correct. Once these six items
are handled the PR is ready to merge.
---
**Round 3 Review — Reply**
Thanks for the thorough re-read. After verifying each item against the current code on `om/offline_improvement` (head `cfa82c0`), I'm pushing back on all six. Citations below.
---
### B1 — SyncMutex race ❌ Not a bug
**Claim:** When the holder releases, all waiters resume in the same flush, all see `_busy == null`, all enter the critical section simultaneously.
**Verification:** `lib/src/concurrency/sync_mutex.dart:36-48`
The current `protect()` is:
```dart
Future<T> protect<T>(Future<T> Function() body) async {
while (_busy != null) {
await _busy!.future;
}
final c = Completer<void>();
_busy = c;
try {
return await body();
} finally {
_busy = null;
c.complete();
}
}Why this is safe:
The proposed fix in the review is byte-identical to the existing code (including placing B5 — Mid-save TOCTOU loses user edit ❌ Edit isn't lostClaim: Network drop during a server-first save throws, catch sets Verification: When the catch fires (lines 768-776), it sets
The user sees the error banner; their typing remains in the form; they can retry on the next online attempt. No data loss. Why "fall through to offline saveDocument" doesn't apply: The server-first branch only runs when H1 — Cursor infinite loop on tie-skip ❌ Not infinite; proposed fix would regress the cursorClaim: When a non-final page is entirely tie-skipped, Verification (no infinite loop): The pagination uses Verification (proposed fix would regress the cursor): The tie-skip rule at line 535 is: if (modCmp == 0 && serverId.compareTo(cursorName) <= 0) continue;For a page to be entirely tie-skipped, every row must satisfy There is a real but mild concern: the persisted cursor stays put when every page in a session is tie-skipped. Fix would require a separate "scan progress" marker independent of the cursor. H3 — Pull engine stall guard dead from page 2 onwards ❌ Not a bugClaim: Verification: The engine doesn't use if (cursor.complete) {
next = Cursor(
modified: last['modified'] as String?,
name: last['name'] as String?,
complete: true,
);
}So For initial sync ( H5 — Unquoted table names → SQLite reserved keyword crash ❌ Not reachableClaim: Verification: String normalizeDoctypeTableName(String doctype, {String prefix = 'docs__'}) {
// ... lowercase, [^a-z0-9]+ → '_', collapse '_+', strip leading/trailing '_'
return '$prefix$body';
}Output is guaranteed to match H8 — UuidRewriter reads sibling before sibling is rewritten ❌ Not a bugClaim: For a Dynamic Link, the rewriter reads Verification: Frappe Dynamic Link semantics: The rewriter only enters the rewrite branch for if (field != null &&
(field.fieldtype == 'Link' || field.fieldtype == 'Dynamic Link')) {The sibling field is never rewritten, so reading it always returns the original string value regardless of iteration order. Summary
No code changes from this round. Happy to walk through any of the citations live if anything still seems off. |
deepak-dhwani
left a comment
There was a problem hiding this comment.
PR #36 — SDK 2.0 Offline-First — Round 3 (Comprehensive) Review
This review covers all 307 changed files across sync, database, services, concurrency, query, UI, and docs layers.
BLOCKER B1 — PushEngine.runOnce() has no concurrent guard
lib/src/sync/push_engine.dart — runOnce() has no _running flag or mutex. SyncController calls it from multiple triggers simultaneously (user save, connectivity
restore, syncNow()). Each concurrent call independently resets in_flight rows back to pending and re-fetches the outbox, producing duplicate dispatches. A
document in in_flight gets pushed twice in parallel; the second wins the server write but the first's ResponseWriteback overwrites server_name/modified with a
stale value.
Fix: Add a bool _running = false guard at the top of runOnce() — if (_running) return; _running = true; — with a finally { _running = false; } block, mirroring
the pattern already used in WriteQueue._kick().
BLOCKER B2 — WriteQueue completers resolve before the outer transaction commits
lib/src/concurrency/write_queue.dart — Inside db.transaction((txn) async { ... }), p.completer.complete(r) is called after each savepoint release, while the
enclosing sqflite transaction has not yet committed. Dart microtasks scheduled by complete() fire at the next await in the loop (the next SAVEPOINT command), so
await wq.submit(...) callers in PullEngine and PushEngine receive a success signal before the data is durable. If the outer transaction subsequently fails
(disk full, lock timeout, DB closed), callers have already advanced their cursors or marked outbox rows done — data loss without an observable error.
Fix: Move p.completer.complete(r) / p.completer.completeError(e, st) out of the transaction callback. Collect results in a list inside the transaction, then
resolve completers after db.transaction(...) returns.
BLOCKER B3 — Cursor never graduates to incremental for zero-row doctypes
lib/src/sync/cursor.dart — Cursor.toJson() returns null when isNull is true (modified == null && name == null). For a doctype that has no rows on the server,
the initial pull's result.rows.isEmpty breaks immediately, scratch remains Cursor.empty, and markComplete() produces a cursor that still satisfies isNull.
PullEngine then calls if (cursorJson != null) metaDao.setLastOkCursor(...) — the null check silently skips the write. The cursor is never persisted, and on
every restart the doctype is re-fetched as a full initial pull instead of as an incremental delta.
Fix: Change toJson() to if (isNull && !complete) return null; so a completed-but-empty cursor is serialized as {"complete": true}.
BLOCKER B4 — FilterParser LIKE values are not wildcard-escaped
lib/src/query/filter_parser.dart — In the 'like' / 'not like' case, the value is passed directly as a SQLite parameter without escaping % and . SQLite's LIKE
treats _ as "any single character", so a filter for "PRJ_001" also matches "PRJ0001", "PRJx001", etc. Frappe server-side escapes these before passing to
MariaDB; offline queries diverge from online results. This also allows a caller to inject arbitrary %/ patterns via user-facing filter inputs.
Fix: Before binding the value, escape it — value.replaceAll('\', '\\').replaceAll('%', '\%').replaceAll('', '\') — and append ESCAPE '\' to the
generated SQL fragment.
BLOCKER B5 — FormScreen online delete orphans the local docs__ row
lib/src/ui/form_screen.dart — In the !offlineEnabled branch, widget.api!.document.deleteDocument(...) is called to remove the record server-side, but the
corresponding hardDeleteDocument call that existed in the prior version was removed. The local docs__ row is never cleaned up. The document reappears
in list screens on next app load and persists until the next incremental pull overwrites it.
Fix: After the successful API call in the !offlineEnabled branch, call await widget.repository.hardDeleteDocument(doctype: widget.meta.name, mobileUuid:
widget.document!.localId).
BLOCKER B6 — BaseFieldStyle showLabel / showDescription defaults silently changed to false
lib/src/ui/widgets/fields/base_field.dart — showLabel and showDescription were changed from true to false as constructor defaults. Any consumer app that
constructs BaseFieldStyle() without these parameters now renders all fields with no labels and no descriptions. This is a silent breaking change with no
changelog entry and no deprecation path.
Fix: Revert defaults to true to maintain backward compatibility. If false is the intended default for the new internal FrappeFormBuilder style, introduce a
named constructor — e.g. BaseFieldStyle.hiddenLabels() — instead of changing the primary default.
HIGH H1 — ThreeWayMerge._eq(null, null) returns false
lib/src/sync/three_way_merge.dart — static bool _eq(Object? a, Object? b) returns false when either argument is null because of if (a == null || b == null)
return false. When both base and ours have null for a field (the common case for optional fields never set by the user), localChanged evaluates to !_eq(null,
null) == true, so every null-valued field is treated as a local edit. On every push-with-merge, those fields overwrite whatever the server set for them — a
silent partial PUT that discards server updates to optional fields.
Fix: Change the guard to if (a == null && b == null) return true; if (a == null || b == null) return false;.
HIGH H2 — ResponseWriteback child-row fallback ignores rows-affected count
lib/src/sync/response_writeback.dart — When matching a child row by mobile_uuid returns 0 rows (updated == 0), the fallback txn.update(..., where: 'parent_uuid
= ? AND parentfield = ? AND idx = ?', ...) runs but its return value is discarded. If that fallback also matches 0 rows (child row not yet written, or idx
mismatch), the child's server_name and modified are never updated. On the next sync, UuidRewriter sees server_name == null and blocks the row as
BlockedByUpstream.
Fix: Capture the return value of the fallback update and log a warning (or throw to surface as a sync error) when it is 0.
HIGH H3 — AttachmentPipeline upload and markDone are not atomic
lib/src/sync/attachment_pipeline.dart — uploader(...) and dao.markDone(...) are in the same try block. If the upload succeeds but markDone throws (DB error,
constraint violation), the retry loop re-uploads the file. Frappe's attach_files_to_document hook creates a second File doctype record pointing to the same
binary — a duplicate attachment indistinguishable from the original.
Fix: After a successful upload, call markDone in a separate try/catch. If markDone fails, surface a distinct error rather than retrying the upload.
HIGH H4 — PermissionService.syncFromApi() no-throw contract removed without notice
lib/src/services/permission_service.dart — The prior implementation wrapped the API call in try/catch and returned null on any exception. The new implementation
removes that catch and lets exceptions propagate. Any call site in FrappeSDK or a consumer app that checks result == null for failure (the documented contract)
will now crash on NetworkException, TimeoutException, or any HTTP error. This contract change is not mentioned in the changelog.
Fix: Either restore the catch-and-return-null behavior, or document the new throwing contract explicitly and audit all call sites that relied on the null-return
pattern.
HIGH H5 — FrappeSDK login() can trigger concurrent _initialMetaAndDataSync calls
lib/src/sdk/frappe_sdk.dart — login() fires unawaited(_initialMetaAndDataSync()). The _initInFlight guard lives inside initialize(), not inside
_initialMetaAndDataSync() itself. Two rapid login() calls — or a login() racing an already-in-progress initialize() — produce two concurrent
_initialMetaAndDataSync() executions, both writing to doctype_meta and sdk_meta tables simultaneously without a write-order guarantee.
Fix: Lift the _initInFlight guard into _initialMetaAndDataSync() itself (or introduce a dedicated _metaSyncInFlight completer) so concurrent callers coalesce
onto one execution regardless of entry point.
HIGH H6 — MetaMigration backfill holds SQLite write lock for full duration
lib/src/services/meta_migration.dart — A backfill migration that iterates all rows and applies column updates in a loop does so inside a single write
transaction. For large datasets this holds the SQLite write lock for seconds, blocking any concurrent write from WriteQueue, PullApply, or the outbox. On first
launch after upgrade, all background sync operations queue and may timeout.
Fix: Batch the backfill in chunks of 200–500 rows with a commit between batches.
HIGH H7 — resolveConflict deletes the outbox row before applying the server version
lib/src/services/offline_repository.dart — resolveConflict removes the outbox conflict row first, then attempts to apply the server snapshot to the local docs__
table. If the apply step throws, the conflict row is gone with no recovery path — the document is stuck with pre-conflict local data, no outbox entry, and a
synced status the next pull will not revisit.
Fix: Reverse the order: apply the server snapshot first, then delete the outbox row only on success.
HIGH H8 — deleteDocument swallows child cascade failures
lib/src/services/offline_repository.dart — The child cascade loop logs 'child cascade delete failed' and continues. A partial delete (parent row removed, some
child rows remaining) is reported to the caller as success. Orphaned child rows with a dangling parent_uuid cause constraint violations or ghost rows on the
next pull.
Fix: Collect failures across all children; if any fail, either rethrow after attempting all (so the caller can surface the error) or re-insert the parent row
and fail the whole operation atomically.
HIGH H9 — outbox_row hard-casts nullable columns without null check
lib/src/database/daos/outbox_dao.dart — Column reads that use as String or as int on nullable database columns will throw a Dart cast exception on any row where
that column is NULL (e.g. newly inserted rows before server round-trip). This silently crashes the outbox drain for the affected row rather than surfacing a
recoverable error.
Fix: Use as String? (or ?? defaultValue) for all nullable column reads and handle null explicitly.
MEDIUM M1 — ConnectivityWatcher stream errors are unhandled
lib/src/concurrency/connectivity_watcher.dart — Errors from the connectivity plugin's stream (e.g. permission revoked on Android 13+) are not caught. An
unhandled stream error propagates downstream and surfaces as an unhandled exception in the isolate.
Fix: Add .handleError((e) { debugPrint('ConnectivityWatcher: $e'); }) on the stream transformation.
MEDIUM M2 — ConcurrencyPool._drain() in finally block can mask the original error
lib/src/concurrency/concurrency_pool.dart — If the awaited task throws and _drain() also throws inside finally, the drain error replaces the original error in
the stack trace.
Fix: Wrap _drain() inside finally in its own try/catch that logs the drain error separately without rethrowing.
MEDIUM M3 — FormBuilder dataForDepends evaluates against completeFormData, not live formValues
lib/src/ui/widgets/form_builder.dart — dataForDepends is built from completeFormData (assembled before the visibility pass), not from formValues (the live
user-entered map). For fields whose visibility depends on other fields that are themselves hidden, the evaluation uses a snapshot that may still include stale
values from hidden fields, causing sections to remain visible when they should be collapsed.
Fix: Use formValues as the basis for dataForDepends instead of completeFormData.
MEDIUM M4 — OfflineTransitionService._forceExited set after c.complete()
lib/src/services/offline_transition_service.dart — _forceExited = true is assigned after c.complete(). Dart's microtask scheduler fires the completer's
listeners before the assignment, creating a brief window where a re-entrant caller sees _forceExited == false and enters the transition again.
Fix: Set _forceExited = true before calling c.complete().
MEDIUM M5 — SessionUserService stream emits before DB write is confirmed
lib/src/services/session_user_service.dart — The service emits the new user on its stream before await db.update(...) has returned. A listener that immediately
reads from the DB may observe the pre-write value.
Fix: Await the DB write to completion before emitting on the stream.
MEDIUM M6 — Bare print() calls in production paths
lib/src/sdk/frappe_sdk.dart — Several print(...) statements appear in production code paths unconditionally, including in release builds. These may leak
internal state (doctype names, auth status, error messages) to crash reporters or log aggregators.
Fix: Replace with debugPrint(...) or gate with if (kDebugMode).
MEDIUM M7 — RunFn typedef accidentally exported from the public barrel
lib/frappe_mobile_sdk.dart — show SyncController, ConflictAction, DeleteCascadePlan, RunFn — RunFn is an internal implementation typedef. Exporting it widens
the public API surface with a symbol consumers cannot meaningfully use, and any future rename becomes a breaking change.
Fix: Remove RunFn from the show clause on that export line.
MEDIUM M8 — pubspec.yaml specifies meta: any
pubspec.yaml — A published library should constrain its dependencies. meta: any allows the resolver to pick any version including future majors, producing
version-skew conflicts for downstream consumers who pin their own meta.
Fix: Replace with a bounded constraint, e.g. meta: ">=1.9.0 <3.0.0".
DOC BLOCKER D1 — sdk.unifiedResolver does not exist (4 locations)
doc/release-2.0/breaking-changes.md, doc/release-2.0/whats-new.md, doc/OFFLINE_FIRST.md — The correct accessor on FrappeSDK is sdk.resolver (UnifiedResolver get
resolver). Every code sample calling sdk.unifiedResolver.resolve(...) throws NoSuchMethodError at runtime. There are 4 occurrences across these files.
Fix: Replace all 4 instances of sdk.unifiedResolver with sdk.resolver.
DOC BLOCKER D2 — whats-new.md sample calls createDocument (renamed to saveDocument)
doc/release-2.0/whats-new.md — The sample calls sdk.offlineRepository.createDocument(...). That method was renamed to saveDocument in this PR. The sample
produces a compile error as written.
Fix: Replace createDocument with saveDocument in the sample.
DOC BLOCKER D3 — cancelInitialSync() documented but not implemented
doc/OFFLINE_FIRST.md — await ctrl.cancelInitialSync() is shown as a public API. SyncController has no such method. A consumer following this sample gets a
compile error.
Fix: Either implement cancelInitialSync() or remove the section before release.
DOC BLOCKER D4 — MigrationBlockedScreen documented but not exported
doc/OFFLINE_FIRST.md — MigrationBlockedScreen is shown as a widget consumers embed. It is not exported from frappe_mobile_sdk.dart. Consumers cannot use it
without importing internal paths.
Fix: Add MigrationBlockedScreen to the barrel export, or remove the section from docs.
DOC BLOCKER D5 — OFFLINE_FIRST.md describes the dropped documents table as active
doc/OFFLINE_FIRST.md — "Two-store design — every save goes to the legacy JSON blob store (documents) and a normalized per-doctype docs__
documents table was dropped in the v2→v3 migration. This description is false for SDK 2.0 and will mislead integrators.
Fix: Remove the two-store description and replace it with the current single-store (docs__) design.
What's done well
- SyncMutex while-loop re-check — the while (_busy != null) pattern correctly serializes concurrent waiters with no gap; confirmed across all prior rounds.
- PullPageFetcher incremental cursor — complete: true is explicitly preserved in the incremental branch, so the stall guard fires correctly on every page.
- UuidRewriter Dynamic Link — reading the sibling field for the doctype name string before rewriting the link value is correct; the sibling is never itself a
UUID. - Cursor stall guard scope — guarding only cursor.complete == true pages is correct; initial offset-based pagination cannot stall.
- UPDATE-then-INSERT-OR-IGNORE — correct SQLite <3.24 compatibility workaround for the absence of ON CONFLICT DO UPDATE.
- AtomicWipe rethrow on primary .db deletion failure — correct; a failed primary DB deletion is not silently swallowed.
- ThreeWayMerge deep equality for List/Map — the recursive _eq for containers correctly fixes identity-comparison false positives on decoded JSON.
- kDebugMode gate on diagnostic prints — sync engine diagnostic output correctly gated.
- OutboxDao savepoint pattern — using named SQLite savepoints inside an outer transaction for per-task rollback isolation is the right approach.
Verdict
Six blockers and five doc blockers must be resolved before this ships as a public SDK. The concurrent-guard gaps (B1, B2, H5) and merge
correctness issues (H1, B3) are highest priority — they affect data integrity for every consumer app.
Adds the offline-first data layer to the SDK. The SDK now maintains two parallel stores per document — the legacy
documentsblob (push source-of-truth) and per-doctypenormalized tables
docs__<doctype>(offline read path) — and ships pull/push engines, a unified read path, conflict handling, and the supporting UX.What's in this PR
Schema & migration
outbox,pending_attachments,sdk_meta— DDL is idempotent (IF NOT EXISTS/OR IGNORE) so an_onUpgradere-entry after a partial failure no longercorrupts the migration
docs__<doctype>parent and child schemas with__norm/__is_localcolumnsdocumentstable behind a feature gateMeta service & dependency graph
BulkWatermarkProbe,MetaDiffer,MetaMigration,MetaService.ensureUpToDate()ClosureBuilderfans out level-by-level meta fetch in parallelDependencyGraphBuilderfor prefetch orderingPull engine
SyncService.pullSync— phase-tracked cursors (initial/resume/incremental), tie-skip on resume, look-ahead pagination (page N+1 only fires when page N was full)SyncService.pullSyncMany— bounded-concurrency (4 workers) closure drainpull_apply.dart— server-time-aware conflict detection; a normal offline edit no longer surfaces as a conflict on the next pull (compares stored vs. incoming servermodified)Push engine & conflict handling
PushEngine,PayloadAssembler,ResponseWriteback,ThreeWayMerge,TierComputermobile_uuidreferences to server names before push (dependency-aware, parent-first)bulkGetWithChildrenreplaces N+1 per-name GETs with a singlemobile_sync.get_docs_with_childrenround-trip; graceful fallback to per-name (bounded at 20 concurrent) on404 from older
mobile_controldeploymentsidempotency_strategyandattachment_pipelineWrite path & repository
LocalWritermirrors saves into per-doctype tables in a single transaction; reusesmobile_uuidfrom payload or generates a fresh one; splits child tables into their owndocs__<child>tables; silently handles tables that don't yet exist (initial sync hasn't run)OfflineRepositorydelegates parent + child writes throughLocalWriter; preservesmobile_uuidfrom payload (prevents UUID mismatch after offline → online)UnifiedResolveris the single read path for offline queries;_translateParentFiltersmaps Frappe's virtualparentcolumn toparent_uuidfor child-table queries;LinkOptionServiceconsumes the resolver instead of the API clientUX & lifecycle
FrappeSDKperforms a single connectivity probe at boot; offline launch skips bootstrap to avoid the ~4-minute HTTP timeout cascadeFormScreenconnectivity gate before save; treatsserverId == nullas INSERT (offline-created docs going online for the first time)Other changes riding along
LinkFilterBuilderhook — row/parent-aware Link filter overrides wired SDK → FormBuilder → fieldsLinkOptionServicecaching simplification + form field label handling.releaserc/update-version.pyrestored to develop stateServer dependency
bulkGetWithChildrencallsmobile_sync.get_docs_with_children(whitelisted endpoint shipped in themobile_controlFrappe app). Older deployments fall back to per-name/api/resource/<doctype>/<name>GETs automatically.Test plan
Unit + integration tests added under
test/(~120 new test files):flutter test test/database/— schema, DAOs, table naming, normalize_for_search, v1→v2 migrationflutter test test/sync/— pull_apply (server-advanced conflict check), pull_engine, pull_page_fetcher, push_engine, payload_assembler, three_way_merge, tier_computer,uuid_rewriter, attachment_pipeline, response_writeback
flutter test test/services/— local_writer, offline_repository (per-doctype + LocalWriter wiring), bulk_watermark_probe, closure_builder, meta_differ, meta_migration,link_option_service offline path, session_user_service
flutter test test/query/— filter_parser (basic, like, null parity, OR, safety, timespan), unified_resolver, link_decorator, frappe_timespanflutter test test/concurrency/— concurrency_pool, write_queue, isolate_parser, connectivity_watcher, device_tierflutter test test/ui/— sync_status_bar, sync_progress_screen, sync_errors_screen, delete_cascade_prompt, force_logout_confirm, logout_guard_dialog,document_list_filter_chip
flutter analyze --no-fatal-infospassesMigration notes for consumers
OfflineRepositoryconstructor now requires aLocalWriter. SDK builds it internally fromMetaService; consumers using the publicFrappeSDKneed no changes.LinkOptionServiceconstructor signature changed from(client)to(resolver, metaResolver). Internal — consumers usingFrappeSDK.linkOptionsare unaffected.mobile_sync.get_docs_with_children(inmobile_control≥ the version that ships this method). Falls back gracefully on 404 if absent.Out of scope / follow-up
MigrationBlockedScreenfrom the publicfrappe_mobile_sdk.dartsurfaceSyncController.acceptDeleteCascade()AuthService.logout()switch fromclearAllData()toAtomicWipe.wipe()ConnectivityWatcherinstantiation inFrappeSDK_initialMetaAndDataSync()between SDK and consumer appsdep_graph.linkEdgeCountByField()constant-1return (affects prefetch priority order only)