add dropcontact integration - GDPR-compliant B2B enrichment #78

Open
jackulau wants to merge 1 commit into wespreadjam:main from jackulau:16

Closes #16

Summary

Adds a dropcontact_enrich node that integrates the Dropcontact API for GDPR-compliant B2B contact enrichment. Dropcontact is asynchronous: POST /batch returns a request_id, and the client must poll GET /batch/{request_id} until enrichment results are ready.

The integration follows the existing Apify pattern for in-node polling and the existing Slack/WordPress patterns for credential handling, field mapping, and testing.

Acceptance Criteria (from Issue #16)

  • Credential definition
  • Enrich operation
  • Async polling for results
  • Zod schemas
  • Unit tests

What's in this PR

New files

| File | Purpose |
| --- | --- |
| packages/nodes/src/integrations/dropcontact/credentials.ts | dropcontactCredential: apiKey type, X-Access-Token header auth, via defineApiKeyCredential from @jam-nodes/core |
| packages/nodes/src/integrations/dropcontact/schemas.ts | DropcontactEnrichInputSchema and DropcontactEnrichOutputSchema: fully typed with z.infer<>, no z.any() |
| packages/nodes/src/integrations/dropcontact/enrich.ts | dropcontactEnrichNode via defineNode: POST + polling loop, explicit field mapping |
| packages/nodes/src/integrations/dropcontact/index.ts | Barrel export for the integration |
| packages/nodes/src/integrations/dropcontact/__tests__/dropcontact.test.ts | 42 vitest tests |

Modified files

| File | Change |
| --- | --- |
| packages/core/src/types/node.ts | Adds dropcontact?: { apiKey: string } to the NodeCredentials interface |
| packages/nodes/src/integrations/index.ts | Re-exports the new integration |
| packages/nodes/src/index.ts | Re-exports dropcontactEnrichNode, schemas, and credential; adds node to builtInNodes |

Design decisions

Polling lives inside the node, not the engine

Dropcontact's async submission → poll contract is API-specific, not a generic cross-cutting concern. The CLAUDE.md architectural note says retry/cache/timeout/rate-limiting belong in the execution engine's ExecutionConfig, but polling for an API-specific completion signal is different: it's the API's normal flow. This matches the existing Apify integration (apify/run-actor.ts::pollUntilFinished).

Polling terminates on data[0] presence, NOT success === true alone

This is the highest-risk correctness issue in the integration. Dropcontact returns success: true on BOTH:

  • the initial POST /batch (meaning "submission accepted"), and
  • the final GET /batch/{request_id} when results are ready (meaning "enrichment complete")

A naive polling loop that exits on success === true would terminate on the first poll before any data is available. The implementation checks body.success === true && Array.isArray(body.data) && body.data.length > 0 as the terminal success condition, with a dedicated regression test (poll does NOT terminate on success:true with missing data).
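The terminal condition described above can be written as a small predicate. This is a minimal sketch: the `PollBody` interface and the function name `isEnrichmentComplete` are hypothetical and chosen for illustration; only the boolean expression itself comes from the description.

```typescript
// Hypothetical shape of a GET /batch/{request_id} body (assumed for this sketch).
interface PollBody {
  success?: unknown;
  data?: unknown;
}

// True only when success is reported AND at least one result row is present.
// success: true on its own is ambiguous, because the submit response also carries it.
function isEnrichmentComplete(body: PollBody): boolean {
  return body.success === true && Array.isArray(body.data) && body.data.length > 0;
}
```

With this predicate, a body of `{ success: true }` or `{ success: true, data: [] }` keeps the loop polling, which is the case the regression test guards.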

Fetch-first polling order

The loop polls immediately on attempt 0 and only sleeps between attempts (skipping the sleep after the last attempt). This matches Apify's pattern and ensures the timeout input reflects real wall-clock semantics — a timeout: 1 sends exactly one poll, rather than sleeping 5 seconds first with no budget left.
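A fetch-first loop of this kind might look like the sketch below. The names `pollFetchFirst`, `PollResult`, and the injectable `sleep` parameter are hypothetical, and deriving the attempt count as `ceil(timeout / interval)` with a minimum of one is an assumption made for illustration, not a statement about the PR's code.

```typescript
const POLL_INTERVAL_MS = 5000;

type PollResult<T> = { done: true; value: T } | { done: false };

// Polls immediately, then sleeps only BETWEEN attempts (never after the last one).
// Returns null when the attempt budget is exhausted.
async function pollFetchFirst<T>(
  pollOnce: () => Promise<PollResult<T>>,
  timeoutSeconds: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T | null> {
  // Assumption: the attempt budget is derived from the timeout, with at least one poll.
  const maxAttempts = Math.max(1, Math.ceil((timeoutSeconds * 1000) / POLL_INTERVAL_MS));
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await pollOnce();
    if (result.done) return result.value;
    // Skip the sleep after the final attempt so no budget is wasted.
    if (attempt < maxAttempts - 1) await sleep(POLL_INTERVAL_MS);
  }
  return null;
}
```

Under these assumptions, a 1-second timeout yields exactly one poll and no sleep. Injecting `sleep` also lets tests avoid real waiting without fake timers.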

POLL_INTERVAL_MS = 5000 is a module constant

Not a user input — following the Apify pattern. Tests use vi.useFakeTimers() + vi.runAllTimersAsync() to fast-forward.

Output mapping is explicit field-by-field

mapEnrichedRecord uses a private getString(obj, key) helper that returns null for missing keys, non-object values, or non-string field values. This:

  • defends against the API returning missing keys without using z.nullish() (which would loosen the contract)
  • implicitly strips extra undocumented fields from the Dropcontact response
  • avoids any as any or z.any() (per CLAUDE.md)
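The helper and mapping described above could be sketched as follows. The body of `getString` follows the stated behavior (null for missing keys, non-object values, or non-string field values); the `EnrichedContact` shape and the four fields shown are an illustrative subset assumed for this example, not the PR's full output schema.

```typescript
// Returns the string at obj[key], or null if obj is not an object,
// the key is missing, or the value is not a string.
function getString(obj: unknown, key: string): string | null {
  if (typeof obj !== "object" || obj === null) return null;
  const value = (obj as Record<string, unknown>)[key];
  return typeof value === "string" ? value : null;
}

// Hypothetical subset of the enriched output (field list assumed for illustration).
interface EnrichedContact {
  email: string | null;
  firstName: string | null;
  lastName: string | null;
  linkedinUrl: string | null;
}

// Explicit field-by-field mapping: anything not named here is dropped.
function mapEnrichedRecord(raw: unknown): EnrichedContact {
  return {
    email: getString(raw, "email"),
    firstName: getString(raw, "first_name"),
    lastName: getString(raw, "last_name"),
    linkedinUrl: getString(raw, "linkedin"),
  };
}
```

Because the result object is built key by key, undocumented fields in the response never reach the output, and no cast to `any` is needed.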

Output nullability is split

  • requestId and success are non-null (control fields — always present on a valid result)
  • creditsLeft is .nullable() (Dropcontact doesn't always include it)
  • Enriched contact fields are all .nullable() (the API may resolve some fields but not others — a partial match is a valid, successful result)

Single contact per call

Issue #16's input shape describes one contact per dropcontactEnrich invocation. Batching multiple contacts (with per-row error correlation) is deferred to a future issue if real workflows need it.

Implementation details

HTTP retry and error handling

Both the POST and the polling GETs use fetchWithRetry from packages/nodes/src/utils/http.ts with { maxRetries: 3, backoffMs: 1000, timeoutMs: 30000 } (same config used by Slack and Apify). That helper handles:

  • 401/403 → throws immediately, no retry (surfaced as { success: false, error })
  • 429 with Retry-After → retries with the documented delay
  • 5xx → retries with exponential backoff
  • Network errors → retries with backoff
  • Request timeout via AbortController
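The status handling listed above can be summarized as a pure decision function. This is a sketch only: `decideRetry` and `RetryDecision` are hypothetical names, not part of `fetchWithRetry`, and two details are assumptions: the backoff doubles on each attempt, and other non-retryable statuses are handed back to the caller unchanged.

```typescript
type RetryDecision =
  | { action: "return" }                  // hand the response to the caller
  | { action: "fail" }                    // stop immediately, no retry
  | { action: "retry"; delayMs: number }; // wait, then try again

interface RetryOptions {
  maxRetries: number;
  backoffMs: number;
}

function decideRetry(
  status: number,
  attempt: number, // 0-based count of retries already performed
  opts: RetryOptions,
  retryAfterSeconds?: number,
): RetryDecision {
  // Auth failures are never retried.
  if (status === 401 || status === 403) return { action: "fail" };
  const retryable = status === 429 || status >= 500;
  if (!retryable) return { action: "return" };
  if (attempt >= opts.maxRetries) return { action: "fail" };
  // Honor Retry-After on 429 when the server supplies it.
  if (status === 429 && retryAfterSeconds !== undefined) {
    return { action: "retry", delayMs: retryAfterSeconds * 1000 };
  }
  // Assumed exponential schedule: backoffMs, 2x, 4x, ...
  return { action: "retry", delayMs: opts.backoffMs * Math.pow(2, attempt) };
}
```

With `{ maxRetries: 3, backoffMs: 1000 }`, this schedule would wait 1 s, 2 s, and 4 s on successive 5xx responses before giving up.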

Failure modes

Every failure path returns { success: false, error: "..." } — the executor never throws. Credit exhaustion (success: false on POST) is surfaced via the reason field. Polling timeout surfaces as Dropcontact polling timed out after Xs.
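One way to guarantee the "never throws" property is to funnel all work through a wrapper that converts exceptions into the failure shape. The sketch below is illustrative: `runSafely`, `NodeResult`, and `pollingTimeoutMessage` are hypothetical names, and the `output` key on the success branch is an assumption. Only the `{ success: false, error }` shape and the timeout message wording come from the description.

```typescript
type NodeResult<T> =
  | { success: true; output: T }
  | { success: false; error: string };

// Runs the work and converts any thrown value into a failure result.
async function runSafely<T>(work: () => Promise<T>): Promise<NodeResult<T>> {
  try {
    return { success: true, output: await work() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: message };
  }
}

// Builds the timeout message in the format quoted above.
function pollingTimeoutMessage(timeoutSeconds: number): string {
  return `Dropcontact polling timed out after ${timeoutSeconds}s`;
}
```

Callers then branch on `result.success` rather than wrapping every node invocation in their own try/catch.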

Credentials

  • Adds dropcontact?: { apiKey: string } to the NodeCredentials type in @jam-nodes/core
  • The executor checks typeof apiKey !== 'string' || apiKey.length === 0 — missing credential object, missing dropcontact key, and empty-string apiKey all short-circuit to a clear error before any network request
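The three credential failure paths collapse into one guard. In this sketch the `NodeCredentials` interface is trimmed to the single field this PR adds, and `resolveApiKey` is a hypothetical helper name; the `typeof` and length check mirrors the one quoted above.

```typescript
// Trimmed to the field added by this PR; the real interface has other members.
interface NodeCredentials {
  dropcontact?: { apiKey: string };
}

// Returns the key when usable, or null for a missing credentials object,
// a missing dropcontact entry, or an empty-string apiKey.
function resolveApiKey(credentials: NodeCredentials | undefined): string | null {
  const apiKey: unknown = credentials?.dropcontact?.apiKey;
  if (typeof apiKey !== "string" || apiKey.length === 0) return null;
  return apiKey;
}
```

A null return is where the executor would short-circuit with an error result, before any network request is made.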

Submitted POST body

buildSubmitBody only includes fields the user supplied, so unset fields are never sent to Dropcontact as undefined. Field names are translated to Dropcontact's snake_case convention (firstName → first_name, linkedinUrl → linkedin, etc.).
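The filtering and renaming step can be sketched with a lookup table. `buildContactRecord`, `EnrichInput`, and `FIELD_MAP` are hypothetical names for illustration. The `firstName` and `linkedinUrl` mappings are the two named in this description; the `email` and `lastName` entries are assumptions, and the envelope that wraps the record in the actual POST body is left out.

```typescript
// Hypothetical subset of the node's input (assumed for illustration).
interface EnrichInput {
  email?: string;
  firstName?: string;
  lastName?: string;
  linkedinUrl?: string;
}

// camelCase input key -> Dropcontact field name.
const FIELD_MAP: Array<[keyof EnrichInput, string]> = [
  ["email", "email"],          // assumed mapping
  ["firstName", "first_name"],
  ["lastName", "last_name"],   // assumed mapping
  ["linkedinUrl", "linkedin"],
];

// Copies only the fields the caller actually supplied, under their API names.
function buildContactRecord(input: EnrichInput): Record<string, string> {
  const record: Record<string, string> = {};
  for (const [from, to] of FIELD_MAP) {
    const value = input[from];
    if (value !== undefined) record[to] = value;
  }
  return record;
}
```

Since unset keys are never assigned, the serialized JSON contains no `undefined` placeholders and no keys the user did not provide.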

Test plan

All tests are offline (mocked fetch via vi.stubGlobal) — no real Dropcontact API calls during test runs.

Test commands

# Run just the dropcontact tests
cd packages/nodes && npx vitest run src/integrations/dropcontact/__tests__/dropcontact.test.ts

# Run the full nodes test suite
npm test --workspace=@jam-nodes/nodes

Results

  • Dropcontact tests: 42 / 42 passing
  • Full nodes suite: 237 / 237 passing (195 pre-existing + 42 new, zero regressions)
  • Core typecheck: passes
  • Nodes typecheck: unchanged error count vs main (pre-existing errors in slack/* and google-sheets/* are not introduced or affected by this PR)

Coverage breakdown

| Describe block | Tests | Covers |
| --- | --- | --- |
| dropcontact credential | 4 | Metadata, schema accepts/rejects apiKey (valid, empty, missing) |
| dropcontact schemas | 11 | Input validation (all fields, defaults, invalid email, empty email, invalid language, zero timeout, negative timeout), output validation (accepts nulls, rejects missing/null requestId, rejects missing success) |
| dropcontactEnrichNode | 24 | Metadata; missing/empty/wrong credentials (3 distinct paths); POST URL/headers/body shape; POST body filtering; polling to data[0]; polling regression for success:true without data; success:false termination; credit exhaustion on POST; URL encoding of request_id and api_key; snake_case → camelCase mapping; missing fields → null; extra fields stripped; empty data:[]; all-null data[0]; polling timeout; 401; 5xx; network error mid-poll; 429 during polling; non-string field values → null; request_id missing from 2xx success; creditsLeft preservation |
| executeNode integration | 3 | Full engine path (schema validation, timeout, happy path) |

Implements the Dropcontact enrich node per issue wespreadjam#16. Submits a single
contact to Dropcontact's async /batch endpoint and polls
/batch/{request_id} until enrichment results are ready or timeout expires.

- dropcontact apiKey credential (X-Access-Token header)
- dropcontactEnrichNode with Zod input/output schemas (no z.any)
- Fetch-first polling loop (POLL_INTERVAL_MS = 5000), matching apify pattern
- Polling terminates on data[0] presence (not just success:true, which
  Dropcontact returns on both submit-accepted and enrichment-complete)
- Explicit field-by-field snake_case -> camelCase mapping with null defaults
- Adds dropcontact to NodeCredentials interface in @jam-nodes/core
- Registers in builtInNodes and integrations barrel
- 42 vitest tests covering credentials, schemas, executor happy path,
  polling regression, timeout, auth errors, 5xx, 429, partial fields,
  and executeNode engine integration
Development

Successfully merging this pull request may close these issues.

[Integration] Dropcontact - GDPR-compliant B2B enrichment