feat: cache autoconfig in store, init from store #593
LD-Sfalzon wants to merge 33 commits into launchdarkly:v8 from
Conversation
- Add initFromStoreFirst, cacheKey, and cacheEncryptionKey to AutoConfig config
- Cache an encrypted AutoConfig snapshot in Redis or DynamoDB; load it on startup when LaunchDarkly is unreachable so Relay can serve from the last known config
- Update the cache whenever the stream sends a full put
- Encryption key is optional: it defaults to the AutoConfig key (SHA-256 derived)
- Document the new options in configuration.md

Made-with: Cursor
…eletes correctly
- Add MessageReceiver.Seed() to record cached envs/filters without emitting actions
- Add StreamManager.SeedFromPutContent() so envReceiver/filterReceiver stay in sync with the cache
- Load from store returns PutContent; relay applies it to the handler, then seeds the stream before Start()
- Fixes: cached envs no longer skip updates (ActionUpdate), and stale envs are removed by Retain

Made-with: Cursor
- Close autoConfigCache in Relay.Close() to avoid leaking Redis/DynamoDB connections
- Validate DYNAMODB_TABLE when InitFromStoreFirst is used with DynamoDB (the cache uses the global table)
- Use strings.TrimSpace for the cache key to match validation (removes the custom trimCacheKey)

Made-with: Cursor
When REDIS_TLS=true but URL is redis://, set TLSConfig from config so the cache store uses TLS (matches bigsegments/store_redis.go behavior). Made-with: Cursor
Replace the single-blob cache design with per-item storage matching the patterns in go-server-sdk-dynamodb and go-server-sdk-redis-redigo. This avoids DynamoDB's 400KB item size limit for large customers. Key changes:
- Store interface changed from Get/Set []byte to GetAll/SetAll PutContent
- Redis: uses a hash (HGETALL/HSET) with MULTI/EXEC transactions
- DynamoDB: individual items per env/filter with Query + BatchWriteItem, plus a checkSizeLimit guard matching the SDK pattern
- Cached data now flows through the normal handlePut path via StreamManager.ApplyCachedPut, eliminating SeedFromPutContent, the race condition it caused, and the separate applyPutContentToHandler code path
- StreamManager owns the cache write directly, removing the PutContentReceiver interface, the cachingAutoConfigHandler wrapper, and the forwarding method on ProjectRouter
StreamManager now owns the full cache lifecycle: it reads from the cache on Start() before connecting the stream, and writes after each PUT.
- Cache interface (GetAll + SetAll + Close) defined in the autoconfig package
- noopStore implements the interface as a null object, returned by NewStore when no backing store is configured
- Eliminates nil checks: StreamManager always has a valid cache
- Removes ApplyCachedPut; cache read/apply is internal to Start()
- relay.go simplified: it just creates the cache and passes it through
Derive the AES-256 key via SHA-256 from whatever string the user provides, rather than requiring exactly 32 bytes or base64-encoded 32 bytes. Update docs to remove the length restriction.
The cacheKey is user-provided specifically to namespace cache entries. Use it as-is for the Redis hash key and DynamoDB partition key rather than prepending ld:autoconfig:. Suffixes can be added later if we need to store additional data types under the same key.
```go
		}
	}
	return nil
}
```
DynamoDB batch write silently drops unprocessed items
Medium Severity
The batchWrite method discards the BatchWriteItemOutput response (assigned to _), ignoring UnprocessedItems. DynamoDB's BatchWriteItem API can return partial successes under throttling or provisioned throughput limits. Unprocessed items are silently lost, potentially leaving the cache in an incomplete state. On next cold startup, missing environments or filters would not be served.
It isn't silently dropped; we issue a log statement. Continuing on failure is better in this case, since the cache serves solely as a remediation measure, and having some environments is better than no environments.
If Redis is temporarily down at startup, the cache should degrade gracefully rather than preventing the relay from starting. The ping was counterproductive to the resilience goal of the feature.
Log and continue instead of aborting if a single batch of 25 items fails. Partial cache data is better than none for resilience.
Add Upsert and Delete methods to the Cache interface so individual environment/filter changes from PATCH and DELETE stream events are persisted to the cache immediately. Previously only PUT events updated the cache, so changes between PUTs would be lost on restart. Also consolidates the env:/filter: key prefix constants into a shared cacheField helper used by both Redis and DynamoDB stores.
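A consolidated key-prefix helper of the kind described might look like this minimal sketch; the function and prefix names are illustrative, since the PR's exact constants aren't shown here:

```go
package main

import "fmt"

// cacheField builds the namespaced field (Redis hash field / DynamoDB
// sort key) for a cached item, e.g. "env:<id>" or "filter:<id>",
// so both store implementations agree on the layout.
func cacheField(kind, id string) string {
	return kind + ":" + id
}

func main() {
	fmt.Println(cacheField("env", "my-env"))       // env:my-env
	fmt.Println(cacheField("filter", "my-filter")) // filter:my-filter
}
```

A PATCH event would then call the store's Upsert with the appropriate field, and a DELETE would remove it, keeping the cache current between full PUTs.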
Setting AUTO_CONFIG_CACHE_KEY is sufficient to enable AutoConfig caching. The separate AUTO_CONFIG_INIT_FROM_STORE_FIRST boolean was redundant — if you set a cache key, you want caching.
Add a Persist flag to PutContent. Data from the live stream is marked Persist=true so handlePut writes it to cache. Data restored from cache on startup has Persist=false (the default), so it flows through handlePut without a redundant cache write.
Each cached environment/filter is now serialized inside a CachedItem wrapper with kind, modelVersion, and data fields. On read, items with an unrecognized model version are skipped with a warning, giving us a clean path to handle format migrations in the future.
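The versioned-wrapper idea can be sketched with `encoding/json`; field and function names here are illustrative, not necessarily the PR's exact ones:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

const currentModelVersion = 1

// CachedItem wraps each serialized environment/filter so the on-disk
// format can evolve without breaking old readers.
type CachedItem struct {
	Kind         string          `json:"kind"`
	ModelVersion int             `json:"modelVersion"`
	Data         json.RawMessage `json:"data"`
}

// decodeItem skips entries written in an unrecognized format, with a
// warning, rather than failing the whole cache load.
func decodeItem(raw []byte) (CachedItem, bool) {
	var item CachedItem
	if err := json.Unmarshal(raw, &item); err != nil {
		log.Printf("skipping undecodable cache item: %v", err)
		return CachedItem{}, false
	}
	if item.ModelVersion != currentModelVersion {
		log.Printf("skipping cache item with unrecognized model version %d", item.ModelVersion)
		return CachedItem{}, false
	}
	return item, true
}

func main() {
	good := []byte(`{"kind":"env","modelVersion":1,"data":{"id":"abc"}}`)
	stale := []byte(`{"kind":"env","modelVersion":9,"data":{}}`)
	_, ok1 := decodeItem(good)
	_, ok2 := decodeItem(stale)
	fmt.Println(ok1, ok2) // true false
}
```

Bumping `currentModelVersion` on a future format change then makes old relays ignore new-format items instead of misparsing them.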
Each cache store now holds an internal context that is cancelled on Close(), terminating any in-flight operations. The caller's context is combined with the store's context using context.AfterFunc so that either cancellation terminates the operation. The mergeContext helper is shared between Redis and DynamoDB implementations.
```diff
@@ -0,0 +1,57 @@
+package autoconfigcache
```
@launchdarkly/team-product-security mind taking a look at this new addition and let me know if it seems acceptable?
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit b450cc7.


Requirements
Related issues
If a relay proxy is unable to connect to the LD streaming service to retrieve auto config, the persistent cache does not provide resiliency, and it cannot be used for auto scaling during an LD incident.
Describe the solution you've provided
Auto config can now be saved to the persistent cache, encrypted by default with the auto config key (configurable).
I have tested this with Valkey only.
Describe alternatives you've considered
Provide a clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context about the pull request here.
Note
Medium Risk
Adds a new encrypted persistence layer and startup race between cache and SSE stream, affecting initialization and consistency when AutoConfig is enabled. Risk centers on cache correctness, encryption/key management, and DynamoDB/Redis write behaviors under partial failures.
Overview
Adds persistent, encrypted caching for AutoConfig data so Relay can start serving environments/filters from Redis/Valkey or DynamoDB when `AUTO_CONFIG_CACHE_KEY` is set.

On startup, `StreamManager` now races an async cache read against establishing the SSE stream; cached content is applied immediately if it wins, and the live stream `put` becomes authoritative and cancels any in-flight cache read. Subsequent `put` events write a full snapshot to the cache, while `patch`/`delete` events incrementally upsert/delete cached items.

Introduces `internal/autoconfigcache` with Redis and DynamoDB store implementations and AES-GCM encryption (key derived from `AUTO_CONFIG_CACHE_ENCRYPTION_KEY` or the AutoConfig key). Configuration/docs add the new cache settings plus validation to require Redis/DynamoDB (and a DynamoDB table when applicable).

Reviewed by Cursor Bugbot for commit aa72bf4.