Merged
42 commits
dfbe99b
KMS-649: Add RDF4J local pull/setup workflow and refresh local concep…
Feb 21, 2026
2774342
KMS-649: Consolidate RDF4J local tooling into bin/ scripts, simplify …
Feb 22, 2026
7d56318
KMS-649: Added locust files
Feb 22, 2026
002420e
KMS-649: Updates to hammer endpoints
Feb 22, 2026
1dc5e55
KMS-649: More updates to hammer to mimic DSE behavior
Feb 22, 2026
1abe517
KMS-649: Does not retry if 404
Feb 22, 2026
46288ab
KMS-649: Added concurrent requests
Feb 22, 2026
a7a4121
KMS-649: Reverted to latest
Feb 22, 2026
8ae5642
KMS-649: Temp disabling cache.
Feb 22, 2026
1c25cfd
KMS-649: Added better logging when retries happen and in flight promi…
Feb 22, 2026
d6fe4e5
KMS-649: Remove in-process single-flight dedupe, increase concepts re…
Feb 22, 2026
b292a91
KMS-649: Guard API Gateway method caching behind stage cache enableme…
Feb 22, 2026
cfb67e4
KMS-649: Fix API Gateway stage config to omit cacheClusterSize/method…
Feb 22, 2026
d8e8a85
KMS-649: Add method-level /concepts throttling in CDK, set 25s SPARQL…
Feb 23, 2026
9b87ccb
KMS-649: Fix strict TypeScript typing in ApiCacheSetup method option …
Feb 23, 2026
318949a
KMS-649: Simplify SPARQL retry/timeout config by using fixed defaults…
Feb 23, 2026
6efa009
KMS-649: Hardcode SPARQL/read timeout in sparqlRequest and remove CON…
Feb 23, 2026
2518a53
KMS-649: Remove API Gateway access logging configuration from KmsStac…
Feb 23, 2026
fc2aca4
KMS-649: Remove unused createCsv test reset helper and document SPARQ…
Feb 23, 2026
348fcd3
KMS-649: Enable API cache cluster by default and align SPARQL/CSV tes…
Feb 23, 2026
16c779b
KMS-649: Remove obsolete duplicate-request test from sparqlRequest un…
Feb 23, 2026
3cb23da
KMS-649: Add Redis response caching + scheduled priming for concepts/…
Feb 25, 2026
bdb20af
KMS-649: Require Redis deployment subnets to be PRIVATE_WITH_EGRESS only
Feb 25, 2026
a3d0b61
KMS-649: Changed cron to run every 30 minutes
Feb 25, 2026
c378156
KMS-649: Switched to logger
Feb 25, 2026
038b785
KMS-649: Fixed csv so it doesn't page to 25
Feb 25, 2026
8d9f880
KMS-649: Consolidated README
Feb 25, 2026
0826b43
KMS-649: Add EventBridge publish->cache-prime trigger and expand Redi…
Feb 25, 2026
9fe3946
KMS-649: Updating to fix cors issue
Feb 25, 2026
b913c36
Revert "KMS-649: Updating to fix cors issue"
Feb 25, 2026
156968d
KMS-649: Consolidated Redis cache/client logic into redisCacheStore a…
Feb 26, 2026
39f23c4
KMS-649: Audit fix
Feb 26, 2026
fd0b78b
KMS-649: Removed defaults in deploy_bamboo.sh
Feb 26, 2026
3f06d69
KMS-649: Reverted deploy_bamboo.sh back to simpler version.
Feb 26, 2026
1fe8b38
KMS-649: More refactoring
Feb 26, 2026
b58650f
KMS-649: Simple renaming change
Feb 26, 2026
f040944
KMS-649: Cleaned up sparqlRequest
Feb 26, 2026
2d60c46
KMS-649: Javadoc setupRdf4j.js
Feb 26, 2026
5dce508
KMS-649: Cleaned up README
Feb 26, 2026
3360c08
KMS-649: Small README change
Feb 26, 2026
1ae43ac
KMS-649: Updated aws cdk lib
Feb 26, 2026
1f8b3e7
KMS-649: Ordered so cache is hit before version lookup
Feb 26, 2026
21 changes: 21 additions & 0 deletions .gitignore
@@ -23,3 +23,24 @@ cmr
cdk.context.json
cdk.out
infrastructure/rdfdb/cdk/cdk.context.json

# Spike/generated artifacts
*.log
log.txt
cdk/log.txt
data/mongo-spike/
data/mongo-spike/**/*.json
data/mongo-spike/**/*.ndjson
data/mongo-spike/**/*.xml
data/mongo-spike/**/*.rdf
scripts/spike-express/file.rdf
loadtest/locust/.DS_Store
loadtest/locust/**/__pycache__/
loadtest/locust/**/*.pyc
loadtest/locust/**/results*.csv
loadtest/locust/**/locust_results.csv
loadtest/locust/**/summary.csv
loadtest/locust/**/summary.csv~
loadtest/locust/
scripts/load/hammer_endpoints_sequential.js
data/scheme-size*
140 changes: 84 additions & 56 deletions README.md
@@ -32,6 +32,75 @@ To start local server (including rdf4j database server, cdk synth and sam)
npm run start-local
```

To run local server with SAM watch mode enabled
```
npm run start-local:watch
```

### Optional: Enable Redis cache in local SAM/LocalStack

By default, `start-local` does not provision Redis through CDK. You can still test Redis caching locally by running Redis in Docker.
Local defaults are centralized in `bin/env/local_env.sh`.

1. Ensure the docker network exists:
```
npm run rdf4j:create-network
```

2. Start Redis on the same docker network used by SAM:
```
npm run redis:start
```

3. (Optional) override defaults in `bin/env/local_env.sh` or per-command, for example:
```
REDIS_ENABLED=true REDIS_HOST=kms-redis-local REDIS_PORT=6379 npm run start-local
```

4. Start local API:
```
npm run start-local
```

5. Verify cache behavior (published reads only):
- `GET /concepts?version=published`
- `GET /concepts/concept_scheme/{scheme}?version=published`

6. Check Redis cache memory usage:
```
npm run redis:memory_used
```
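Steps 1–4 above can be chained in one script. A minimal sketch (assumption: the wrapper below is illustrative, not part of this PR; `DRY_RUN` defaults to `1` so it only prints the commands — set `DRY_RUN=0` to actually run them):

```shell
#!/usr/bin/env bash
# Sketch only: chains the npm scripts from the steps above.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
set -euo pipefail

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run npm run rdf4j:create-network   # step 1: ensure the docker network exists
run npm run redis:start            # step 2: start Redis on that network
# steps 3-4: start the local API with the Redis cache enabled
REDIS_ENABLED=true run npm run start-local
```

With `DRY_RUN` left at its default, the script prints the three commands in order, which is a cheap way to confirm the sequence before running it for real.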

### Redis node types and memory (ElastiCache)

Common burstable node types for Redis/Valkey:

| Node type | Memory (GiB) |
| --- | --- |
| `cache.t4g.micro` | `0.5` |
| `cache.t4g.small` | `1.37` |
| `cache.t4g.medium` | `3.09` |
| `cache.t3.micro` | `0.5` |
| `cache.t3.small` | `1.37` |
| `cache.t3.medium` | `3.09` |

For full and latest node-family capacities (m/r/t and region support), see:
https://docs.aws.amazon.com/AmazonElastiCache/latest/dg/CacheNodes.SupportedTypes.html

To disable local Redis cache again:
```
REDIS_ENABLED=false npm run start-local
```

### Invoke cache-prime cron locally

Runs the same Lambda used by the scheduled EventBridge cache-prime target, locally via SAM.
The script re-synthesizes `cdk/cdk.out/KmsStack.template.json` each run so local Redis env settings are baked into the template.

```bash
npm run prime-cache:invoke-local
```

## Local Testing

To run the test suite, run:
@@ -42,24 +111,25 @@ npm run test
## Setting up the RDF Database for local development
To run KMS locally, you first need to set up an RDF database.
### Prerequisites
#### Set the RDFDB user name and password
```
export RDF4J_USER_NAME=[user name]
export RDF4J_PASSWORD=[password]
```
RDF4J local defaults are in `bin/env/local_env.sh`.
If needed, override per command (for example: `RDF4J_USER_NAME=... RDF4J_PASSWORD=... npm run rdf4j:setup`).
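The override mechanism is plain shell parameter expansion, as used by `bin/env/local_env.sh` and `bin/rdf4j/setup.sh`: a value exported (or set inline) before the command wins; otherwise the script's default applies. A minimal illustration of the pattern (the defaults here mirror `bin/rdf4j/setup.sh`):

```shell
# ${VAR:-default}: keep an already-exported value if present, else fall back.
export RDF4J_USER_NAME="${RDF4J_USER_NAME:-rdf4j}"
export RDF4J_PASSWORD="${RDF4J_PASSWORD:-rdf4j}"

# An inline override for a single command takes precedence over the default:
#   RDF4J_USER_NAME=myuser npm run rdf4j:setup
echo "user=${RDF4J_USER_NAME}"
```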
### Building and Running the RDF Database
#### Build the docker image
```
npm run rdf4j:build
```
#### Create a docker network
```
npm run create-network
npm run rdf4j:create-network
```
#### Run the docker image
```
npm run rdf4j:start
```
#### Pull latest concepts RDF files from CMR
```
npm run rdf4j:pull
```
#### Setup and load data into the RDF database
```
npm run rdf4j:setup
@@ -71,60 +141,15 @@ npm run rdf4j:stop
```

# Deployments
## Deploying RDF Database to AWS
### Prerequisites
#### Copy your AWS credentials and set these up as env variables
```
export AWS_ACCESS_KEY_ID=${your access key id}
export AWS_SECRET_ACCESS_KEY=${your secret access key}
export AWS_SESSION_TOKEN=${your session token}
export VPC_ID={your vpc id}
```
#### Set the RDFDB user name and password
```
export RDF4J_USER_NAME=[your rdfdb user name]
export RDF4J_PASSWORD=[your rdfdb password]
export RDF4J_CONTAINER_MEMORY_LIMIT=[7168 for sit|uat, 14336 for prod]
```

#### Deploy Docker Container to Registry
```
cd cdk/rdfdb/bin
./deploy_to_ecr.sh
```

#### Deploy ECS Service to AWS
#### Deploy IAM, EBS, LB, and ECS stacks
```
export RDF4J_USER_NAME=[your rdfdb user name]
export RDF4J_PASSWORD=[your rdfdb password]
export RDF4J_CONTAINER_MEMORY_LIMIT=[7168 for sit|uat, 14336 for prod]
export RDF4J_INSTANCE_TYPE=["M5.LARGE" for sit|uat, "R5.LARGE" for prod]

cd cdk
cdk deploy rdf4jIamStack
cdk deploy rdf4jEbsStack
cdk deploy rdf4jLbStack
cdk deploy rdf4jEcsStack
cdk deploy rdf4jSnapshotStack
cdk deploy KmsStack
```
#### Alternatively, you can deploy all stacks at once
```
cd cdk
cdk deploy --all
```
One thing to note: if you destroy the rdf4jEbsStack and redeploy, a new EBS file system will be created. You will need to copy the data from the old EBS file system to the new one, for example by mounting both file systems on an EC2 instance and copying the data across.
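The copy itself can be done with rsync once both volumes are attached and mounted on an EC2 instance. A sketch with hypothetical device names and mount points (not from this PR — check `lsblk` for the real device names; the mount commands are left commented out):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical mount points; /dev/xvdf and /dev/xvdg below are examples only.
OLD_MNT=/mnt/rdf4j-old
NEW_MNT=/mnt/rdf4j-new

# sudo mkdir -p "$OLD_MNT" "$NEW_MNT"
# sudo mount /dev/xvdf "$OLD_MNT"   # old EBS volume
# sudo mount /dev/xvdg "$NEW_MNT"   # new EBS volume

# -aHAX preserves permissions, hard links, ACLs, and extended attributes.
COPY_CMD="rsync -aHAX ${OLD_MNT}/ ${NEW_MNT}/"
echo "$COPY_CMD"   # run it as: sudo $COPY_CMD
```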

## Deploying KMS to AWS
## Deploying KMS Application to AWS
### Prerequisites
#### Copy your AWS credentials and set these up as env variables
```
export RELEASE_VERSION=[app release version]
export bamboo_STAGE_NAME=[sit|uat|prod]
export bamboo_AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
export bamboo_AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
export bamboo_AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN}
export bamboo_LAMBDA_TIMEOUT=30
export bamboo_SUBNET_ID_A={subnet #1}
export bamboo_SUBNET_ID_B={subnet #2}
export bamboo_SUBNET_ID_C={subnet #3}
@@ -141,10 +166,13 @@ export bamboo_RDF4J_INSTANCE_TYPE=["M5.LARGE" for sit|uat, "R5.LARGE" for prod]
export bamboo_RDF_BUCKET_NAME=[name of bucket for storing archived versions]
export bamboo_EXISTING_API_ID=[api id if deploying this into an existing api gateway]
export bamboo_ROOT_RESOURCE_ID=[see CDK_MIGRATION.md for how to determine]
export bamboo_KMS_CACHE_TTL_SECONDS=[optional, default 3600]
export bamboo_KMS_CACHE_CLUSTER_SIZE_GB=[optional, default 0.5]
export bamboo_KMS_CACHE_CLUSTER_ENABLED=[optional, default true]
export bamboo_LOG_LEVEL=[INFO|DEBUG|WARN|ERROR]
export bamboo_KMS_REDIS_ENABLED=[true|false]
export bamboo_KMS_REDIS_NODE_TYPE=[for example cache.t3.micro]
```
Notes:
- If you are not deploying into an existing API Gateway, set `bamboo_EXISTING_API_ID` and `bamboo_ROOT_RESOURCE_ID` to empty strings.
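That case can be spelled out explicitly before running the deploy script:

```shell
# Not deploying into an existing API Gateway: leave both values empty.
export bamboo_EXISTING_API_ID=""
export bamboo_ROOT_RESOURCE_ID=""
```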

#### Deploy KMS Application
```
./bin/deploy-bamboo.sh
5 changes: 2 additions & 3 deletions bin/deploy-bamboo.sh
@@ -70,9 +70,8 @@ dockerRun() {
--env "EXISTING_API_ID=$bamboo_EXISTING_API_ID" \
--env "ROOT_RESOURCE_ID=$bamboo_ROOT_RESOURCE_ID" \
--env "LOG_LEVEL=$bamboo_LOG_LEVEL" \
--env "KMS_CACHE_TTL_SECONDS=${bamboo_KMS_CACHE_TTL_SECONDS}" \
--env "KMS_CACHE_CLUSTER_SIZE_GB=${bamboo_KMS_CACHE_CLUSTER_SIZE_GB}" \
--env "KMS_CACHE_CLUSTER_ENABLED=${bamboo_KMS_CACHE_CLUSTER_ENABLED}" \
--env "KMS_REDIS_ENABLED=$bamboo_KMS_REDIS_ENABLED" \
--env "KMS_REDIS_NODE_TYPE=$bamboo_KMS_REDIS_NODE_TYPE" \
$dockerTag "$@"
}

10 changes: 10 additions & 0 deletions bin/env/local_env.sh
@@ -0,0 +1,10 @@
#!/usr/bin/env bash

# Shared defaults for local SAM/CDK workflows.
export RDF4J_SERVICE_URL="${RDF4J_SERVICE_URL:-http://rdf4j-server:8080}"
export REDIS_ENABLED="${REDIS_ENABLED:-true}"
export REDIS_HOST="${REDIS_HOST:-kms-redis-local}"
export REDIS_PORT="${REDIS_PORT:-6379}"
export KMS_DOCKER_NETWORK="${KMS_DOCKER_NETWORK:-kms-network}"
export SAM_WARM_CONTAINERS="${SAM_WARM_CONTAINERS:-EAGER}"
export SAM_LOCAL_WATCH="${SAM_LOCAL_WATCH:-false}"
4 changes: 4 additions & 0 deletions bin/rdf4j/build.sh
@@ -0,0 +1,4 @@
#!/usr/bin/env bash
set -euo pipefail

docker build -t rdf4j:latest ./cdk/rdfdb/docker
15 changes: 15 additions & 0 deletions bin/rdf4j/create-network.sh
@@ -0,0 +1,15 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=bin/env/local_env.sh
source "${SCRIPT_DIR}/../env/local_env.sh"

NETWORK_NAME="${RDF4J_DOCKER_NETWORK:-${KMS_DOCKER_NETWORK:-kms-network}}"

if docker network inspect "$NETWORK_NAME" >/dev/null 2>&1; then
echo "Docker network '${NETWORK_NAME}' already exists"
exit 0
fi

docker network create "$NETWORK_NAME"
4 changes: 4 additions & 0 deletions bin/rdf4j/pull.sh
@@ -0,0 +1,4 @@
#!/usr/bin/env bash
set -euo pipefail

node setup/scripts/pullRdfFromCmr.js
18 changes: 18 additions & 0 deletions bin/rdf4j/setup.sh
@@ -0,0 +1,18 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=bin/env/local_env.sh
source "${SCRIPT_DIR}/../env/local_env.sh"

RDF4J_CONTAINER_NAME="${RDF4J_CONTAINER_NAME:-rdf4j-server}"
RDF4J_USER_NAME="${RDF4J_USER_NAME:-rdf4j}"
RDF4J_PASSWORD="${RDF4J_PASSWORD:-rdf4j}"
RDF4J_REPOSITORY_ID="${RDF4J_REPOSITORY_ID:-kms}"
# Setup runs from host machine against mapped port by default.
export RDF4J_SERVICE_URL="${RDF4J_SETUP_SERVICE_URL:-http://127.0.0.1:8081}"
export RDF4J_CONTAINER_NAME RDF4J_USER_NAME RDF4J_PASSWORD RDF4J_REPOSITORY_ID

echo "Running RDF4J setup against ${RDF4J_SERVICE_URL}/rdf4j-server (repo=${RDF4J_REPOSITORY_ID})"

node setup/setupRdf4j.js
11 changes: 11 additions & 0 deletions bin/rdf4j/shell.sh
@@ -0,0 +1,11 @@
#!/usr/bin/env bash
set -euo pipefail

CONTAINER_NAME="${RDF4J_CONTAINER_NAME:-rdf4j-server}"

if ! docker ps -q --filter "name=^${CONTAINER_NAME}$" | grep -q .; then
echo "Container '${CONTAINER_NAME}' is not running." >&2
exit 1
fi

docker exec -it "$CONTAINER_NAME" /bin/bash
31 changes: 31 additions & 0 deletions bin/rdf4j/start.sh
@@ -0,0 +1,31 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=bin/env/local_env.sh
source "${SCRIPT_DIR}/../env/local_env.sh"

NETWORK_NAME="${RDF4J_DOCKER_NETWORK:-${KMS_DOCKER_NETWORK:-kms-network}}"
CONTAINER_NAME="${RDF4J_CONTAINER_NAME:-rdf4j-server}"
RDF4J_USER_NAME="${RDF4J_USER_NAME:-rdf4j}"
RDF4J_PASSWORD="${RDF4J_PASSWORD:-rdf4j}"
RDF4J_CONTAINER_MEMORY_LIMIT="${RDF4J_CONTAINER_MEMORY_LIMIT:-4096}"

if ! docker network inspect "$NETWORK_NAME" >/dev/null 2>&1; then
docker network create "$NETWORK_NAME" >/dev/null
fi

if docker ps -aq --filter "name=^${CONTAINER_NAME}$" | grep -q .; then
docker rm -f "$CONTAINER_NAME" >/dev/null
fi

docker run \
--name "$CONTAINER_NAME" \
--network "$NETWORK_NAME" \
-d \
-p 8081:8080 \
-e "RDF4J_USER_NAME=${RDF4J_USER_NAME}" \
-e "RDF4J_PASSWORD=${RDF4J_PASSWORD}" \
-e "RDF4J_CONTAINER_MEMORY_LIMIT=${RDF4J_CONTAINER_MEMORY_LIMIT}" \
-v logs:/usr/local/tomcat/logs \
rdf4j:latest
20 changes: 20 additions & 0 deletions bin/rdf4j/stop.sh
@@ -0,0 +1,20 @@
#!/usr/bin/env bash
set -euo pipefail

CONTAINER_NAME="${RDF4J_CONTAINER_NAME:-rdf4j-server}"

if docker ps -q --filter "name=^${CONTAINER_NAME}$" | grep -q .; then
echo "Stopping container '${CONTAINER_NAME}'..."
docker kill "$CONTAINER_NAME" >/dev/null
echo "Stopped container '${CONTAINER_NAME}'."
else
echo "Container '${CONTAINER_NAME}' is not running."
fi

if docker ps -aq --filter "name=^${CONTAINER_NAME}$" | grep -q .; then
echo "Removing container '${CONTAINER_NAME}'..."
docker rm "$CONTAINER_NAME" >/dev/null
echo "Removed container '${CONTAINER_NAME}'."
else
echo "Container '${CONTAINER_NAME}' does not exist."
fi
22 changes: 22 additions & 0 deletions bin/redis/connect.sh
@@ -0,0 +1,22 @@
#!/usr/bin/env bash

set -euo pipefail

CONTAINER_NAME="${REDIS_CONTAINER_NAME:-kms-redis-local}"
HOST="${REDIS_CONNECT_HOST:-127.0.0.1}"
PORT="${REDIS_HOST_PORT:-6380}"

if command -v redis-cli >/dev/null 2>&1; then
echo "Connecting via host redis-cli to redis://${HOST}:${PORT}"
exec redis-cli -h "${HOST}" -p "${PORT}"
fi

if docker ps --format '{{.Names}}' | grep -qx "${CONTAINER_NAME}"; then
echo "Host redis-cli not found; connecting via docker exec to '${CONTAINER_NAME}'"
exec docker exec -it "${CONTAINER_NAME}" redis-cli
fi

echo "Could not connect to Redis."
echo "Start Redis first with: npm run redis:start"
echo "If using host redis-cli, install it or ensure '${CONTAINER_NAME}' is running for docker fallback."
exit 1