The Keyword Management System (KMS) is an application for maintaining keywords (science keywords, platforms, instruments, data centers, locations, projects, services, resolution, etc.) in the Earthdata/IDN system.
To install the necessary components, run:
npm install
Prerequisites:
- Docker
- aws-sam-cli (`brew install aws-sam-cli`)
To start the local server (including the RDF4J database server, `cdk synth`, and SAM), first make sure to start LocalStack:
npm run localstack:start
Then, you can start the local server:
npm run start-local
To run the local server with SAM watch mode enabled:
npm run start-local:watch
Local development intentionally splits responsibilities between SAM and LocalStack:
- SAM runs the API Gateway and Lambda side of KMS locally.
- LocalStack emulates AWS-managed services that SAM does not model end-to-end for this repo, especially SNS and SQS.
- RDF4J and Redis remain separate local services because they are not AWS services.
We do not run the entire application stack inside LocalStack because the existing SAM flow is simpler for day-to-day Lambda/API development, while LocalStack is most useful here for the managed messaging pieces. For keyword event processing, `npm run start-local` also starts `scripts/local/run_localstack_cmr_keyword_events_bridge.js`, which polls the LocalStack SQS queue and forwards messages into the local CMR consumer handler. This bridge exists because `sam local start-api` does not emulate SQS event source mappings the way AWS does in deployed environments.
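As a rough sketch, the bridge's core job is to wrap each message polled from SQS in the `Records` envelope shape that a real Lambda SQS trigger would deliver, so the local CMR consumer handler receives a realistic event. The function and field names below are illustrative, not the actual script's API:

```javascript
// Illustrative only: wrap raw SQS ReceiveMessage results in the event shape
// an SQS-triggered Lambda receives, so a handler can be invoked directly.
function toSqsLambdaEvent(messages, queueArn) {
  return {
    Records: messages.map((msg) => ({
      messageId: msg.MessageId,
      receiptHandle: msg.ReceiptHandle,
      // When the queue is subscribed to an SNS topic, the body is the
      // SNS envelope JSON, with the keyword event inside its Message field.
      body: msg.Body,
      eventSource: 'aws:sqs',
      eventSourceARN: queueArn
    }))
  }
}

// Example: one polled message becomes a one-record Lambda event.
const event = toSqsLambdaEvent(
  [{ MessageId: 'm1', ReceiptHandle: 'r1', Body: '{"EventType":"UPDATED"}' }],
  'arn:aws:sqs:us-east-1:000000000000:local-queue'
)
```

In a deployed environment this translation is done by the SQS event source mapping itself, which is exactly the piece `sam local start-api` does not emulate.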
After deploying to SIT, you can exercise the keyword event publisher with:
curl -H "Authorization: Bearer $TOKEN" -X POST https://cmr.sit.earthdata.nasa.gov/kms/keyword-events/test \
-H 'Content-Type: application/json' \
-d '{
"EventType": "UPDATED",
"Scheme": "sciencekeywords",
"UUID": "4f81c61c-f100-4bc4-9664-d9b70d2f162f",
"OldKeywordPath": "Instruments > Solar/Space Observing Instruments > Passive Remote Sensing",
"NewKeywordPath": "Instruments > Earth Remote Sensing Instruments > Passive Remote Sensing",
"Timestamp": "2026-03-19T10:41:57.720Z",
"MetadataSpecification": {
"Name": "Kms-Keyword-Event",
"URL": "https://cdn.earthdata.nasa.gov/kms-keyword-event/v1.0",
"Version": "1.0"
}
}'

Expected result:
- the API returns 200
- the response includes the SNS topic ARN and message id
- the CMR event processor is invoked from the subscribed queue in AWS
`EventType` must be one of `INSERTED`, `UPDATED`, or `DELETED`.
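Before posting a test event, it can be handy to sanity-check the payload locally. The helper below is a minimal sketch of such a check; the function name and the UUID pattern are assumptions for illustration, not the service's actual validation logic:

```javascript
// Illustrative client-side sanity check for a KMS keyword event payload.
const VALID_EVENT_TYPES = ['INSERTED', 'UPDATED', 'DELETED']

function validateKeywordEvent(event) {
  const errors = []
  if (!VALID_EVENT_TYPES.includes(event.EventType)) {
    errors.push(`EventType must be one of ${VALID_EVENT_TYPES.join(', ')}`)
  }
  if (!event.Scheme) {
    errors.push('Scheme is required')
  }
  // Basic 8-4-4-4-12 hex UUID shape check.
  const uuidPattern = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
  if (!uuidPattern.test(event.UUID || '')) {
    errors.push('UUID must be a valid UUID')
  }
  return errors
}

// The payload from the curl example above passes; a bad EventType does not.
const ok = validateKeywordEvent({
  EventType: 'UPDATED',
  Scheme: 'sciencekeywords',
  UUID: '4f81c61c-f100-4bc4-9664-d9b70d2f162f'
})
const bad = validateKeywordEvent({ EventType: 'MOVED' })
```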
By default, `npm run start-local` does not provision Redis via CDK. You can still test Redis caching by running Redis in Docker.
Local defaults are centralized in bin/env/local_env.sh.
- Ensure the docker network exists:
npm run rdf4j:create-network
- Start Redis on the same docker network used by SAM:
npm run redis:start
- (Optional) Override defaults in `bin/env/local_env.sh` or per command, for example:
REDIS_ENABLED=true REDIS_HOST=kms-redis-local REDIS_PORT=6379 npm run start-local
- Start local API:
npm run start-local
- Verify cache behavior (published reads only):
GET /concepts?version=published
GET /concepts/concept_scheme/{scheme}?version=published
- Check Redis cache memory usage:
npm run redis:memory_used
Common burstable node types for Redis/Valkey:
| Node type | Memory (GiB) |
|---|---|
| cache.t4g.micro | 0.5 |
| cache.t4g.small | 1.37 |
| cache.t4g.medium | 3.09 |
| cache.t3.micro | 0.5 |
| cache.t3.small | 1.37 |
| cache.t3.medium | 3.09 |
For full and latest node-family capacities (m/r/t and region support), see: https://docs.aws.amazon.com/AmazonElastiCache/latest/dg/CacheNodes.SupportedTypes.html
To disable local Redis cache again:
REDIS_ENABLED=false npm run start-local
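The `REDIS_ENABLED`/`REDIS_HOST`/`REDIS_PORT` variables above can be thought of as gating a cache-client configuration: when the flag is off, no client is built and reads fall through to RDF4J. A minimal sketch of that gating (the helper name is an assumption, not the repo's actual code):

```javascript
// Illustrative env-gated Redis config: returns null (caching disabled)
// unless REDIS_ENABLED is exactly the string 'true'.
function getRedisConfig(env = process.env) {
  if (env.REDIS_ENABLED !== 'true') {
    return null
  }
  return {
    host: env.REDIS_HOST || 'kms-redis-local', // default container name
    port: Number(env.REDIS_PORT || 6379)
  }
}

// Matches the overrides shown earlier:
const disabled = getRedisConfig({ REDIS_ENABLED: 'false' })
const enabled = getRedisConfig({
  REDIS_ENABLED: 'true',
  REDIS_HOST: 'kms-redis-local',
  REDIS_PORT: '6379'
})
```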
`npm run prime-cache:invoke-local` runs the same Lambda used by the scheduled EventBridge cache-prime target, locally via SAM. The script re-synthesizes `cdk/cdk.out/KmsStack.template.json` on each run so local Redis env settings are baked into the template.
npm run prime-cache:invoke-local
To run the test suite, run:
npm run test
In order to run KMS locally, you first need to set up an RDF database.
RDF4J local defaults are in bin/env/local_env.sh.
If needed, override per command (for example: RDF4J_USER_NAME=... RDF4J_PASSWORD=... npm run rdf4j:setup).
npm run rdf4j:build
npm run rdf4j:create-network
npm run rdf4j:start
npm run rdf4j:pull
npm run rdf4j:setup
npm run rdf4j:stop
export RELEASE_VERSION=[app release version]
export bamboo_STAGE_NAME=[sit|uat|prod]
export bamboo_AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
export bamboo_AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
export bamboo_AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN}
export bamboo_SUBNET_ID_A={subnet #1}
export bamboo_SUBNET_ID_B={subnet #2}
export bamboo_SUBNET_ID_C={subnet #3}
export bamboo_VPC_ID={your vpc id}
export bamboo_RDF4J_USER_NAME=[your rdfdb user name]
export bamboo_RDF4J_PASSWORD=[your rdfdb password]
export bamboo_EDL_HOST=[edl host name]
export bamboo_EDL_UID=[edl user id]
export bamboo_EDL_PASSWORD=[edl password]
export bamboo_CMR_BASE_URL=[cmr base url]
export bamboo_CORS_ORIGIN=[comma separated list of cors origins]
export bamboo_RDF4J_CONTAINER_MEMORY_LIMIT=[7168 for sit|uat, 14336 for prod]
export bamboo_RDF4J_INSTANCE_TYPE=["M5.LARGE" for sit|uat, "R5.LARGE" for prod]
export bamboo_RDF_BUCKET_NAME=[name of bucket for storing archived versions]
export bamboo_EXISTING_API_ID=[api id if deploying this into an existing api gateway]
export bamboo_ROOT_RESOURCE_ID=[see CDK_MIGRATION.md for how to determine]
export bamboo_LOG_LEVEL=[INFO|DEBUG|WARN|ERROR]
export bamboo_KMS_REDIS_ENABLED=[true|false]
export bamboo_KMS_REDIS_NODE_TYPE=[for example cache.t3.micro]
Notes:
- If you are not deploying into an existing API Gateway, set `bamboo_EXISTING_API_ID` and `bamboo_ROOT_RESOURCE_ID` to empty strings.
./bin/deploy-bamboo.sh