There are three sets of LeadOut environments:
- Sandbox - Each developer has their own sandbox environment, which they have full ownership over and can update at will.
- Dev - There is one development environment, where the LeadOut team and tests see live code changes. This environment is updated by the release pipeline, below.
- Prod - There is one production environment, which serves the product to users. This environment is updated by the release pipeline, below.
Additionally, there is an auxiliary environment, which doesn't run the LeadOut application:
- Deployment - This environment runs the release pipeline.
The pipeline is triggered whenever code changes in the source repository, defined via a CodeStar Connection. Once triggered:
- CodeBuild builds the client and CDK code.
- The Deployment environment updates itself.
- The dev environment updates.
- The pipeline waits 1 hour. During this time, devs can stop the pipeline if they see any problems in dev.
- The prod environment updates.
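
The ordering above, including the one-hour hold before prod, can be sketched as a small model. This is only an illustration of the control flow, not the pipeline's actual definition: the step names, `run_release`, and its callback parameters are hypothetical.

```python
from typing import Callable, List

# Step names mirror the list above; all identifiers here are hypothetical.
STEPS_BEFORE_WAIT = ["build", "update-deployment", "update-dev"]
STEPS_AFTER_WAIT = ["update-prod"]
WAIT_SECONDS = 60 * 60  # the one-hour hold before prod


def run_release(
    run_step: Callable[[str], None],
    stop_requested: Callable[[], bool],
    sleep: Callable[[float], None],
    poll_seconds: int = 60,
) -> List[str]:
    """Run the steps in order and return the names of the steps that ran."""
    completed: List[str] = []
    for name in STEPS_BEFORE_WAIT:
        run_step(name)
        completed.append(name)

    # Hold before prod, polling so that a manual stop takes effect promptly.
    waited = 0
    while waited < WAIT_SECONDS:
        if stop_requested():
            return completed  # prod is never touched
        sleep(poll_seconds)
        waited += poll_seconds
    if stop_requested():
        return completed

    for name in STEPS_AFTER_WAIT:
        run_step(name)
        completed.append(name)
    return completed
```

Passing `sleep` and `stop_requested` as callbacks keeps the sketch testable without waiting an hour.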
The following items are not yet implemented; most come from the prod design doc.
A firing dev alarm signals a problem with the dev release, and that release should not reach prod. Currently, devs have to stop the release manually to prevent it from updating prod.
This may be done by adding a CodeBuild step to the pipeline between dev and prod, which executes a command that looks for any alarms in dev. We tried this but ran into (not-insurmountable) timeout and permissions issues.
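
A minimal sketch of what such a command might look like, assuming the dev alarms share a name prefix (the `dev-` default below is a hypothetical convention, as are the function names). The CloudWatch client is passed in so the logic can be exercised without AWS access. In a real step it would be a boto3 CloudWatch client whose role is allowed `cloudwatch:DescribeAlarms` in the dev account.

```python
from typing import Any, List


def firing_alarm_names(cloudwatch: Any, name_prefix: str) -> List[str]:
    """Return names of alarms in the ALARM state whose names start with name_prefix."""
    paginator = cloudwatch.get_paginator("describe_alarms")
    names: List[str] = []
    for page in paginator.paginate(
        AlarmNamePrefix=name_prefix,
        StateValue="ALARM",
        AlarmTypes=["MetricAlarm", "CompositeAlarm"],
    ):
        names += [a["AlarmName"] for a in page.get("MetricAlarms", [])]
        names += [a["AlarmName"] for a in page.get("CompositeAlarms", [])]
    return names


def gate_exit_code(cloudwatch: Any, name_prefix: str = "dev-") -> int:
    """Return 1 if any matching alarm is firing (fail the step), else 0."""
    firing = firing_alarm_names(cloudwatch, name_prefix)
    for name in firing:
        print(f"alarm firing: {name}")
    return 1 if firing else 0
```

The CodeBuild step could then end with `sys.exit(gate_exit_code(boto3.client("cloudwatch")))`, since a non-zero exit code fails the build and stops the pipeline before prod.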
When a release fails, it may leave the application in a partially-updated state. If the pipeline fails for any reason, it should re-deploy to the last-known-good state.
This may be done by splitting the (frequently-changing) application deployment out from the (relatively-static) infrastructure deployment:
- Infrastructure, such as CloudFront, is deployed via the pipeline, as-is. CodePipeline does not natively support (automated) rollbacks, so rollbacks would have to be done manually. Since this changes infrequently, this is acceptable.
- Application code, such as the API, is deployed via CodeDeploy with a canary step. CodeDeploy supports automated rollbacks, but can only deploy to Lambda, ECS, or EC2.
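
As a rough illustration of the CodeDeploy side, the sketch below builds the settings a deployment group would need for a canary rollout with automated rollback. It assumes the API runs on Lambda, which the text above does not state. The function name and its arguments are hypothetical; the keys follow the parameters of boto3's CodeDeploy `update_deployment_group` call.

```python
from typing import Any, Dict, List


def canary_rollback_settings(
    application: str, deployment_group: str, alarm_names: List[str]
) -> Dict[str, Any]:
    """Build keyword arguments for CodeDeploy's update_deployment_group."""
    return {
        "applicationName": application,
        "currentDeploymentGroupName": deployment_group,
        # Predefined Lambda config: shift 10% of traffic, wait 5 minutes, then the rest.
        "deploymentConfigName": "CodeDeployDefault.LambdaCanary10Percent5Minutes",
        # Roll back to the previous revision on failure or when an alarm fires.
        "autoRollbackConfiguration": {
            "enabled": True,
            "events": ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"],
        },
        # Alarms that stop the deployment during the canary window.
        "alarmConfiguration": {
            "enabled": bool(alarm_names),
            "alarms": [{"name": name} for name in alarm_names],
        },
    }
```

The result could be applied with `boto3.client("codedeploy").update_deployment_group(**settings)`. In a CDK-managed stack the same settings would more likely be declared on the deployment group construct.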
The tile server does not automatically pick up schema changes; a human has to restart its jobs. The release pipeline should restart the tile server jobs so that they pick up schema changes on startup.
This may be done by adding a CodeDeploy step that updates the tile server in ECS.
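
A simpler alternative to a CodeDeploy step, sketched here for comparison, is to force a new deployment of the ECS service, which replaces its running tasks. This is a different technique from the one proposed above. It assumes the tile server jobs run as an ECS service using the rolling-update deployment controller. The function name and its arguments are hypothetical, and `ecs` stands in for a boto3 ECS client.

```python
from typing import Any, Optional


def restart_tile_server(ecs: Any, cluster: str, service: str) -> Optional[str]:
    """Force a new deployment of the service and return the new deployment's id."""
    response = ecs.update_service(
        cluster=cluster,
        service=service,
        forceNewDeployment=True,  # replace tasks even though the task definition is unchanged
    )
    deployments = response["service"]["deployments"]
    # The deployment that was just started is reported with status PRIMARY.
    primary = next((d for d in deployments if d["status"] == "PRIMARY"), None)
    return primary["id"] if primary else None
```

A pipeline step could call `restart_tile_server(boto3.client("ecs"), cluster, service)` after the schema migration has completed, so that the replacement tasks start against the new schema.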