This document covers updating Chromium's instance of Monorail through Spinnaker. If you are looking to deploy your own instance of Monorail, see: Creating a new Monorail instance
Spinnaker is a platform that helps teams configure and manage application deployment pipelines. We have a ChromeOps Spinnaker instance that holds pipelines for several ChromeOps services, including Monorail.
IMPORTANT: In the event of an unexpected failure in a Spinnaker pipeline, it is extremely important that the release engineer properly cleans up versions in the App Engine console (e.g. deletes bad versions and manually rolls back to the previous version).
Spinnaker's traffic splitting, rollback, and cleanup systems rely heavily on the assumption that the currently deployed version always has the highest version number. During a rollback, Spinnaker migrates traffic back to the version with the second-highest number and deletes the version with the highest number. For example, suppose the previously deployed version was v013, a v014 was created but never properly deleted, and Spinnaker has just created v015 and is now trying to roll it back. Spinnaker will migrate 100% of traffic "back" to v014, which might be a bad version, and delete v015.
The same can happen during traffic migration. If the previous good deployment is v013 with 100% of traffic and a bad v014 was never cleaned up, then during a new deployment Spinnaker will create v015 and begin splitting traffic between v014 and v015, so users are sent either to the bad version or to the new version.
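The version-ordering assumption above can be illustrated with a small Python sketch. It is not Spinnaker's code. The function names and the `vNNN` naming format are assumptions made for illustration, and the logic only mirrors the rule described in this document: roll back to the second-highest version and delete the highest.

```python
def version_number(name):
    """Return the integer part of a version name such as 'v013' (assumed format)."""
    return int(name.lstrip("v"))


def rollback_plan(versions):
    """Mirror the described rule: migrate to the second-highest, delete the highest."""
    ordered = sorted(versions, key=version_number)
    if len(ordered) < 2:
        raise ValueError("need at least two versions to roll back")
    return {"migrate_to": ordered[-2], "delete": ordered[-1]}


def stale_versions(versions, serving):
    """List versions numbered above the serving one; these need manual cleanup."""
    return sorted(
        (v for v in versions if version_number(v) > version_number(serving)),
        key=version_number,
    )


# v014 is a leftover bad version, so the rollback lands on it instead of v013.
print(rollback_plan(["v013", "v014", "v015"]))
# Before a new deployment, anything above the serving version is suspicious.
print(stale_versions(["v013", "v014"], serving="v013"))
```

A check like `stale_versions` returning a non-empty list before a deployment is a hint that manual cleanup in the App Engine console is still needed.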
If you are ever unsure about how you should handle a manual cleanup and rollback, ping the Monorail chat and ask for help.
Below are brief descriptions of all the pipelines that make up the Monorail deployment process in Spinnaker.
The Deploy Monorail pipeline is the starting point of the Monorail deployment process and should be manually triggered by the release engineer. It creates a Cloud Build of Monorail. The build can be created from HEAD of a given branch, or it can rebuild a previous Cloud Build given a "BUILD_ID". Once the build is complete, a Deploy {Dev|Staging|Prod} pipeline can be automatically triggered to deploy the build to an environment. On a regular weekly release, we should use the default "ENV" = dev, provide the release branch, and leave "BUILD_ID" empty.
The release branch has the form refs/releases/monorail/[*deployment number*], e.g. "refs/releases/monorail/1" builds from HEAD of infra/infra/+/refs/releases/monorail/1.
The "ENV" value determines which of Deploy Dev, Deploy Staging, or Deploy Production is triggered on a successful finish of Deploy Monorail. The "nodeploy" option means no new Monorail version will get deployed to any environment.
The Deploy Dev pipeline handles deploying a new monorail-dev version and migrating traffic to the newest version.
After a new version is created, but before traffic is migrated, there is a "Continue?" stage that waits on manual judgement. The release engineer is expected to do any testing in the newest version before confirming that the pipeline should continue with traffic migration. If there are any issues, the release engineer should select "Rollback", which triggers the Rollback pipeline. If "Continue" is selected, Spinnaker will immediately migrate 100% of traffic to the newest version.
The successful finish of this pipeline triggers two pipelines: Cleanup and Deploy Staging.
Note that this pipeline is similar to the above Deploy Dev pipeline. This is for Prod Tech's experimental purposes. Please ignore this pipeline. This cannot be triggered by Deploy Monorail.
This pipeline handles deploying a new monorail-staging version and migrating traffic to the newest version.
Like Deploy Dev, after a new version is created, there is a "Continue?" stage that waits on manual judgement. The release engineer should test the new version before letting the pipeline proceed to traffic migration. If any issues are spotted, the release engineer should select "Rollback" to trigger the Rollback pipeline.
Unlike Deploy Dev, after "Continue" is selected, Spinnaker will proceed with three separate stages of traffic splitting, with a waiting period between each traffic split.
The successful finish of this pipeline triggers two pipelines: Cleanup and Deploy Production.
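The staged migration described above can be pictured with a short Python sketch. The three-stage shape comes from this document. The percentages and the wait time used below are placeholders chosen for illustration, not the values configured in the Spinnaker pipeline, and `traffic_split_plan` is a hypothetical helper.

```python
def traffic_split_plan(percentages, wait_minutes):
    """Build one entry per traffic-split stage for a new version."""
    if (
        not percentages
        or list(percentages) != sorted(percentages)
        or percentages[-1] != 100
    ):
        raise ValueError("percentages must be non-decreasing and end at 100")
    plan = []
    last = len(percentages) - 1
    for index, percent in enumerate(percentages):
        plan.append({
            "new": percent,              # share of traffic sent to the new version
            "previous": 100 - percent,   # share left on the previous version
            # Wait between splits; nothing to wait for after the final stage.
            "wait_minutes": wait_minutes if index < last else 0,
        })
    return plan


# Placeholder values: three stages, as in Deploy Staging, with assumed numbers.
for stage in traffic_split_plan([25, 50, 100], wait_minutes=10):
    print(stage)
```

The waiting periods are where issues in the new version are expected to surface while part of the traffic is still served by the previous version.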
This pipeline handles deploying a new monorail-prod version and migrating traffic to the newest version.
This pipeline has the same set of stages as Deploy Staging. The successful finish of this pipeline triggers the Cleanup pipeline.
This pipeline handles migrating traffic back from the newest version to the previous version and deleting the newest version. It is normally triggered by the Rollback stage of the Deploy Dev|Staging|Production pipelines, and it only handles rolling back one of the applications, not all three.
When Rollback is triggered by one of the above Deploy pipelines, the appropriate "Stack" value is passed. When the release engineer needs to manually trigger the Rollback pipeline, they should make sure they are choosing the correct "Stack" to roll back.
The Cleanup pipeline handles deleting the oldest version.
For more details read go/monorail-deployments and go/chrome-infra-appengine-deployments.
TODO(jojwang): Currently, notifications still need to be set up. See b/138311682
Monorail's pipelines in Spinnaker have been configured to send notifications to monorail-eng+spinnaker@google.com when:
Deploy Staging requires manual judgement at the "Continue?" stage.
Deploy Production requires manual judgement at the "Continue?" stage.
For each release cycle, a new refs/releases/monorail/[*deployment number*] branch is created at the latest [commit SHA] that we want to be part of the deployment. Spinnaker will take the [deployment number] and deploy from HEAD of the matching branch.
Manual testing steps are added during Workflow's weekly meetings for each commit between the previous release and this release.
If any step below fails, stop the deploy and ping the Monorail chat.
git ls-remote origin "refs/releases/monorail/*"
Each row will show the deployment's commit SHA followed by the branch name. The value after monorail/ is the deployment number.
git rev-parse HEAD
git checkout -b <your_release_branch_name> [*commit SHA*]
git cherry-pick -x [*cherry-picked commit SHA*]
git push origin <your_release_branch_name>:refs/releases/monorail/[*deployment number*]
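As a rough illustration of how the `git ls-remote` rows relate to the push target used above, the following Python sketch parses that output and builds the refspec. The helper names, the sample SHAs, and the idea of using "highest existing number plus one" as the next deployment number are assumptions for illustration, not part of the documented process.

```python
PREFIX = "refs/releases/monorail/"


def parse_ls_remote(output):
    """Map deployment number -> commit SHA from `git ls-remote` rows."""
    deployments = {}
    for row in output.splitlines():
        parts = row.split()
        # Each row is expected to be "<commit SHA><whitespace><ref name>".
        if len(parts) != 2 or not parts[1].startswith(PREFIX):
            continue
        suffix = parts[1][len(PREFIX):]
        if suffix.isdigit():
            deployments[int(suffix)] = parts[0]
    return deployments


def push_refspec(local_branch, deployment_number):
    """Build the '<local>:<remote ref>' argument for `git push origin`."""
    return f"{local_branch}:{PREFIX}{deployment_number}"


# Made-up sample rows; real output comes from the git ls-remote command above.
sample = "aaa111\trefs/releases/monorail/1\nbbb222\trefs/releases/monorail/2\n"
deployments = parse_ls_remote(sample)
latest = max(deployments)
print(latest, deployments[latest])
# Assumption: the next deployment number is the latest one plus one.
print(push_refspec("my_release_branch", latest + 1))
```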
tail -30 schema/alter-table-log.txt
If you don't see any changes since the last deploy, skip this section.
monorail-dev project. Please be careful when pasting into SQL prompt.
Deploy Monorail Pipeline.
(Deploy Dev, Stage: "Continue?")
(Deploy Staging, Stage: "Continue?")
(Deploy Production, Stage: "Continue?")
git log --oneline . (use --before and --after as needed).
If issues are discovered after the "Continue?" stage and traffic migration has already started: Cancel the execution and manually start the Rollback pipeline.
If issues are discovered during the monorail-staging or monorail-prod deployment, DO NOT forget to also run the Rollback pipeline for monorail-dev, or for both monorail-dev and monorail-staging, respectively.
If you are ever unsure about how to roll back or clean up unexpected Spinnaker errors, please ping the Monorail chat for help.
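The rule about which environments need a rollback can be summarized in a few lines of Python. This is only a sketch of the rule stated above. The stack names "dev", "staging", and "prod" are assumed values for the Rollback pipeline's "Stack" parameter, and `stacks_to_roll_back` is a hypothetical helper, not something Spinnaker provides.

```python
# Deployment order of the environments; each later one is deployed after the earlier ones.
STACK_ORDER = ["dev", "staging", "prod"]


def stacks_to_roll_back(failed_stack):
    """Return every stack that needs a Rollback run, failing stack first."""
    if failed_stack not in STACK_ORDER:
        raise ValueError(f"unknown stack: {failed_stack}")
    index = STACK_ORDER.index(failed_stack)
    # The failing stack plus every stack that already received the release.
    return list(reversed(STACK_ORDER[: index + 1]))


print(stacks_to_roll_back("staging"))
print(stacks_to_roll_back("prod"))
```

Because the Rollback pipeline handles only one application per run, each entry in the returned list corresponds to a separate manual run with that "Stack" value.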
git checkout -b <your_release_branch_name> [*commit SHA*]
git cherry-pick -x [*cherry-picked commit SHA*]
eval `../../go/env.py`
(replace deploy_dev with deploy_staging or deploy_prod, if appropriate): make deploy_dev
If you get ImportError: No module named six.moves, try again after running: sudo `which python` `which pip` install six