Automated Deployments

This document explains the process, and the safety mechanisms, that allow application changes and Vespa platform releases to be continuously deployed to production.

Each application package build which is submitted to Vespa Cloud constitutes an application change which must be tested and, if found healthy, deployed. Similarly, each change to the Vespa platform, by the Vespa team, must be tested and deployed for all the hosted applications. Vespa Cloud automates all these tests and deployments, with features including:

  • chained runs of tests and deployment, with retries of failed jobs;
  • multiple concurrent instances of an application in each zone, upgraded as specified by the user, for testing application changes in a subset of the service before rolling them out further;
  • separation of application and platform changes, making it easier to pinpoint breaking changes (application changes are always allowed when an upgrade fails, as they may be necessary to fix the breakage);
  • cancellation of any current application roll-out, upon submission of a new application revision;
  • throttling of platform upgrades, to detect unhealthy upgrades with a subset of applications; and
  • cancellation of platform upgrades which are found unhealthy, across all applications.
With Continuous Integration (CI) that builds and submits changes to the application as they are committed, Vespa Cloud thus provides full-fledged Continuous Deployment (CD) of all its applications, both for application developers, and for the Vespa team.

Setting up deployment to production

Follow the steps in Getting to Production, and also see the API reference. Summary:

  1. Create deployment.xml

    Configure where and when to deploy, choosing among the supported zones; a minimal example is shown below.

  2. Create system and staging tests

    Create at least one system and one staging test.

  3. Set up a deployment job

    Configure a job in a tool like GitHub Actions; see Continuous Deployment below.

Once set up, make changes to the production application simply by checking in the change to the application source repository.
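
A minimal sketch of such a deployment.xml, deploying to a single production zone (the region name is only an example; use a zone available to your application):

<deployment version="1.0">
    <prod>
        <region>aws-us-east-1c</region>
    </prod>
</deployment>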

Continuous Deployment

In the continuous build tool, set up a job which builds the Vespa application and ships it to Vespa Cloud. Trigger this job on merges to the main branch of the source repository where the Vespa application is stored.

The job should execute something like the following (modify as needed if not using git):

# Resolve the lowest Vespa version currently running in production for this application
$ mvn clean vespa:compileVersion \
  -DapiKey="${MY_API_KEY}"

# Build the application and test packages against that version, and submit to Vespa Cloud
$ mvn -P fat-test-application \
  package vespa:submit \
  -Dvespa.compile.version="$(cat target/vespa.compile.version)" \
  -Drepository=$(git config --get remote.origin.url) \
  -Dbranch=$(git rev-parse --abbrev-ref HEAD) \
  -Dcommit=$(git rev-parse HEAD) \
  -DauthorEmail=$(git log -1 --format=%aE) \
  -DapiKey="${MY_API_KEY}"

Track the deployment at https://console.vespa.oath.cloud/tenant/mytenant/application/myapp/deployment, or click "Deployment" in the console, and refresh the page for updates.

Keys

Deployment jobs must have access to the Application API key. See the GitHub workflow deploy-vespa-documentation-search.yaml for an example; there, the API key is stored as a secret in the repository.

Some services like Travis CI do not accept multi-line values for Environment Variables in Settings. A workaround is to use the output of

$ openssl base64 -A -a < mykey.pem && echo

in a variable, say VESPA_MYAPP_API_KEY, in Travis Settings. VESPA_MYAPP_API_KEY is then exported in the Travis environment; example output:

Setting environment variables from repository settings
$ export VESPA_MYAPP_API_KEY=[secure]

Then, before deploying/submitting to Vespa Cloud, reconstruct the key value:

MY_API_KEY=`echo $VESPA_MYAPP_API_KEY | openssl base64 -A -a -d`

and use "${MY_API_KEY}" in the deploy/submit command.

vespa:compileVersion

Vespa Cloud is backwards compatible within major versions: code compiled against an older version of the Vespa APIs can always be deployed to Vespa Cloud on the same major version. However, if the application package is compiled against a newer API version and then deployed to a zone still serving an older version, deployment can fail.

This is normally not a problem as Vespa Cloud upgrades daily.

To make sure forward compatibility is not an issue, vespa:compileVersion returns the lowest version running in production for the application. This version is then set in vespa.compile.version when building the application package.

Deployment orchestration

Vespa applications are compiled against one version of the Vespa Java artifacts, and then deployed to nodes in the cloud where the runtime Vespa version is controlled by the system. This runtime, or platform, version is also continuously updated, independently of application updates. This leads to a number of possible combinations of application packages and platform versions for each application.

Instead of a simple pipeline, Vespa deployments are orchestrated such that any deployment of an application package X to a production cluster with platform version Y is preceded by system and staging tests using the same version pair; and likewise for any upgrade of the platform to version Y of a production cluster running an application package X. Good system and staging tests therefore guard against both unfortunate changes in the application, and in the Vespa platform.

System and staging tests are mandatory; see below for how to write them.

When an application or platform change has been successfully verified in system and staging tests, it is deployed to a production zone. This deployment job may also contain verification tests that must succeed before the change rolls on to more zones. Good production tests fail if a change deployed to production impacts the observed behavior of the application negatively, typically by asserting on application metrics after a delay. If the application is deployed in multiple production zones, this makes it possible to revert to the old version quickly, by shifting traffic to another production zone.
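
As an illustration, production tests are declared in deployment.xml with a test element naming the zone in which to run them. A sketch, with example region names:

<deployment version="1.0">
    <prod>
        <region>aws-us-east-1c</region>
        <test>aws-us-east-1c</test>
        <region>aws-us-west-2a</region>
    </prod>
</deployment>

Here the change is deployed to the first region and production-tested there, and only rolls on to the second region if those tests succeed.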

Status of ongoing tests and deployments is found by clicking Deployment in the application view in the console. Examples of advanced deployment configuration which can be set in deployment.xml include:

  • Deployment order and parallelism
  • Time windows with no deployments
  • Grace periods between deployments, and before their tests
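
A sketch of a deployment.xml combining these (region names and time windows are illustrative only):

<deployment version="1.0">
    <block-change days="sat,sun" hours="0-23" time-zone="UTC"/>
    <prod>
        <parallel>
            <region>aws-us-east-1c</region>
            <region>aws-us-west-2a</region>
        </parallel>
        <delay hours="2"/>
        <region>aws-eu-west-1a</region>
    </prod>
</deployment>

This blocks roll-out during weekends, deploys to the first two regions in parallel, and waits two hours before continuing to the last region.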

Production deployments

Production jobs run sequentially by default, but can be configured in deployment.xml to run in parallel. Inside each zone, Vespa itself orchestrates the deployment, such that the application can continue to serve even as subsets of its nodes are down for upgrade. A production deployment job is not complete until the upgrade is complete on all nodes and the cluster has returned to a stable state. When the Vespa platform is upgraded, each node has to restart with the new runtime; this is typically slower than an application change by the user, which often amounts only to a reconfiguration of smaller parts of the deployment.

Deleting an application

  1. Remove all instances in deployment.xml, then run the CI job. Details.
  2. Delete the application in the console.
  3. Delete the CI job that builds and pushes new artifacts.

Feature switches and bucket tests

With CD, it is not possible to hold a feature back until it is done, test it manually until convinced it works, and only then release it to production. What to do instead? The answer is feature switches: release new features to production as they are developed, but include logic which keeps them deactivated until they are ready, or until they have been verified in production with a subset of users.

Bucket testing is the practice of systematically testing new features or behavior on a controlled subset of users. This is common practice when releasing new science models, which are difficult to verify in test, but it can also be used for other features.

To test new behavior in Vespa, use a combination of search chains and rank profiles, controlled by query profiles, where one query profile corresponds to one bucket. These features support inheritance, to make it easy to express variation without repetition.
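
As a sketch, assuming a hypothetical experimental search chain and rank profile, one bucket could be expressed as a query profile which inherits the default profile and overrides only what differs:

<!-- search/query-profiles/bucket-b.xml -->
<query-profile id="bucket-b" inherits="default">
    <field name="searchChain">experimental</field>
    <field name="ranking.profile">experimental-ranking</field>
</query-profile>

Queries are then assigned to this bucket by setting the queryProfile request parameter to bucket-b.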

Sometimes a new feature requires incompatible changes to a data field. To be able to CD such changes, it is necessary to create a new field containing the new version of the data. This costs extra resources, but less than the alternative: standing up a new system copy with the new data. New fields can be added and populated while the system is live.

It should be mentioned that the need for incompatible changes can be decreased by making the semantics of the fields more precise. E.g., if a field is defined as the "quality" of a document, where a higher number means higher quality, a new algorithm which produces a different range and distribution will typically be an incompatible change. However, if the field is defined more precisely as the average time spent on the document once it is clicked, then a new algorithm which produces better estimates of this value will not be an incompatible change. Using precise semantics also has the advantage of making it easier to understand whether the use of the data and its statistical properties are reasonable.

Integration testing

Another challenge with CD is integration testing across multiple services: another service may depend on this Vespa application for its own integration testing. There are two ways to provide for this: either create an additional application instance for testing (sketched below), or use test data in the production instance. Using test data in production requires that some thought is given to separating this data from the real data in queries. A separate instance gives complete isolation, but with some additional overhead, and it may not produce quite as realistic testing of queries, as those will run only over the test data in the separate instance.
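
A sketch of the separate-instance approach in deployment.xml, with illustrative instance ids and region names:

<deployment version="1.0">
    <instance id="default">
        <prod>
            <region>aws-us-east-1c</region>
        </prod>
    </instance>
    <instance id="integration">
        <prod>
            <region>aws-us-east-1c</region>
        </prod>
    </instance>
</deployment>

Other services can then run their integration tests against the endpoints of the integration instance, leaving the default instance and its data untouched.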