Production zones enable serving from various locations, with a CI/CD pipeline for safe deployments. This guide goes through the minimal steps for a production deployment - in short:
The sample application used in getting started is a good basis for these steps, see source files.
Read migrating to Vespa Cloud first, as a primer on deployment and endpoint usage.
There are alternative ways of deploying at the end of this guide, too.
Add a <prod> element to deployment.xml:
If deployment.xml does not exist, add it to the application package root (next to services.xml).
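As a sketch, a minimal deployment.xml could look like the following - the region name aws-us-east-1c is only an example, pick the zone(s) you want from the console:

```xml
<deployment version="1.0">
    <prod>
        <region>aws-us-east-1c</region>
    </prod>
</deployment>
```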
Modify services.xml - minimal example:
For production deployments, at least 2 nodes are required for each cluster to ensure availability during maintenance tasks and upgrades. The nodes section is also where you specify your required resources. Also note the minimum redundancy requirement of 2:
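As an illustration, a minimal services.xml for production could look like the sketch below - the cluster ids and the music document type are assumptions borrowed from the sample application, not requirements:

```xml
<services version="1.0">
    <container id="default" version="1.0">
        <document-api/>
        <search/>
        <nodes count="2">
            <resources vcpu="2" memory="8Gb" disk="50Gb"/>
        </nodes>
    </container>
    <content id="music" version="1.0">
        <redundancy>2</redundancy>
        <documents>
            <document type="music" mode="index"/>
        </documents>
        <nodes count="2">
            <resources vcpu="2" memory="8Gb" disk="50Gb"/>
        </nodes>
    </content>
</services>
```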
To help ensure a reliable service, there is a minimum resource requirement for nodes in the production environment. The minimum is currently 0.5 VCPU, 8 GB of memory, and for disk, 2 x memory for stateless nodes, or 3 x memory for content nodes. As disk is normally the least expensive resource, we recommend allocating it generously to ensure it does not limit the use of the more expensive CPU and memory resources.
Give the deployment a name and log in:
$ vespa config set target cloud
$ vespa config set application mytenant.myapp
$ vespa auth login
The tenant name is found in the console, the application is something unique within your organization - see tenants, applications and instances.
Just as in the getting started guide, the application package needs the public key in the security directory. You might already have a pair, if not generate it:
$ vespa auth cert -f
Success: Certificate written to security/clients.pem
Success: Certificate written to /Users/me/.vespa/mytenant.myapp.default/data-plane-public-cert.pem
Success: Private key written to /Users/me/.vespa/mytenant.myapp.default/data-plane-private-key.pem
Observe that the files are put in $HOME/.vespa. The content from data-plane-public-cert.pem is copied to security/clients.pem. More details on data-plane access control permissions.
Package the application and deploy it to a production zone:
$ vespa prod deploy
Find alternative deployment procedures in the next sections.
The vespa prod deploy command for prod zones, which uses deployment.xml, differs from the vespa deploy command used for dev/perf zones - see environments.
Find the 'zone' endpoint to use under Endpoints in the console. There is an mTLS endpoint for each zone by default. See configuring mTLS for how to use mTLS certificates.
You can also add access tokens in the console as an alternative to mTLS, and specify global and private endpoints in deployment.xml.
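To illustrate, a global endpoint can be declared in deployment.xml like this - the endpoint id, container id, and region below are assumptions for the sketch:

```xml
<deployment version="1.0">
    <prod>
        <region>aws-us-east-1c</region>
    </prod>
    <endpoints>
        <endpoint id="myglobal" container-id="default">
            <region>aws-us-east-1c</region>
        </endpoint>
    </endpoints>
</deployment>
```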
Write data efficiently using the document/v1 API using HTTP/2, or with the Vespa CLI. There is also a Java library.
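To sketch the feed format: vespa feed reads document operations as JSON, one operation per line in a JSONL file. The music document type and its fields below are assumptions for illustration, not from this guide:

```python
import json

# Sketch: build a JSONL file in the format `vespa feed` accepts.
# The "music" document type and fields are assumed for illustration.
docs = [
    {"put": "id:mynamespace:music::doc-1",
     "fields": {"artist": "Coldplay", "album": "Parachutes"}},
    {"put": "id:mynamespace:music::doc-2",
     "fields": {"artist": "Muse", "album": "Absolution"}},
]
with open("feed.jsonl", "w") as f:
    for doc in docs:
        f.write(json.dumps(doc) + "\n")
```

The file can then be fed with vespa feed feed.jsonl, using the zone endpoint as target.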
To feed data from a self-hosted Vespa into a new cloud instance, see the appendix or cloning applications and data.
Also see the http best practices documentation.
Use deploy-vector-search.yaml as a starting point, and see Automating with GitHub Actions for more information.
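A workflow along these lines could look like the sketch below - the action name, secret name, and trigger branch are assumptions, so check deploy-vector-search.yaml for the authoritative version:

```yaml
name: Deploy to Vespa Cloud
on:
  push:
    branches: [ main ]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Vespa CLI
        uses: vespa-engine/setup-vespa-cli-action@v1
      - name: Deploy to prod
        env:
          # Assumed secret name holding the Vespa Cloud API key
          VESPA_CLI_API_KEY: ${{ secrets.VESPA_API_KEY }}
        run: |
          vespa config set target cloud
          vespa config set application mytenant.myapp
          vespa prod deploy
```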
Instead of using the Vespa CLI, one can build an application package for production deployment using zip only:
$ zip -r application.zip . \
  -x application.zip "ext/*" README.md .gitignore ".idea/*"
Deploying an application with Components is a little different from above:
See Getting started java for prerequisites. Procedure:
$ mvn vespa:compileVersion \
  -Dtenant=mytenant \
  -Dapplication=myapp
$ mvn -U package -Dvespa.compile.version="$(cat target/vespa.compile.version)"
To dump data from an existing Vespa instance, you can use this command with Vespa CLI:
slices=10
for slice in $(seq 0 $((slices-1))); do
  vespa visit \
    --slices $slices --slice-id $slice \
    --target [existing Vespa instance endpoint] \
    | gzip > dump.$slice.gz &
done
This dumps all the content to files, but you can also pipe the content directly into 'vespa feed'.
To feed the data:
slices=10
for slice in $(seq 0 $((slices-1))); do
  zcat dump.$slice.gz | \
    vespa feed \
      --application <tenant>.<application>.<instance> \
      --target [zone endpoint from the Vespa Console] -
done
Note that the different slices in these commands can be run in parallel on different machines.
A common challenge when deploying on the public cloud is network connectivity between workloads running in different accounts and VPCs. Within a team, this is often resolved by setting up VPC peering between VPCs, but this has its challenges when coordinating between many different teams and dynamic workloads. Vespa does not support direct VPC peering.
There are three recommended options: