Production zones enable serving from various locations,
with a CI/CD pipeline for safe deployments.
This guide walks through the minimal steps for a production deployment.
For production deployments, at least 2 nodes are required for each
cluster to ensure availability during maintenance tasks and upgrades.
The nodes section is also where you specify your required resources.
Also note the minimum redundancy requirement of 2:
<min-redundancy>2</min-redundancy>
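As a rough pre-deployment sanity check, the redundancy requirement above could be verified with a small script. This is only a sketch: `check_min_redundancy` is a hypothetical helper, not a Vespa CLI command, and it does a plain text match rather than real XML parsing, so it assumes each element is written on one line without inner whitespace.

```shell
# check_min_redundancy FILE
# Hypothetical helper: succeeds only if FILE contains at least one
# <min-redundancy>N</min-redundancy> element and every N is 2 or more.
check_min_redundancy() {
    file=$1
    status=0
    # Extract the numeric value of each min-redundancy element (text match only).
    values=$(sed -n 's:.*<min-redundancy>\([0-9][0-9]*\)</min-redundancy>.*:\1:p' "$file")
    if [ -z "$values" ]; then
        echo "no <min-redundancy> element found in $file" >&2
        return 1
    fi
    for value in $values; do
        if [ "$value" -lt 2 ]; then
            echo "min-redundancy is $value, expected at least 2" >&2
            status=1
        fi
    done
    return $status
}

# Usage, assuming the element lives in services.xml:
# check_min_redundancy services.xml && echo "redundancy OK"
```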
Minimum resources
To help ensure a reliable service, there is a minimum resource requirement for nodes in the production environment.
The minimum is currently 0.5 vCPU, 8 GB of memory, and for disk, 2 x memory for stateless nodes, or 3 x memory for content nodes.
As disk is normally the least expensive resource,
we recommend allocating it generously so that it does not limit the use of the more expensive CPU and memory resources.
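To make the disk rule concrete, the arithmetic can be sketched as a small shell function. `min_disk_gb` is a hypothetical helper written for illustration; it only encodes the multipliers stated above and assumes memory is given as a whole number of GB.

```shell
# min_disk_gb MEMORY_GB NODE_TYPE
# Hypothetical helper: prints the minimum disk size in GB.
# Multipliers come from the text: 2 x memory for stateless nodes,
# 3 x memory for content nodes.
min_disk_gb() {
    memory_gb=$1
    node_type=$2
    case "$node_type" in
        stateless) echo $((memory_gb * 2)) ;;
        content)   echo $((memory_gb * 3)) ;;
        *) echo "unknown node type: $node_type" >&2; return 1 ;;
    esac
}

# With the minimum memory of 8 GB:
min_disk_gb 8 stateless   # prints 16
min_disk_gb 8 content     # prints 24
```

These are lower bounds only; per the recommendation above, allocating more disk than the minimum is usually the better choice.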
Application name
Give the deployment a name and log in:
$ vespa config set target cloud
$ vespa config set application mytenant.myapp
$ vespa auth login
The tenant name is found in the console; the application name is something unique within your organization.
See tenants, applications and instances.
Add public certificate
Just as in the getting started guide,
the application package needs the public key in the security directory.
You might already have a key pair; if not, generate one:
$ vespa auth cert -f
Success: Certificate written to security/clients.pem
Success: Certificate written to /Users/me/.vespa/mytenant.myapp.default/data-plane-public-cert.pem
Success: Private key written to /Users/me/.vespa/mytenant.myapp.default/data-plane-private-key.pem
Observe that the files are put in $HOME/.vespa.
The content from data-plane-public-cert.pem is copied to security/clients.pem.
See data-plane access control permissions for more details.
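One way to confirm that the certificate in the application package matches the one stored under $HOME/.vespa is a byte-for-byte comparison. The sketch below uses a hypothetical helper, `cert_in_sync`, and assumes security/clients.pem contains only this single certificate; if the file holds several certificates, the comparison will report a difference even when the certificate is present.

```shell
# cert_in_sync APP_DIR CLI_DIR
# Hypothetical helper, not a Vespa CLI command.
# APP_DIR: application package directory (contains security/clients.pem)
# CLI_DIR: CLI application directory, e.g. $HOME/.vespa/mytenant.myapp.default
# Succeeds when both certificate files are byte-for-byte identical.
cert_in_sync() {
    app_dir=$1
    cli_dir=$2
    cmp -s "$app_dir/security/clients.pem" "$cli_dir/data-plane-public-cert.pem"
}

# Usage (paths follow the example output above):
# cert_in_sync . "$HOME/.vespa/mytenant.myapp.default" || echo "certificates differ" >&2
```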
Deploy the application
Package the application and deploy it to a production zone:
$ vespa prod deploy
Find alternative deployment procedures in the next sections.
Note:
The vespa prod deploy command, which deploys to prod zones
and uses deployment.xml,
differs from the vespa deploy command used for dev/perf zones.
See environments.
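In a deployment script, the distinction between the two commands could be captured in one place. The following is a sketch only: `deploy_command` is a hypothetical helper that prints the base command named in this guide for a given environment, without any zone or other options you may need.

```shell
# deploy_command ENVIRONMENT
# Hypothetical helper: prints the base CLI command for the environment,
# as described in the note above.
deploy_command() {
    case "$1" in
        prod)     echo "vespa prod deploy" ;;
        dev|perf) echo "vespa deploy" ;;
        *) echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# Usage:
# $(deploy_command prod)
```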
Endpoints
Find the 'zone' endpoint to use under Endpoints in the console.
There is an mTLS endpoint for each zone by default.
See configuring mTLS
for how to use mTLS certificates.
You can also add access tokens
in the console as an alternative to mTLS,
and specify global
and private endpoints
in deployment.xml.
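As an illustration of calling an mTLS endpoint, the sketch below wraps curl with the key pair generated earlier. `mtls_request` is a hypothetical helper; the file names follow the vespa auth cert output shown in this guide, and the /ApplicationStatus path is an assumption used here as a simple test request. The endpoint argument is expected without a trailing slash.

```shell
# mtls_request ENDPOINT CLI_DIR
# Hypothetical helper, not part of the Vespa CLI.
# ENDPOINT: zone endpoint copied from the console, no trailing slash
# CLI_DIR:  directory holding the key pair, e.g. $HOME/.vespa/mytenant.myapp.default
mtls_request() {
    endpoint=$1
    cli_dir=$2
    # --cert and --key present the client certificate and private key to the server.
    curl --silent --show-error \
        --cert "$cli_dir/data-plane-public-cert.pem" \
        --key "$cli_dir/data-plane-private-key.pem" \
        "$endpoint/ApplicationStatus"
}

# Usage:
# mtls_request "https://<zone endpoint>" "$HOME/.vespa/mytenant.myapp.default"
```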
Add a public certificate to security/clients.pem.
See creating a self-signed certificate
for how to create the key/cert pair, then copy the cert file to security/clients.pem.
At this point, the files are ready for deployment.
Click Create Application in the console.
Select the PROD tab.
Enter a name for the application and drop the application.zip file in the upload section.
Click Create and deploy to deploy the application to the production environment.
Production deployment with components
Deploying an application with Components
is a little different from above:
The application package root is at src/main/application.
Find the Vespa API version to compile the component.
The application package is built into a zip artifact before it is deployed.
Run the Deploy the application step.
Here, the Vespa CLI command will deploy target/application.zip built in the step above.
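Before running the deploy step for an application with components, it can help to confirm that the expected layout and build artifact are in place. The sketch below is a hypothetical helper, `check_component_build`, that checks only the two paths mentioned above; it does not build anything or inspect the contents of the zip file.

```shell
# check_component_build PROJECT_DIR
# Hypothetical helper: verifies the layout described above.
# Succeeds when the application package root and the built artifact both exist.
check_component_build() {
    project_dir=$1
    if [ ! -d "$project_dir/src/main/application" ]; then
        echo "missing application package root: src/main/application" >&2
        return 1
    fi
    if [ ! -f "$project_dir/target/application.zip" ]; then
        echo "missing target/application.zip; build the application first" >&2
        return 1
    fi
}

# Usage:
# check_component_build . && vespa prod deploy
```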
Next steps
Vespa Cloud takes responsibility for rolling out application changes
to all production zones as well as testing the changes first.
You will usually want to set up a job which automatically builds your application package
when changes to it are checked in, to get continuous deployment of your application.
Read automated deployments
for automation, adding CD tests and multi-zone deployments.
Once you have experience with load patterns, consider autoscaling.
To dump data from an existing Vespa instance, you can use this command with Vespa CLI:
slices=10
for slice in $(seq 0 $((slices-1))); do
    vespa visit \
        --slices $slices --slice-id $slice \
        --target [existing Vespa instance endpoint] \
        | gzip > dump.$slice.gz &
done
This dumps all the content to files, but you can also pipe the content directly into 'vespa feed'.
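The direct-pipe variant could look roughly like the sketch below. `copy_slice` is a hypothetical helper that uses only the flags already shown in this guide; the endpoint and application arguments are placeholders you supply, and no intermediate dump file is written.

```shell
# copy_slice SLICES SLICE_ID SOURCE_TARGET APPLICATION DEST_TARGET
# Hypothetical helper: visits one slice of the source instance and pipes
# the documents straight into vespa feed, which reads them from stdin ("-").
copy_slice() {
    slices=$1
    slice=$2
    source_target=$3
    app=$4
    dest_target=$5
    vespa visit \
        --slices "$slices" --slice-id "$slice" \
        --target "$source_target" \
    | vespa feed \
        --application "$app" \
        --target "$dest_target" -
}

# Usage, copying slice 0 of 10:
# copy_slice 10 0 "<existing Vespa instance endpoint>" \
#     "<tenant>.<application>.<instance>" "<zone endpoint from the Vespa Console>"
```

Skipping the dump files saves disk space, at the cost of having to re-read from the source instance if the feed has to be retried.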
To feed the data:
slices=10
for slice in $(seq 0 $((slices-1))); do
    zcat dump.$slice.gz | \
    vespa feed \
        --application <tenant>.<application>.<instance> \
        --target [zone endpoint from the Vespa Console] -
done
Note that the slices in these commands can be processed in parallel on different machines.
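One simple way to divide the slices between machines is round-robin assignment. The sketch below is an illustration under that assumption: `slices_for_machine` is a hypothetical helper that prints the slice ids a given machine should handle, which can then replace the `seq` call in the loops above.

```shell
# slices_for_machine TOTAL_SLICES MACHINES MACHINE_INDEX
# Hypothetical helper: prints, one per line, the slice ids assigned to the
# machine with the given 0-based index, using round-robin assignment.
slices_for_machine() {
    total=$1
    machines=$2
    index=$3
    slice=$index
    while [ "$slice" -lt "$total" ]; do
        echo "$slice"
        slice=$((slice + machines))
    done
}

# Machine 1 of 3, with 10 slices in total:
slices_for_machine 10 3 1   # prints 1, 4 and 7 on separate lines
```

Every slice id from 0 to TOTAL_SLICES-1 is assigned to exactly one machine, so running the loop on all machines covers the whole data set once.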
Accessing a public cloud application from another VPC on another account
A common challenge when deploying on the public cloud is network connectivity between workloads
running in different accounts and VPCs. Within a team, this is often resolved by setting up
VPC peering between VPCs, but this is hard to coordinate across many different
teams and dynamic workloads. Vespa does not support direct VPC peering.
There are three recommended options:
Use your public endpoints, with IPv6 if you can: This is the default.
There are many advantages to a Zero-Trust approach and accessing your application through the public endpoint.
If you use IPv6, you will also avoid some of the network costs associated with IPv4 NATs, etc.
For some applications, this option could be cost prohibitive,
but one should not assume this is the case for all applications
with a moderate amount of data being transferred over the endpoint.
Use private endpoints via AWS PrivateLink or GCP Private Service Connect:
Vespa allows you to set up private endpoints for exclusive access from your own, co-located VPCs.
This requires less administrative overhead than general VPC peering and is also more secure.
Refer to private endpoints.
Run Vespa workloads in your own account/project (Enclave):
The Vespa Enclave feature allows you to have all your Vespa workloads run in your own account.
In this case, you can set up any required peering to open the connection into your application.
While generally available, using Vespa Cloud Enclave requires significantly more effort
from the application team in terms of operating the service,
and is only recommended for larger applications that can justify the additional work
from e.g., a security or interoperability perspective.
Refer to Vespa Enclave.