Migrating a Vespa application to Vespa Cloud is straightforward: applications on Vespa Cloud support all the same features as self-hosted Vespa instances. You just gain some new capabilities and avoid the operational work.
The high-level process is as follows:
The rest of this guide assumes you have a tenant ready for deployment:
$ export VESPA_TENANT_NAME=mytenant
An application package from a self-hosted system can be deployed as-is to the Vespa Cloud dev environment.
It is downscaled to minimum resource usage, so it supports only a low volume of documents and queries.
The root of an application package might look like this:
├── schemas
│ └── doc.sd
└── services.xml
There are often more files; the above is a minimum. This is the root of the application package, so make it the current working directory. Make sure the Vespa CLI is installed (see pre-requisites above):
$ vespa
Usage:
vespa [flags]
vespa [command]
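Before continuing, it can help to confirm that the current directory really is the package root. The following is a minimal sketch, assuming the minimal layout shown above (services.xml plus at least one schemas/*.sd file); the function name is_app_package_root is hypothetical and not part of the Vespa CLI:

```shell
# Hypothetical helper: succeeds only if DIR looks like an application package root.
# Assumes the minimal layout from this guide: services.xml and schemas/*.sd.
is_app_package_root() {
  dir="${1:-.}"
  # services.xml must be present at the root
  [ -f "$dir/services.xml" ] || return 1
  # at least one schema file must exist under schemas/
  for f in "$dir"/schemas/*.sd; do
    [ -f "$f" ] && return 0
  done
  return 1
}
```

Calling is_app_package_root with no argument checks the current working directory.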
Configure the local environment and log in to Vespa Cloud:
$ vespa config set target cloud && \
vespa config set application $VESPA_TENANT_NAME.myapp && \
vespa auth login
Create and add security credentials:
$ vespa auth cert
This adds the security directory to the application package and adds a public certificate to it:
├── schemas
│ └── doc.sd
├── security
│ └── clients.pem
└── services.xml
The command also installs a key/certificate pair in the Vespa CLI home directory, see vespa auth cert. This pair is used in subsequent accesses to the data plane for document and query operations.
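As a quick sanity check, one can confirm that the certificate file is in place before deploying. This is a minimal sketch, assuming the security/clients.pem path shown above and standard PEM encoding; has_client_cert is a hypothetical helper name, not a Vespa CLI command:

```shell
# Hypothetical helper: succeeds if DIR/security/clients.pem exists and
# contains a PEM certificate header. It does not validate the certificate.
has_client_cert() {
  pem="${1:-.}/security/clients.pem"
  [ -f "$pem" ] || return 1
  grep -q -e '-----BEGIN CERTIFICATE-----' "$pem"
}
```

Run from the application package root, has_client_cert returns a zero exit status when the file is present.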
<deployment version="1.0" cloud-account="gcp:project-name">
<dev />
</deployment>
<deployment version="1.0" cloud-account="aws:123456789012">
<dev />
</deployment>
├── deployment.xml
├── schemas
│ └── doc.sd
├── security
│ └── clients.pem
└── services.xml
At this point, the local environment and the application package are ready for deployment:
$ vespa deploy --wait 600
Please note that a first-time deployment normally takes a few minutes, as resources are provisioned.
At this point, we recommend opening the console to observe the deployed application.
The link will be https://console.vespa-cloud.com/tenant/mytenant/application/myapp/dev/instance/default
(replace with your own names); it is also easily found on the console main page.
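The link can also be assembled from the names configured earlier. A small sketch, assuming the URL pattern above holds for your tenant and that the instance is named default unless stated otherwise; console_url is a hypothetical helper:

```shell
# Hypothetical helper: prints the dev console URL for a tenant and application,
# following the URL pattern shown in this guide.
console_url() {
  tenant="$1"
  app="$2"
  instance="${3:-default}"
  printf 'https://console.vespa-cloud.com/tenant/%s/application/%s/dev/instance/%s\n' \
    "$tenant" "$app" "$instance"
}
```

For example, console_url "$VESPA_TENANT_NAME" myapp prints the link for the application used in this guide.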
If the deployment fails and the application is based on Vespa 7 (or an earlier version), refer to the Vespa 8 release notes for troubleshooting.
After a successful deployment, it is a good idea to trim the application package and make it ready for subsequent deployment to other Vespa Cloud environments.
hosts.xml is not used on Vespa Cloud; remove it.
Edit the <nodes> configuration in services.xml, from:
<container id="default" version="1.0">
<document-api/>
<document-processing/>
<search/>
<nodes>
<node hostalias="node4" />
<node hostalias="node5" />
</nodes>
</container>
to:
<container id="default" version="1.0">
<document-api/>
<document-processing/>
<search/>
<nodes count="2">
<resources vcpu="2" memory="8Gb" disk="50Gb"/>
</nodes>
</container>
In short, this is the Vespa Cloud syntax for resource specifications.
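To find <node> elements that still use the self-hosted hostalias syntax, a plain text search is usually enough. A minimal sketch, assuming hostalias attributes appear only in the node definitions to be replaced; list_hostalias_nodes is a hypothetical helper name:

```shell
# Hypothetical helper: prints the line number and content of every line in
# FILE that still references a hostalias. Returns non-zero when none are found.
list_hostalias_nodes() {
  grep -n 'hostalias=' "${1:-services.xml}"
}
```

An empty result suggests that all clusters in the file have been converted.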
Example: migrating from a cluster using the c7i.2xlarge instance type with a 200G disk. From the AWS specifications:
Instance type   vCPU   Memory (GiB)   Storage
c7i.2xlarge     8      16             EBS-Only
Equivalent Vespa Cloud configuration:
<resources vcpu="8" memory="16Gb" disk="200Gb"/>
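When several clusters need converting, the mapping from instance specifications to a <resources> element can be scripted. A minimal sketch, assuming whole-number vCPU, memory (in GB), and disk (in GB) values taken from your own instance and volume specifications; resources_element is a hypothetical helper:

```shell
# Hypothetical helper: prints a <resources> element for the given
# vCPU count, memory in GB, and disk size in GB.
resources_element() {
  vcpu="$1"
  memory_gb="$2"
  disk_gb="$3"
  printf '<resources vcpu="%s" memory="%sGb" disk="%sGb"/>\n' \
    "$vcpu" "$memory_gb" "$disk_gb"
}
```

For the example above, resources_element 8 16 200 prints the same element. For an EBS-only instance type, the disk value comes from the attached volume size rather than from the instance type.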
Repeat this for all clusters in services.xml and deploy again to validate the configuration. Notes:
In the dev environment, what is actually deployed is a minimized version. The configuration changes above are easily tested in this environment.
count=2 is best practice at this point.
The endpoints are shown in the console; one can also list them like:
$ vespa status query
Container default at https://aa1c1234.b225678e.z.vespa-app.cloud/ is ready
Test the query endpoint; expect totalCount to be 0:
$ vespa query 'select * from sources * where true'
{
"root": {
"id": "toplevel",
"relevance": 1.0,
"fields": {
"totalCount": 0
},
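For scripted checks, the totalCount value can be pulled out of the response text. This is a rough sketch, assuming the response contains a single "totalCount": N entry on one line as in the output above; total_count is a hypothetical helper, and a proper JSON parser would be more robust:

```shell
# Hypothetical helper: reads a query response on stdin and prints the first
# totalCount value found. Text matching only; the JSON is not parsed.
total_count() {
  sed -n 's/.*"totalCount"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

The query output can be piped into it, for example vespa query 'select * from sources * where true' | total_count.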
In the services.xml examples earlier in this guide, both <search> and <document-api> are configured in the same cluster, named default.
In case of multiple container clusters, select the one configured with <search>:
$ vespa query 'select * from sources * where true' --cluster myquerycluster
Finally, feed a document to the cluster (this is the cluster configured with <document-api>):
$ vespa feed mydoc.jsonl --cluster myfeedcluster
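If no feed file is at hand, a single-document file can be written for the test. A minimal sketch in the Vespa JSON feed format, assuming the doc document type from doc.sd above; the namespace mynamespace and the title field are hypothetical and must be adapted to the actual schema:

```shell
# Hypothetical helper: writes a one-line JSONL feed file with a single put
# operation. Namespace and field names are placeholders, not from the schema.
write_sample_doc() {
  cat > "${1:-mydoc.jsonl}" <<'EOF'
{"put": "id:mynamespace:doc::1", "fields": {"title": "hello"}}
EOF
}
```

Calling write_sample_doc with no argument creates mydoc.jsonl in the current directory, ready for the feed command above.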
Redo the query and observe a nonzero totalCount.
This is the final step in the functional validation. Feed the data (or a subset of it) and validate that queries and other API accesses work as expected.
At the end of the validation process, continue to production deployment to set up the application in production zones.