Routing and endpoints

Vespa Cloud supports multiple methods of routing requests to your application. This page describes how these routing methods work and how to configure them.

By default, each deployment of your Vespa application will have a zone endpoint. In addition to the default zone endpoint, you can configure global endpoints and application endpoints.

All endpoints for your application are available under the endpoints tab of each deployment in the console.

Endpoint types

Zone endpoint

This is the default endpoint for a deployment. Requests through a zone endpoint are sent directly to the zone.

Zone endpoints are created implicitly, one per container cluster declared in services.xml. Zone endpoints are not configurable.

The format of zone endpoints is {service}.{instance}.{application}.{tenant}.{zone}.z.vespa-app.cloud.

  • service: The name of the Vespa service, as set by the id attribute of the container element in services.xml. This is omitted if the value is default.
  • instance: The name of the Vespa application instance as defined in deployment.xml. This is omitted if the value is default.
  • application: The name of your application.
  • tenant: The name of your tenant.
  • zone: The zone you deployed your application in, as defined in deployment.xml.
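
As an illustration of the format above, the following sketch shows how a container cluster id maps to a zone endpoint. The tenant and application names (mytenant, myapp) are hypothetical placeholders; the cluster id query and the zone aws-us-east-1c are borrowed from the deployment.xml example at the end of this page.

```xml
<services version="1.0">
  <!-- The id attribute becomes the {service} part of the zone endpoint -->
  <container id="query" version="1.0">
    <!-- cluster contents omitted from this sketch -->
  </container>
</services>
<!--
  With the hypothetical tenant "mytenant" and application "myapp",
  the instance "default" (omitted from the name) and the zone
  aws-us-east-1c, the zone endpoint would be:
  query.myapp.mytenant.aws-us-east-1c.z.vespa-app.cloud
-->
```

Had the container id been default as well, the service part would also be dropped, leaving myapp.mytenant.aws-us-east-1c.z.vespa-app.cloud.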

Global endpoint

A global endpoint is an endpoint that can route requests to multiple zones. It is configured in deployment.xml. Similar to how a CDN works, requests through this endpoint are routed to the zone that is geographically nearest to the client.

The format of global endpoints is {endpoint-id}.{instance}.{application}.{tenant}.g.vespa-app.cloud

Global endpoints do not support feeding. Feeding must be done through zone endpoints.

  • endpoint-id: The endpoint ID as defined in deployment.xml. This is omitted if the value is default.
  • instance: The name of the Vespa application instance as defined in deployment.xml. This is omitted if the value is default.
  • application: The name of your application.
  • tenant: The name of your tenant.
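
A minimal sketch of a global endpoint declaration, modeled on the migration example at the end of this page. The id attribute on the endpoint element and the names my-endpoint, myapp and mytenant are assumptions made for illustration; check the deployment.xml reference for the exact schema.

```xml
<deployment version="1.0">
  <prod>
    <region>aws-us-east-1c</region>
    <region>aws-us-west-2a</region>
  </prod>
  <endpoints>
    <!-- id is assumed to set {endpoint-id}; container-id names the
         container cluster in services.xml that serves the endpoint -->
    <endpoint id="my-endpoint" container-id="query">
      <region>aws-us-east-1c</region>
      <region>aws-us-west-2a</region>
    </endpoint>
  </endpoints>
</deployment>
```

Under these assumptions, and with the default instance, the endpoint would be my-endpoint.myapp.mytenant.g.vespa-app.cloud, and each request would be routed to whichever of the two zones is nearest to the client.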

Application endpoint

An application endpoint is an endpoint that can span multiple instances of your application (but not multiple zones). It is configured in deployment.xml.

Each instance present in the endpoint is assigned a configurable weight. Requests through the endpoint are then distributed across instances according to the relative weight. The portion of traffic an instance receives when queried through an application endpoint can be calculated as follows: instance weight / sum of all weights

The format of application endpoints is {endpoint-id}.{application}.{tenant}.r.vespa-app.cloud

Application endpoints do not support feeding. Feeding must be done through zone endpoints.

  • endpoint-id: The endpoint ID as defined in deployment.xml. This is omitted if the value is default.
  • application: The name of your application.
  • tenant: The name of your tenant.
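
The weighting described above might be declared roughly as follows. This is a sketch only: the region and weight attributes, the instance child elements, and the names beta, main and my-endpoint are assumptions not taken from this page, so verify them against the deployment.xml reference before use.

```xml
<deployment version="1.0">
  <instance id="beta">
    <prod>
      <region>aws-us-east-1c</region>
    </prod>
  </instance>
  <instance id="main">
    <prod>
      <region>aws-us-east-1c</region>
    </prod>
  </instance>
  <endpoints>
    <!-- Assumed syntax: one endpoint spanning two instances in one region -->
    <endpoint id="my-endpoint" container-id="query" region="aws-us-east-1c">
      <instance weight="1">beta</instance>
      <instance weight="9">main</instance>
    </endpoint>
  </endpoints>
</deployment>
```

With weights 1 and 9, the sum of all weights is 10, so beta would receive 1/10 (10%) of requests and main 9/10 (90%). For a hypothetical application myapp under tenant mytenant, the endpoint would be my-endpoint.myapp.mytenant.r.vespa-app.cloud.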

Routing control

Vespa Cloud has two mechanisms for manually controlling routing of requests to a zone:

  • Changing the active attribute on the <region> element in deployment.xml and submitting a new version of your application.
  • Changing the status through the console.

This section describes the latter mechanism. Navigate to the relevant deployment of your application in the console. Hovering over the GLOBAL ROUTING badge will display the current status and when it was last changed.

Change status

In case of a production emergency, a zone can be manually set OUT to prevent it from receiving requests:

Hover over the GLOBAL ROUTING badge for the problematic deployment and click Deactivate.

Inspecting the status will now show it set to OUT. To set the zone back IN and have it resume receiving requests, hover over the GLOBAL ROUTING badge again and click Activate.

Behaviour

Changing the routing status is independent of the endpoint type used. You're technically overriding the routing status the deployment reports to the Vespa Cloud routing infrastructure. This means that a change to routing status affects both global endpoints and application endpoints.

Deactivating a deployment disables routing of requests to that deployment through both global endpoints and application endpoints until the deployment is activated again. As routing through these endpoints is DNS-based, it may take between 5 and 15 minutes for all traffic to shift to other deployments.

If all deployments of an endpoint are deactivated, requests are distributed as if all deployments were active. This is because attempting to route traffic according to the original configuration is preferable to discarding all requests.

Migrating from deprecated syntax for global endpoints

Global endpoints were initially configured by setting the global-service-id attribute on the <prod> element in deployment.xml. To control which deployments were part of your global endpoint, you had to set the active attribute on the <region> element.

These attributes have since been replaced by the <endpoints> element. The legacy syntax is still supported, but migrating to the new syntax is encouraged. Migration from deprecated syntax to the current syntax is seamless and does not cause service disruption.

The following example shows how to move from the deprecated syntax to the current one:

<!-- Deprecated variant -->
<deployment version="1.0">
  <prod global-service-id="query">
    <region active="true">aws-us-east-1c</region>
    <region active="true">aws-us-west-2a</region>
    <region active="false">aws-eu-west-1a</region>
  </prod>
</deployment>

<!-- Current variant -->
<deployment version="1.0">
  <prod>
    <region>aws-us-east-1c</region>
    <region>aws-us-west-2a</region>
    <region>aws-eu-west-1a</region>
  </prod>
  <endpoints>
    <endpoint container-id="query">
      <region>aws-us-east-1c</region>
      <region>aws-us-west-2a</region>
    </endpoint>
  </endpoints>
</deployment>