services.xml: Vespa Cloud

services.xml is the primary Vespa configuration file. This page documents the services.xml amendments for Vespa Cloud. See the services.xml reference for the general format.


In cloud applications nodes are specified by count and node resources. Example:

<nodes count="4">
  <resources vcpu="2.5" memory="32Gb" disk="1Tb"/>
</nodes>

Subelements: resources


  • count: An integer or range. The number of nodes of the cluster.
  • exclusive (optional): true or false (default false). If true, the nodes of this cluster will be placed on hosts shared only with other nodes of the same application. This is useful for container clusters in applications storing sensitive data or secrets, as it adds another layer of protection against leaking sensitive information between applications sharing a host.
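
As a minimal sketch of the exclusive attribute, the fragment below requests two container nodes on hosts that are not shared with other applications. The cluster id, node count and resource values are illustrative assumptions, not recommendations.

```xml
<!-- Hypothetical container cluster: id, count and resources are assumptions -->
<container id="default" version="1.0">
  <nodes count="2" exclusive="true">
    <resources vcpu="2" memory="8Gb" disk="50Gb"/>
  </nodes>
</container>
```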

In addition there are some attributes for specific cluster types, listed below.

<nodes> for <container>

  • of (optional): A content cluster id. This attribute can be used to specify that this container service should run on the nodes of a content service. If of is specified, only jvmargs can be set in addition. No other attributes can be set, as the ones of the referenced content cluster will be used instead.
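
To illustrate the of attribute, the sketch below runs a container cluster on the nodes of a content cluster. The ids and the document type (all "music") are hypothetical, and the content cluster is reduced to the parts needed to show the reference.

```xml
<!-- The container declares no nodes of its own and reuses those of "music" -->
<container id="default" version="1.0">
  <nodes of="music"/>
</container>

<!-- Hypothetical content cluster: counts and resources are assumptions -->
<content id="music" version="1.0">
  <redundancy>2</redundancy>
  <documents>
    <document type="music" mode="index"/>
  </documents>
  <nodes count="2">
    <resources vcpu="2" memory="8Gb" disk="50Gb"/>
  </nodes>
</content>
```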

<nodes> for <content>

  • groups (optional): Integer or range. Sets the number of groups into which content nodes should be divided. Each group will have an equal share of the nodes and redundancy copies of the corpus, and each query will be routed to just one group. This allows scaling to a higher query load than within a single group.

    Note that in Vespa Cloud, redundancy means replicas per group. In Open Source Vespa, redundancy is the total number of replicas across groups.
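
A sketch of grouped content nodes, assuming a hypothetical cluster and document type named music: six nodes in two groups gives three nodes per group. With redundancy 2 interpreted per group, as described above for Vespa Cloud, each group keeps two replicas, four in total.

```xml
<!-- Hypothetical content cluster: id, document type and counts are assumptions -->
<content id="music" version="1.0">
  <!-- In Vespa Cloud this is the number of replicas per group -->
  <redundancy>2</redundancy>
  <documents>
    <document type="music" mode="index"/>
  </documents>
  <!-- 6 nodes split into 2 groups of 3; each query is routed to one group -->
  <nodes count="6" groups="2">
    <resources vcpu="4" memory="16Gb" disk="200Gb"/>
  </nodes>
</content>
```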

<nodes> for <controllers>, <slobroks> and <logservers>

The nodes element nested in these elements allows specifying whether the nodes used should be dedicated to the service or if it should run on existing nodes. Attribute:

  • dedicated (optional): true or false (default false). Whether or not separate nodes should be allocated for this service.


The resources element is contained in the nodes element and specifies each node's resource requirements. Setting resources is a powerful way to optimize cost and performance. For new launches, allocate enough to reduce risk, then use the performance guides to find the sweet spot that balances cost and free capacity, based on real production load. Migration to new capacity is automated; read more in elastic Vespa.

Subelements: None


  • vcpu: Float or range. CPU, virtual threads.
  • memory: Float or range, each followed by a byte unit, such as "Gb". Memory.
  • disk: Float or range, each followed by a byte unit, such as "Gb". Disk space.
  • storage-type (optional): String (enum). The type of storage to use. This is useful to specify local storage when network storage provides insufficient io operations or too noisy io performance.
      • local: Node-local storage is required.
      • remote: Network storage must be used.
      • any (default): Both remote or local storage may be used.
  • disk-speed (optional): String (enum). The required disk speed category.
      • fast (default): SSD-like disk speed is required.
      • slow: This is sized for spinning disk speed.
      • any: Performance does not depend on disk speed (often suitable for container clusters).
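
For illustration, the fragment below combines the optional storage attributes with vcpu, memory and disk, requiring node-local storage with SSD-like speed. The values are assumptions chosen only to show the syntax.

```xml
<!-- Illustrative values only; not a sizing recommendation -->
<nodes count="4">
  <resources vcpu="4" memory="16Gb" disk="300Gb"
             storage-type="local" disk-speed="fast"/>
</nodes>
```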

Autoscaling ranges

Resources specified as a range will be autoscaled by the system. Ranges are expressed by the syntax [lower-limit, upper-limit]. Both limits are inclusive.

Autoscaling will attempt to keep utilization of all allocated resources close to ideal, and will automatically reconfigure to the cheapest option allowed by the ranges when necessary.

The ideal utilization takes into account that a node may be down or failing, that another region may be down causing a doubling of traffic, and that headroom is needed for maintenance operations and for handling requests with low latency. Autoscaling acts on what it has observed on your system in the recent past. If you need much more capacity in the near future than you do currently, you may want to set the lower limit to take this into account. Upper limits should be set to the maximum size that makes business sense.

When a new cluster (or application) is deployed it will initially be configured with the minimal resources given by the ranges. When autoscaling is turned on for an existing cluster, it will continue unchanged until autoscaling determines that a change is beneficial.


Autoscaling node count:

<nodes count="[4, 8]">
  <resources vcpu="2.5" memory="32Gb" disk="100Gb" disk-speed="any"/>
</nodes>

Autoscaling on all resources:

<nodes count="[4, 8]" groups="[1, 2]">
  <resources vcpu="[2.5, 8]" memory="[32Gb, 150Gb]" disk="[100Gb, 1Tb]"/>
</nodes>

<admin version="4.0">

Admin version 4 is used for explicit control over the number of admin services running and whether these should run on dedicated nodes or on some existing container cluster in the application. In most cases, there is no need to specify this explicitly.


  • <slobroks><nodes .../>: 0 or 1. Controls the nodes used for instance internal service location brokering.
  • <logservers><nodes dedicated='true' .../>: 0 or 1. Controls the node used as log server. At most 1 node can be configured.
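
A hedged sketch of an admin element using version 4.0, based on the elements listed above and the dedicated attribute described earlier on this page. The node counts are illustrative assumptions.

```xml
<admin version="4.0">
  <!-- Service location brokers on their own nodes; the count is an assumption -->
  <slobroks>
    <nodes count="3" dedicated="true"/>
  </slobroks>
  <!-- At most one log server node can be configured -->
  <logservers>
    <nodes count="1" dedicated="true"/>
  </logservers>
</admin>
```

In most cases the admin element can be left out entirely, as noted above.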


The controllers element, under content, provides control over the nodes used as cluster controllers in this cluster on Vespa Cloud. If this element is not specified, 3 nodes from the content cluster are assigned as cluster controllers (from different groups if applicable). If it is specified, there is one mandatory sub-element, nodes.
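
As a sketch, the fragment below asks for dedicated cluster controller nodes instead of the default of 3 nodes picked from the content cluster. The cluster id, document type and counts are hypothetical.

```xml
<!-- Hypothetical content cluster: id, document type and counts are assumptions -->
<content id="music" version="1.0">
  <redundancy>2</redundancy>
  <documents>
    <document type="music" mode="index"/>
  </documents>
  <!-- nodes is the one mandatory sub-element of controllers -->
  <controllers>
    <nodes count="3" dedicated="true"/>
  </controllers>
  <nodes count="4"/>
</content>
```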

Instance, environment and region variants

Application packages support defining different configuration settings for different instances, environments and regions. To use this you must import the deploy namespace:

<services version="1.0" xmlns:deploy="vespa">
Deploy directives are used to specify in which instance, environment and/or region an XML element should be included:
<content version="1.0">
    <document type="" mode="index" />
  <nodes deploy:environment="dev" count="1" />
  <nodes deploy:environment="prod" deploy:region="aws-us-east-1c" count="20" />
  <nodes deploy:environment="prod" deploy:region="aws-ap-northeast-1a" count="40" />
  <nodes deploy:environment="prod" deploy:region="aws-ap-northeast-1a" deploy:instance="alpha" count="4" />
</content>
This example configures a different node count depending on the deployment target. Deploying the application in the dev environment gives:
<content version="1.0">
    <document type="" mode="index" />
  <nodes count="1" />
</content>
Whereas in it is:
<content version="1.0">
    <document type="" mode="index" />
  <nodes count="60" />
</content>

This can be used to modify any config by deployment target.

The deploy directives have a set of override rules:

  • A directive specifying more conditions will override one specifying fewer.
  • Directives are inherited in child elements.
  • When multiple XML elements with the same name are specified (e.g. when specifying search or docproc chains), the id attribute of the element is used together with the element name when applying directives.
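
The first rule can be sketched as follows, assuming a hypothetical content cluster and document type named music and illustrative node counts. The element with two conditions wins in prod in aws-us-east-1c, while other prod regions get the element with one condition.

```xml
<services version="1.0" xmlns:deploy="vespa">
  <!-- Hypothetical cluster: id, document type and counts are assumptions -->
  <content id="music" version="1.0">
    <redundancy>2</redundancy>
    <documents>
      <document type="music" mode="index"/>
    </documents>
    <!-- One condition: applies to prod deployments in general -->
    <nodes deploy:environment="prod" count="4"/>
    <!-- Two conditions: overrides the element above in prod aws-us-east-1c -->
    <nodes deploy:environment="prod" deploy:region="aws-us-east-1c" count="8"/>
  </content>
</services>
```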

Some overrides are applied by default in some environments, see environments. Any override made explicitly for an environment will override the defaults for it.

Specifying multiple targets

More than one instance, region or environment can be specified in the attribute, separated by spaces. Notes:

  • The region attribute is only respected if the given environment exists in multiple regions. This is currently true for prod and dev.
  • An element which only specifies a region will match both the prod and dev environments in that region.
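
A small sketch of these notes, with a hypothetical container cluster and illustrative counts: the second nodes element lists two regions and no environment, so it would match both prod and dev deployments in those regions, while the first element applies elsewhere.

```xml
<services version="1.0" xmlns:deploy="vespa">
  <!-- Hypothetical container cluster: id and counts are assumptions -->
  <container id="default" version="1.0">
    <!-- No directive: used where no more specific element matches -->
    <nodes count="2"/>
    <!-- Two regions in one attribute, separated by a space -->
    <nodes deploy:region="aws-us-east-1c aws-ap-northeast-1a" count="4"/>
  </container>
</services>
```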

The namespace can be applied to any element. Example:

<container id="default" version="1.0" deploy:environment="perf test staging prod">
    <chain id="default" inherits="vespa">
      <searcher bundle="basic-application" id="">
        <config name="example.message">
          <message>Hello from application config</message>
          <message deploy:region="aws-us-east-1c">Hello from east colo!</message>
        </config>
      </searcher>
    </chain>
</container>

Above, the container element is included only in the four listed environments (it does not apply to dev), and in region aws-us-east-1c the message config is different.