Developer Guide

Benchmarking with the Vespa Developer Cloud is easy: any Vespa Cloud application package can be deployed as-is to the Developer Cloud. Resources are automatically downscaled to one node of each type, which gives a fully functional application at minimal cost. Read more.

Model serving

Use Vespa's model serving to automate data-driven decisions online. Vespa supports both stateless and stateful model evaluation. The latter, which evaluates models over a large, dynamically updated data set, is a unique Vespa feature.

Automated Deployments

Changes to Vespa applications are deployed in an application package and take effect with no service interruptions and no restarts. Invalid changes are rejected, and integration test code is deployed together with the application package. Automating Vespa deployments is therefore safe and easy; read more in automated-deployments.


Performance and scalability are key Vespa features. Using automated deployments makes it easy to iterate over different configurations to improve performance. This lets you easily go from a baseline to an optimized configuration, as data is auto-rebalanced between iterations. Read more.


Scrape metrics in native Vespa format or Prometheus, import to Grafana or CloudWatch. Read more.
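As a rough illustration of what consuming scraped metrics can look like, the sketch below parses text in the Prometheus exposition format into (name, labels, value) tuples using only the Python standard library. The metric names and the `cluster` label in the sample are hypothetical and are not taken from Vespa; the parser is deliberately minimal and does not unescape label values.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# One label pair inside the braces: key="value".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Hypothetical scrape output, for illustration only.
SAMPLE_TEXT = """\
# HELP example_query_latency_ms Hypothetical metric for illustration
# TYPE example_query_latency_ms gauge
example_query_latency_ms{cluster="default"} 12.5
example_queries_total 1024
"""


def parse_prometheus_text(text):
    """Return a list of (name, labels, value) for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE/comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Ignore lines this minimal parser does not understand.
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_prometheus_text(SAMPLE_TEXT):
        print(name, labels, value)
```

In practice a Prometheus server or an agent would do this parsing; the sketch only shows the shape of the data that ends up in tools such as Grafana.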

Overload Handling

A correctly sized application handles all planned-for scenarios. Planning can, however, be wrong (nobody is perfect!), and extraordinary events happen. That is exactly when the application must not fail.

Depending on the event, the right action might be to add or change resources. Evaluating which action is right takes Vespa domain knowledge.

Vespa Cloud helps configure a tradeoff between recall and resource usage, called Soft Degradation. This lets the application owner decide in advance how to simplify query matching so that results are still returned, though not necessarily the best ones. Returning almost-as-good results is better than going dark. This is most easily configured using Vespa Cloud, with detailed performance inspection during regular load. Read more.
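A client can check whether a given query was answered in degraded mode. The sketch below is a minimal illustration under stated assumptions: that the query request accepts `timeout` and `ranking.softtimeout.enable` parameters, and that the JSON result carries a `root.coverage.degraded` object with boolean reason flags. These names reflect my understanding of the Vespa Query API and result format and should be verified against the reference documentation. The helper functions and the sample response are hypothetical.

```python
import json


def build_query_params(yql, timeout="500ms", soft_timeout=True):
    """Build request parameters (assumed names, see the text above)."""
    return {
        "yql": yql,
        "timeout": timeout,
        "ranking.softtimeout.enable": "true" if soft_timeout else "false",
    }


def degradation_reasons(response):
    """Return the sorted names of the degradation flags that are set.

    Assumes the reasons live under root.coverage.degraded as booleans;
    a response without that object is treated as not degraded.
    """
    coverage = response.get("root", {}).get("coverage", {})
    degraded = coverage.get("degraded", {})
    return sorted(reason for reason, active in degraded.items() if active)


# Hypothetical response body, for illustration only.
SAMPLE_RESPONSE = json.loads("""
{
  "root": {
    "fields": {"totalCount": 1200},
    "coverage": {
      "coverage": 80,
      "documents": 800000,
      "degraded": {
        "match-phase": false,
        "timeout": true,
        "adaptive-timeout": false,
        "non-ideal-state": false
      },
      "full": false
    }
  }
}
""")

if __name__ == "__main__":
    print(build_query_params("select * from sources * where true"))
    print(degradation_reasons(SAMPLE_RESPONSE))
```

For the sample response the helper returns `['timeout']`. Logging this per query is one way to see how often results are degraded under load.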


Autoscaling and resource suggestions are key features that help application owners optimize for cost and for goals such as latency. Read more.