Using machine-learned models from Vespa Cloud

When deploying on Vespa Cloud you don't need to supply your own machine-learned models but can use standard models provided by Vespa Cloud. These models will always be available on Vespa Cloud and never change.

You specify a Vespa Cloud provided model by setting the model-id attribute on the config value for that model. This can be done with any config field of type model, whether the config is from Vespa or defined by you:

<aModelConfigValue model-id="minilm-l6-v2"/>

See the complete example below for more.
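
As a minimal sketch of the "defined by you" case, assume your own config definition declares a field named myModel of type model. The component class, bundle, config name and field name below are all hypothetical and only illustrate where the model-id attribute goes:

```xml
<!-- Hypothetical component and config names; only the model-id value comes from Vespa Cloud -->
<component id="myComponent" class="com.example.MyEmbedder" bundle="my-bundle">
    <config name="example.my-embedder">
        <!-- myModel is assumed to be declared with type model in the config definition -->
        <myModel model-id="minilm-l6-v2"/>
    </config>
</component>
```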

You can also specify both a model id, which will be used on Vespa Cloud, and a URL or path, which will be used on self-hosted deployments:

<aModelConfigValue model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>

This can be useful, for example, to create an application package that uses models from Vespa Cloud in production and a scaled-down or dummy model for self-hosted development.
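
The URL variant could look like the sketch below. The address is a placeholder, not a real model location, and the element name is the same stand-in used above:

```xml
<!-- model-id is used on Vespa Cloud; the url (a placeholder here) is used when self-hosting -->
<aModelConfigValue model-id="minilm-l6-v2" url="https://example.com/models/myModel.onnx"/>
```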

Available models

These are the models available on Vespa Cloud:

minilm-l6-v2: A small, fast BERT transformer model in ONNX format. It maps sentences and paragraphs to a 384-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.
bert-base-uncased: A vocabulary text file in the format expected by WordPiece: one text token per line, suitable for use with BERT models.

Models in this table can safely be used in production and will never be removed or changed.

Example config with model ids

Here is an example of configuring the Vespa-provided BERT embedder with models supplied by Vespa Cloud:

<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel model-id="minilm-l6-v2"/>
        <tokenizerVocab model-id="bert-base-uncased"/>
    </config>
</component>

By putting this under the <container> element in your services.xml file, your application will have support for text embedding suitable for production.
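
To show the placement, here is a trimmed sketch of a services.xml with the embedder under the container element. The container id "default" is an assumption, and other elements a real application would need (such as nodes, search or document APIs, and content clusters) are left out:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<services version="1.0">
    <!-- The container id is an assumed name; use the id of your own container cluster -->
    <container id="default" version="1.0">
        <component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
            <config name="embedding.bert-base-embedder">
                <transformerModel model-id="minilm-l6-v2"/>
                <tokenizerVocab model-id="bert-base-uncased"/>
            </config>
        </component>
    </container>
</services>
```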