Using machine-learned models from Vespa Cloud

When deploying on Vespa Cloud you don't need to supply your own machine-learned models; you can instead use standard models provided by Vespa Cloud. These models are always available on Vespa Cloud and will never change.

You specify a Vespa Cloud-provided model by setting the model-id attribute on the config value for that model. This works with any config field of type model, whether the config is defined by Vespa or by you:

<aModelConfigValue model-id="minilm-l6-v2"/>

See the complete example below for more.
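If you define your own config, the field of type model is declared in your config definition file. A minimal sketch, assuming a hypothetical file name, namespace, and field name:

# my-embedder.def
namespace=example

# A field of type model can be set with model-id, url, and/or path in services.xml
languageModel model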

You can also specify both a model id, which is used on Vespa Cloud, and a url or path, which is used in self-hosted deployments:

<aModelConfigValue model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>

This can be useful, for example, to create an application package which uses a model from Vespa Cloud in production and a scaled-down or dummy model for self-hosted development.

Available models

These are the models available on Vespa Cloud:

minilm-l6-v2
A small, fast BERT transformer model in ONNX format. It maps sentences and paragraphs to a 384-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search (see the schema sketch after this list).
License: apache-2.0
Source: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

bert-base-uncased
A vocabulary text file in the format expected by WordPiece: one text token per line, suitable for use with BERT models.
License: apache-2.0
Source: https://huggingface.co/bert-base-uncased

Models in this list can safely be used in production and will never be removed or changed.
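For example, embeddings from minilm-l6-v2 fit in a 384-dimensional tensor field using that distance metric. A minimal schema sketch, assuming an embedder component with id myEmbedderId (configured as in the example below) and hypothetical field names:

field embedding type tensor<float>(x[384]) {
    indexing: input text | embed myEmbedderId | attribute | index
    attribute {
        distance-metric: angular
    }
}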

Example config with model ids

Here is an example of configuring the Vespa-provided Bert embedder with models supplied by Vespa Cloud:

<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel model-id="minilm-l6-v2"/>
        <tokenizerVocab model-id="bert-base-uncased"/>
    </config>
</component>

By putting this under the <container> element in your services.xml, your application gets production-ready support for text embedding.
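As an illustration of query-time use, here is a sketch of a nearest neighbor search which embeds the query text with the component above. It assumes a schema field named embedding and a rank profile defining query(q) as tensor<float>(x[384]); the exact query syntax may vary by Vespa version:

vespa query 'yql=select * from sources * where {targetHits: 10}nearestNeighbor(embedding, q)' \
    'input.query(q)=embed(myEmbedderId, "what is semantic search?")'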