Using machine-learned models from Vespa Cloud

Vespa Cloud provides a set of machine-learned models that you can use in your applications. These models will always be available on Vespa Cloud and never change.

To use a model provided by Vespa Cloud, set the model-id attribute where you specify a model config. For example, when configuring the Bert embedder provided by Vespa, you can write:

<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel model-id="minilm-l6-v2"/>
        <tokenizerVocab model-id="bert-base-uncased"/>
    </config>
</component>

Put this under the <container> element in your services.xml file, and your application has text embedding support suitable for production.
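
For reference, here is a minimal services.xml sketch placing the component; the container id and the extra elements are placeholders you would adapt to your application:

<services version="1.0">
    <container id="default" version="1.0">
        <component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
            <config name="embedding.bert-base-embedder">
                <transformerModel model-id="minilm-l6-v2"/>
                <tokenizerVocab model-id="bert-base-uncased"/>
            </config>
        </component>
        <document-api/>
        <search/>
    </container>
</services>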

Available models

These models are currently available on Vespa Cloud:

minilm-l6-v2
A small, fast BERT transformer model in ONNX format. It maps sentences & paragraphs to a 384-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search (see the schema sketch below).
License: apache-2.0
Source: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

mpnet-base-v2
A larger BERT transformer model in ONNX format with higher accuracy and higher cost than minilm-l6-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.
License: apache-2.0
Source: https://huggingface.co/sentence-transformers/all-mpnet-base-v2

bert-base-uncased
A vocabulary text file in the format expected by WordPiece: one text token per line, suitable for use with BERT models.
License: apache-2.0
Source: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

These models can safely be used in production and will never be removed or changed.
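
The distance-metric is set on the embedding field in your schema. A minimal sketch, assuming a document type doc with a text field and the myEmbedderId component configured above (document, field and type names are placeholders):

schema doc {
    document doc {
        field text type string {
            indexing: summary | index
        }
    }
    field embedding type tensor<float>(x[384]) {
        indexing: input text | embed myEmbedderId | attribute | index
        attribute {
            distance-metric: angular
        }
    }
}

The tensor dimension must match the model: x[384] for minilm-l6-v2, x[768] for mpnet-base-v2.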

Creating applications that work both self-hosted and on Vespa Cloud

You can also specify both a model-id, which is used on Vespa Cloud, and a url or path, which is used in self-hosted deployments:

<transformerModel model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>

This is useful, for example, to create an application package which uses models from Vespa Cloud in production and a scaled-down or dummy model for self-hosted development.
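
A url attribute works the same way as path, but fetches the model from outside the application package; the URL below is a hypothetical placeholder:

<transformerModel model-id="minilm-l6-v2" url="https://example.com/models/my-model.onnx"/>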

Using Vespa Cloud models with any config

You can specify a model-id for any config field of type model, whether the config is defined by Vespa or by you.
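
For example, a config definition of your own can declare a field of type model, and that field then accepts the same model-id, url and path attributes. A sketch, with hypothetical names throughout:

# Config definition file: src/main/resources/configdefinitions/my-component.def
namespace=example
myModel model

Then in services.xml:

<config name="example.my-component">
    <myModel model-id="minilm-l6-v2" path="models/my-model.onnx"/>
</config>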