Vespa Cloud provides a set of machine-learned models that you can use in your applications. These models will always be available on Vespa Cloud and never change.
To use a model provided by Vespa Cloud, set the model-id attribute where you specify a model config. For example, when configuring the BertBaseEmbedder component provided by Vespa, you can write:
<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel model-id="minilm-l6-v2"/>
        <tokenizerVocab model-id="bert-base-uncased"/>
    </config>
</component>
By putting this under the <container> element in your services.xml file, your application will have support for text embedding suitable for production.
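To illustrate, such an embedder can then be referenced from the embed indexing expression in a schema. A minimal sketch, where the schema name, field names, and the reference to myEmbedderId are illustrative assumptions:

```
schema doc {
    document doc {
        field text type string {
            indexing: summary | index
        }
    }
    # Generate an embedding from the text field at feed time.
    # 384 matches the output dimension of minilm-l6-v2.
    field embedding type tensor<float>(x[384]) {
        indexing: input text | embed myEmbedderId | attribute
        attribute {
            distance-metric: angular
        }
    }
}
```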
These models are currently available on Vespa Cloud:
|Model id|Description|
|minilm-l6-v2|A small, fast BERT transformer model in ONNX format. It maps sentences and paragraphs to a 384-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.|
|mpnet-base-v2|A larger BERT transformer model in ONNX format, with higher accuracy and cost than minilm-l6-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.|
|bert-base-uncased|A vocabulary text file in the format expected by WordPiece: one text token per line, suitable for use with BERT models.|
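As noted above, the embedding models are intended for nearest neighbor search with distance-metric: angular. A sketch of a query request using query-time embedding, where the field name embedding and the query tensor name q are illustrative assumptions:

```json
{
    "yql": "select * from doc where {targetHits: 10}nearestNeighbor(embedding, q)",
    "input.query(q)": "embed(what is semantic search?)"
}
```

The embed function invokes the configured embedder on the query text, so documents and queries are mapped into the same vector space.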
These models can safely be used in production and will never be removed or changed.
You can also specify both a model-id, which is used when deployed to Vespa Cloud, and a url or path, which is used in self-hosted deployments:
<transformerModel model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>
This is useful, for example, to create an application package that uses a model from Vespa Cloud in production and a scaled-down or dummy model for self-hosted development.
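A url can be combined with a model-id in the same way as a path; the URL below is a hypothetical placeholder:

```
<transformerModel model-id="minilm-l6-v2"
                  url="https://example.com/models/dev-model.onnx"/>
```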
Specifying a model-id can be done for any config field of type model, whether the config is from Vespa or defined by you.
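As a sketch of the latter, a model-type field in your own config definition could look like the following; the file name, namespace, and field name are illustrative assumptions:

```
# src/main/resources/configdefinitions/my-component.def
namespace=example
myModel model
```

This field can then be configured in services.xml like any Vespa-provided model field:

```
<config name="example.my-component">
    <myModel model-id="minilm-l6-v2" path="models/dummy.onnx"/>
</config>
```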